examples/community/roberta/README.md
This example introduces how to pretrain RoBERTa from scratch, covering preprocessing, pretraining, and finetuning. It can help you quickly train a high-quality RoBERTa model.
First set up passwordless SSH among all hosts. In /etc/ssh/sshd_config and /etc/ssh/ssh_config, make every host expose the same SSH port on both the server and client side. If you are the root user, also set `PermitRootLogin` in /etc/ssh/sshd_config to "yes". Then generate a key pair and copy the public key to every destination host:

```bash
ssh-keygen
ssh-copy-id -i ~/.ssh/id_rsa.pub ip_destination
```
Add every host to /etc/hosts, for example:

```
192.168.2.1 GPU001
192.168.2.2 GPU002
192.168.2.3 GPU003
192.168.2.4 GPU004
192.168.2.5 GPU005
192.168.2.6 GPU006
192.168.2.7 GPU007
...
```
Finally, restart the SSH service so the configuration changes take effect:

```bash
service ssh restart
```
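As a quick sanity check (a sketch, not part of the original example), you can confirm that passwordless login works to every host; `BatchMode=yes` makes `ssh` fail immediately instead of prompting for a password:

```python
# Verify passwordless SSH to each worker listed in /etc/hosts.
import subprocess

hosts = ["GPU001", "GPU002", "GPU003"]  # extend with your full host list

for host in hosts:
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, "hostname"],
        capture_output=True, text=True,
    )
    status = "ok" if result.returncode == 0 else "FAILED"
    print(f"{host}: {status} {result.stdout.strip()}")
```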
Step 1: `cd preprocessing`. Following the README.md there, preprocess the original corpus into HDF5 (h5py) files plus NumPy arrays.
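Before moving on, you can sanity-check a generated file. The snippet below is a minimal sketch: the output file name is hypothetical, and the dataset names inside depend on the preprocessing configuration:

```python
# List the datasets stored in one preprocessed HDF5 file.
import h5py

with h5py.File("output/pretrain_data_0.h5", "r") as f:  # hypothetical path
    for name, dset in f.items():
        print(name, dset.shape, dset.dtype)
```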
Step 2: `cd pretraining`. Following the README.md there, load the HDF5 files generated by the preprocessing in step 1 to pretrain the model.
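For orientation, here is a minimal sketch of how such HDF5 shards are typically fed to PyTorch; the dataset names (`input_ids`, `masked_lm_labels`) and the file layout are assumptions, so check the pretraining README for the real schema and launch commands:

```python
# Wrap one preprocessed HDF5 shard as a PyTorch Dataset.
import glob
import h5py
import torch
from torch.utils.data import DataLoader, Dataset

class PretrainShard(Dataset):
    def __init__(self, path):
        f = h5py.File(path, "r")
        self.input_ids = f["input_ids"]        # assumed dataset name
        self.labels = f["masked_lm_labels"]    # assumed dataset name

    def __len__(self):
        return len(self.input_ids)

    def __getitem__(self, idx):
        return (torch.as_tensor(self.input_ids[idx], dtype=torch.long),
                torch.as_tensor(self.labels[idx], dtype=torch.long))

shard_paths = sorted(glob.glob("output/*.h5"))  # hypothetical location
loader = DataLoader(PretrainShard(shard_paths[0]), batch_size=32, shuffle=True)
```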
The checkpoint produced by this example can directly replace `pytorch_model.bin` of `hfl/chinese-roberta-wwm-ext-large`. You can then use Hugging Face `transformers` to finetune downstream applications.
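For example, loading the pretrained weights for a downstream classification task might look like the sketch below. The local directory is an assumption: it should contain the config and tokenizer files of `hfl/chinese-roberta-wwm-ext-large`, with `pytorch_model.bin` replaced by the checkpoint produced here:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Local copy of the hfl/chinese-roberta-wwm-ext-large directory, with
# pytorch_model.bin replaced by the checkpoint from this example.
model_dir = "./chinese-roberta-wwm-ext-large"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSequenceClassification.from_pretrained(model_dir, num_labels=2)

inputs = tokenizer("这部电影很好看", return_tensors="pt")
logits = model(**inputs).logits
print(logits.shape)  # torch.Size([1, 2])
```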
This example is contributed by the AI team at Moore Threads. If you find any problems with pretraining, please file an issue or send an email to [email protected]. Finally, any form of contribution is welcome!