Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks.
Fairseq provides reference implementations of various sequence-to-sequence models.
We also provide pre-trained models for translation and language modeling
with a convenient torch.hub interface:
import torch
# Download (on first use) and load a pre-trained English-to-German model
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.en-de.single_model')
en2de.translate('Hello world', beam=5)
# 'Hallo Welt'
See the PyTorch Hub tutorials for translation and RoBERTa for more examples.
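To translate several sentences with a loaded hub model, one option is to loop over the `translate` call shown above. The following is a minimal sketch and not part of fairseq: `translate_all` is a hypothetical helper, and it assumes only that the model object exposes `translate(sentence, beam=...)` as in the snippet above. `EchoModel` is a stand-in so the example runs without downloading anything.

```python
from typing import Iterable, List


def translate_all(model, sentences: Iterable[str], beam: int = 5) -> List[str]:
    """Translate each sentence with model.translate (hypothetical helper)."""
    # One call per sentence: only the single-sentence call from the README is assumed.
    return [model.translate(sentence, beam=beam) for sentence in sentences]


class EchoModel:
    """Stand-in for a hub model; NOT a fairseq class. It upper-cases its input."""

    def translate(self, sentence: str, beam: int = 5) -> str:
        return sentence.upper()


if __name__ == "__main__":
    print(translate_all(EchoModel(), ["Hello world", "Good morning"]))
```

With a real model, pass the object returned by `torch.hub.load` in place of `EchoModel()`.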
To install fairseq:
pip install fairseq
On macOS:
CFLAGS="-stdlib=libc++" pip install fairseq
If you use Docker, make sure to increase the shared memory size by passing either
--ipc=host or --shm-size as a command-line option to nvidia-docker run.
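As an illustration of the two options, the sketch below assembles an `nvidia-docker run` command line. The helper name `docker_run_command`, the image name `my-fairseq-image` and the `8g` size are placeholders chosen for this example, not values from fairseq.

```python
import shlex
from typing import List, Optional


def docker_run_command(image: str, shm_size: Optional[str] = None) -> List[str]:
    """Build an nvidia-docker run command with a larger shared memory setting."""
    cmd = ["nvidia-docker", "run"]
    if shm_size is None:
        # Share the host's IPC namespace, which includes its shared memory.
        cmd.append("--ipc=host")
    else:
        # Give the container its own shared memory of the requested size.
        cmd.append(f"--shm-size={shm_size}")
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # "my-fairseq-image" is a placeholder image name.
    print(shlex.join(docker_run_command("my-fairseq-image", shm_size="8g")))
```

Only one of the two flags is needed; the sketch defaults to `--ipc=host` when no size is given.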
Installing from source
To install fairseq from source and develop locally:
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable .
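After either install route, a quick way to confirm that the package is visible to the current Python environment is to query its distribution metadata. This is a minimal sketch using the standard library; `installed_version` is a hypothetical helper, and it assumes the distribution is registered under the name `fairseq`.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "fairseq") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(f"fairseq {version}" if version else "fairseq is not installed")
```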
The full documentation contains instructions for getting started, training new models and extending fairseq with new model types and tasks.
We provide pre-trained models and pre-processed, binarized test sets for several tasks listed below, as well as example training and evaluation commands.
We also have more detailed READMEs to reproduce results from specific papers:
fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.
Please cite as:
@inproceedings{ott2019fairseq,
  title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year = {2019},
}