# Machine translation
Machine translation is the task of translating a sentence in a source language to a different target language.
Results marked with * report the mean test BLEU over the window of 21 consecutive evaluations with the best average dev-set BLEU, following Chen et al. (2018).
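To make this selection rule concrete, here is a minimal sketch assuming NumPy; the function name and data layout are illustrative, not taken from Chen et al.'s code. Given per-checkpoint dev and test BLEU scores, it picks the 21-evaluation window with the best average dev BLEU and returns the mean test BLEU over that window.

```python
import numpy as np

def best_window_test_bleu(dev_bleu, test_bleu, window=21):
    """Mean test BLEU over the `window` consecutive evaluations whose
    average dev-set BLEU is highest, as in Chen et al. (2018)."""
    dev = np.asarray(dev_bleu, dtype=float)
    test = np.asarray(test_bleu, dtype=float)
    # Average dev BLEU for every window of consecutive checkpoints.
    dev_means = np.convolve(dev, np.ones(window) / window, mode="valid")
    start = int(np.argmax(dev_means))  # window with the best dev average
    return float(test[start:start + window].mean())
```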
Models are evaluated on the English-German and English-French datasets of the Ninth Workshop on Statistical Machine Translation (WMT 2014), using BLEU. The table below reports results on the English-French dataset (WMT 2014 EN-FR).
| Model | BLEU | Paper / Source |
|---|---|---|
| DeepL | 45.9 | DeepL Press release |
| Transformer Big + BT (Edunov et al., 2018) | 45.6 | Understanding Back-Translation at Scale |
| Admin (Liu et al., 2020) | 43.8 | Understanding the Difficulty of Training Transformers |
| MUSE (Zhao et al., 2019) | 43.5 | MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning |
| TaLK Convolutions (Lioutas et al., 2020) | 43.2 | Time-aware Large Kernel Convolutions |
| DynamicConv (Wu et al., 2019) | 43.2 | Pay Less Attention With Lightweight and Dynamic Convolutions |
| Transformer Big (Ott et al., 2018) | 43.2 | Scaling Neural Machine Translation |
| RNMT+ (Chen et al., 2018) | 41.0* | The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation |
| Transformer Big (Vaswani et al., 2017) | 41.0 | Attention Is All You Need |
| MoE (Shazeer et al., 2017) | 40.56 | Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer |
| ConvS2S (Gehring et al., 2017) | 40.46 | Convolutional Sequence to Sequence Learning |
| Transformer Base (Vaswani et al., 2017) | 38.1 | Attention Is All You Need |
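For reference, BLEU on these benchmarks is a corpus-level score. Below is a minimal sketch using the sacrebleu library; the toy hypotheses and references are placeholders, and exact preprocessing varies between papers.

```python
import sacrebleu

# Placeholder system outputs and references, one sentence per line.
hypotheses = ["The cat sat on the mat .", "He read the book ."]
references = ["The cat sat on the mat .", "He has read the book ."]

# corpus_bleu takes a list of hypothesis strings and a list of
# reference streams (one list per reference set).
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU = {bleu.score:.2f}")
```

Note that published numbers often use tokenized BLEU (e.g., multi-bleu.perl), so scores produced by different tools are not always directly comparable.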
Results on the WMT 2022 Large-Scale African Languages shared task are reported below.

| Model | BLEU | Paper / Source |
|---|---|---|
| Vanilla MNMT (multilingual NMT) models | 17.95 | Tencent’s Multilingual Machine Translation System for WMT22 Large-Scale African Languages |