This model was released on 2019-10-23 and added to Hugging Face Transformers on 2020-11-16.

</div>

</div>

T5

T5 is an encoder-decoder transformer available in sizes from 60M to 11B parameters. It handles a wide range of NLP tasks by casting each one as a text-to-text problem, which eliminates the need for task-specific architectures.

To formulate every task as text generation, the input for each task is prepended with a task-specific prefix (e.g., translate English to German: ..., summarize: ...). This lets a single model handle translation, summarization, question answering, and more.
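As a minimal sketch of the prefix convention, the snippet below builds T5-style input strings in plain Python. The `TASK_PREFIXES` table and the `add_task_prefix` helper are hypothetical names used only for illustration; they are not part of Transformers. The prefix strings are the ones quoted above, plus an English-to-French variant of the translation prefix.

```python
# Hypothetical lookup table: task name -> text prefix the model was trained with.
TASK_PREFIXES = {
    "translate_en_de": "translate English to German: ",
    "translate_en_fr": "translate English to French: ",
    "summarize": "summarize: ",
}


def add_task_prefix(task: str, text: str) -> str:
    """Return `text` with the prefix for `task` prepended."""
    try:
        prefix = TASK_PREFIXES[task]
    except KeyError:
        raise ValueError(f"Unknown task: {task!r}") from None
    return prefix + text.strip()


prompt = add_task_prefix("translate_en_de", "The house is wonderful.")
print(prompt)  # translate English to German: The house is wonderful.
```

The resulting string is what would be passed to the tokenizer; the model itself sees no separate task argument.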

You can find all official T5 checkpoints under the T5 collection.

[!TIP] Click on the T5 models in the right sidebar for more examples of how to apply T5 to different language tasks.

The example below demonstrates how to translate text with [AutoModel].

python

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer


# Load the tokenizer and model; device_map="auto" requires the accelerate package.
tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained(
    "google-t5/t5-base",
    device_map="auto",
)

# The task prefix tells T5 which task to perform.
inputs = tokenizer("translate English to French: The weather is nice today.", return_tensors="pt").to(model.device)

# Generate the translation and decode it back to text.
output = model.generate(**inputs, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))

</hfoption> </hfoptions>

Quantization reduces the memory burden of large models by representing the weights in a lower precision. Refer to the Quantization overview for more available quantization backends.

The example below uses torchao to only quantize the weights to int4.

python

# pip install torchao
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, TorchAoConfig


# Quantize only the weights to int4, in groups of 128 weights.
quantization_config = TorchAoConfig("int4_weight_only", group_size=128)
model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/t5-v1_1-xl",
    device_map="auto",
    quantization_config=quantization_config,
)

tokenizer = AutoTokenizer.from_pretrained("google/t5-v1_1-xl")
inputs = tokenizer("translate English to French: The weather is nice today.", return_tensors="pt").to(model.device)

# Generate the translation and decode it back to text.
output = model.generate(**inputs, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))
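To make the memory saving concrete, here is a rough back-of-the-envelope sketch of weight storage at different precisions. The `weight_memory_gb` helper is a hypothetical name, and the 11B figure is simply the largest model size mentioned above. The estimate is an assumption-laden lower bound: it ignores the per-group quantization parameters, any layers left unquantized, activations, and the key-value cache.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 11e9  # largest T5 size mentioned above (11B parameters)

for name, bits in [("float32", 32), ("bfloat16", 16), ("int4", 4)]:
    print(f"{name:>8}: {weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
#  float32: 44.0 GB
# bfloat16: 22.0 GB
#     int4: 5.5 GB
```

Real usage with group-wise int4 quantization will be somewhat higher than the last figure, since extra quantization parameters are stored for every group of `group_size` weights.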

Notes

You can pad the encoder inputs on the left or right because T5 uses relative position embeddings, where each embedding is a scalar added to the attention logits.
T5 models need a slightly higher learning rate than the default used in [Trainer]. Typically, values of 1e-4 and 3e-4 work well for most tasks.
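The padding note can be illustrated with a small plain-Python sketch: the offsets between real tokens are the same whichever side the padding sits on. This assumes padding positions are excluded through the attention mask. The `relative_offsets` helper is hypothetical and only computes raw offsets; it does not reproduce the bucketing T5 applies to them.

```python
def relative_offsets(attention_mask):
    """Return key_pos - query_pos for every pair of real (unmasked) tokens."""
    real = [i for i, keep in enumerate(attention_mask) if keep]
    return [[k - q for k in real] for q in real]


right_padded = [1, 1, 1, 0, 0]  # three tokens, then two padding positions
left_padded = [0, 0, 1, 1, 1]   # two padding positions, then three tokens

print(relative_offsets(right_padded))
# [[0, 1, 2], [-1, 0, 1], [-2, -1, 0]]
print(relative_offsets(left_padded) == relative_offsets(right_padded))  # True
```

With absolute position embeddings, by contrast, shifting the tokens to the right would change the position index each token receives.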

T5Config

[[autodoc]] T5Config

T5Tokenizer

[[autodoc]] T5Tokenizer
    - get_special_tokens_mask
    - save_vocabulary

T5TokenizerFast

[[autodoc]] T5TokenizerFast

T5Model

[[autodoc]] T5Model
    - forward

T5ForConditionalGeneration

[[autodoc]] T5ForConditionalGeneration
    - forward

T5EncoderModel

[[autodoc]] T5EncoderModel
    - forward

T5ForSequenceClassification

[[autodoc]] T5ForSequenceClassification
    - forward

T5ForTokenClassification

[[autodoc]] T5ForTokenClassification
    - forward

T5ForQuestionAnswering

[[autodoc]] T5ForQuestionAnswering
    - forward