# MPT
This model was released on 2023-05-05 and added to Hugging Face Transformers on 2023-07-25.
The MPT model was proposed by the MosaicML team and released in multiple sizes with finetuned variants. The MPT models are a series of open-source, commercially usable LLMs pretrained on 1T tokens.
MPT models are GPT-style decoder-only transformers with several improvements: performance-optimized layer implementations, architecture changes that provide greater training stability, and the elimination of context length limits by replacing positional embeddings with ALiBi.
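As a minimal sketch of how this architecture is exposed in Transformers, the snippet below builds a tiny, randomly initialized MPT from `MptConfig` and runs a forward pass (the sizes here are made up for illustration; they do not match any released checkpoint):

```python
import torch
from transformers import MptConfig, MptModel

# Tiny configuration purely for illustration; pretrained checkpoints
# such as mosaicml/mpt-7b are available on the Hub.
config = MptConfig(d_model=64, n_heads=4, n_layers=2, max_seq_len=128)
model = MptModel(config)  # randomly initialized weights

input_ids = torch.randint(0, config.vocab_size, (1, 16))
outputs = model(input_ids)
print(outputs.last_hidden_state.shape)  # torch.Size([1, 16, 64])
```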
The original code is available at the [llm-foundry](https://github.com/mosaicml/llm-foundry) repository. Read more about it in the [release blog post](https://www.mosaicml.com/blog/mpt-7b).
To use the original MosaicML implementation with its advanced features (e.g. Triton kernels, direct flash-attention integration), add `trust_remote_code=True` when calling `from_pretrained`.
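For example, a basic generation call might look like the sketch below (using the `mosaicml/mpt-7b` checkpoint from the Hub as an example):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mosaicml/mpt-7b")
model = AutoModelForCausalLM.from_pretrained("mosaicml/mpt-7b")

inputs = tokenizer("MosaicML released MPT as", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

# Opt in to the checkpoint's custom code to run the original
# MosaicML implementation instead of the ported Transformers one.
model = AutoModelForCausalLM.from_pretrained("mosaicml/mpt-7b", trust_remote_code=True)
```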
## MptConfig

[[autodoc]] MptConfig - all

## MptModel

[[autodoc]] MptModel - forward

## MptForCausalLM

[[autodoc]] MptForCausalLM - forward

## MptForSequenceClassification

[[autodoc]] MptForSequenceClassification - forward

## MptForTokenClassification

[[autodoc]] MptForTokenClassification - forward

## MptForQuestionAnswering

[[autodoc]] MptForQuestionAnswering - forward