docs/source/en/model_doc/minicpm3.md
This model was published in HF papers on 2024-09-05 and contributed to Hugging Face Transformers on 2026-06-22.
MiniCPM3 is the third-generation MiniCPM dense language model from OpenBMB. The 4B variant
(`openbmb/MiniCPM3-4B`) outperforms many 7B–9B open
models on standard benchmarks while remaining lightweight enough for on-device use.
MiniCPM3 combines several architectural ideas:
- `scale_emb`: scales input embeddings.
- `scale_depth / sqrt(num_hidden_layers)`: scales residual connections.
- `hidden_size / dim_model_base`: scales hidden states before the language model head.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openbmb/MiniCPM3-4B")
model = AutoModelForCausalLM.from_pretrained("openbmb/MiniCPM3-4B", device_map="auto")

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
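To make the three scaling factors listed above concrete, the following is a minimal sketch that computes them from plain numbers. The helper name `minicpm3_scaling_factors` and the input values are hypothetical and chosen only for illustration; they are not read from the `openbmb/MiniCPM3-4B` checkpoint, and the sketch does not reproduce how the model applies each factor internally.

```python
import math


def minicpm3_scaling_factors(scale_emb, scale_depth, num_hidden_layers,
                             hidden_size, dim_model_base):
    """Compute the three scaling factors from configuration-style values.

    Hypothetical helper for illustration; not part of Transformers.
    """
    return {
        # Factor for the input embeddings.
        "embedding": scale_emb,
        # Factor for the residual connections.
        "residual": scale_depth / math.sqrt(num_hidden_layers),
        # Ratio used to scale hidden states before the language model head.
        "head": hidden_size / dim_model_base,
    }


# Illustrative values only (assumed, not taken from a real checkpoint).
factors = minicpm3_scaling_factors(
    scale_emb=12.0,
    scale_depth=1.4,
    num_hidden_layers=64,
    hidden_size=2560,
    dim_model_base=256,
)

for name, value in factors.items():
    print(f"{name}: {value:.4f}")
```

With real checkpoints, the same quantities could presumably be read from the loaded configuration, assuming `MiniCPM3Config` exposes attributes under these names.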
[[autodoc]] MiniCPM3Config
[[autodoc]] MiniCPM3Model
    - forward

[[autodoc]] MiniCPM3ForCausalLM
    - forward

[[autodoc]] MiniCPM3ForSequenceClassification
    - forward