# Qwen3MoE
This model was released on 2025-04-29 and added to Hugging Face Transformers on 2025-03-31.
Qwen3MoE is the mixture-of-experts (MoE) variant of the Qwen3 family. The Qwen3-30B-A3B checkpoint has 30.5B total parameters with 3.3B activated per token, using 128 routed experts (8 activated per token) across 48 layers, and supports context lengths up to 131K tokens with YaRN scaling. For the dense variant, see [Qwen3](./qwen3).
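These routing hyperparameters can be read directly from the checkpoint configuration. The snippet below is a minimal sketch; the attribute names (`num_experts`, `num_experts_per_tok`) are assumed from [`Qwen3MoeConfig`] and should be checked against the config reference at the bottom of this page.

```py
from transformers import Qwen3MoeConfig

# Load only the configuration (no weights) and print the MoE hyperparameters.
# Attribute names are assumptions based on Qwen3MoeConfig; see the API
# reference below for the authoritative fields.
config = Qwen3MoeConfig.from_pretrained("Qwen/Qwen3-30B-A3B")
print(config.num_hidden_layers)        # decoder layers
print(config.num_experts)              # routed experts per MoE layer
print(config.num_experts_per_tok)      # experts activated per token
print(config.max_position_embeddings)  # native context length
```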
The examples below demonstrate how to generate text with the [`Pipeline`] or the [`AutoModelForCausalLM`] class.
```py
from transformers import pipeline

# Build a text-generation pipeline from the Qwen3-30B-A3B MoE checkpoint.
pipe = pipeline(
    task="text-generation",
    model="Qwen/Qwen3-30B-A3B",
    device_map="auto",
)
pipe("The key to effective reasoning is")
```
```py
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-30B-A3B")
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-30B-A3B",
    device_map="auto",
)

# Tokenize the prompt, move it to the model's device, and generate a completion.
input_ids = tokenizer("The key to effective reasoning is", return_tensors="pt").to(model.device)
output = model.generate(**input_ids, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
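Reaching the 131K context mentioned above requires enabling YaRN rope scaling; the checkpoint's native window is 32K tokens. The sketch below is one way to do this, assuming the YaRN settings recommended in the Qwen3 model card (scaling factor 4.0 over the native 32,768-token window) and overriding `rope_scaling` at load time.

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-30B-A3B")

# Override the rope scaling configuration at load time to enable YaRN.
# The factor and original window are assumptions taken from the Qwen3 model
# card's recommended settings; adjust them to your actual input lengths.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-30B-A3B",
    device_map="auto",
    rope_scaling={
        "rope_type": "yarn",
        "factor": 4.0,
        "original_max_position_embeddings": 32768,
    },
)

long_prompt = "..."  # a prompt longer than the native 32K-token window
inputs = tokenizer(long_prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```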
[[autodoc]] Qwen3MoeConfig

[[autodoc]] Qwen3MoeModel
    - forward

[[autodoc]] Qwen3MoeForCausalLM
    - forward

[[autodoc]] Qwen3MoeForSequenceClassification
    - forward

[[autodoc]] Qwen3MoeForTokenClassification
    - forward

[[autodoc]] Qwen3MoeForQuestionAnswering
    - forward