CPU - Intel® Xeon®

Validated Hardware

| Hardware |
|---|
| Intel® Xeon® 6 Processors |
| Intel® Xeon® 5 Processors |

Text-only Language Models

| Model | Architecture | Supported |
|---|---|---|
| unsloth/gpt-oss-20b | GptOssForCausalLM | |
| meta-llama/Llama-3.1-8B-Instruct | LlamaForCausalLM | |
| meta-llama/Llama-3.2-1B | LlamaForCausalLM | |
| meta-llama/Llama-3.2-3B-Instruct | LlamaForCausalLM | |
| meta-llama/Llama-3.3-70B-Instruct | LlamaForCausalLM | |
| RedHatAI/Meta-Llama-3.1-8B-quantized.w8a8 | LlamaForCausalLM | |
| RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8 | LlamaForCausalLM | |
| RedHatAI/Llama-3.2-1B-Instruct-quantized.w8a8 | LlamaForCausalLM | |
| RedHatAI/Llama-3.2-3B-Instruct-quantized.w8a8 | LlamaForCausalLM | |
| RedHatAI/DeepSeek-R1-Distill-Llama-70B-quantized.w8a8 | LlamaForCausalLM | |
| hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4 | LlamaForCausalLM | |
| AMead10/Llama-3.2-1B-Instruct-AWQ | LlamaForCausalLM | |
| AMead10/Llama-3.2-3B-Instruct-AWQ | LlamaForCausalLM | |
| TheBloke/TinyLlama-1.1B-Chat-v1.0-AWQ | LlamaForCausalLM | |
| TheBloke/TinyLlama-1.1B-Chat-v1.0-GPTQ | LlamaForCausalLM | |
| ibm-granite/granite-3.2-2b-instruct | GraniteForCausalLM | |
| Qwen/Qwen3-1.7B | Qwen3ForCausalLM | |
| Qwen/Qwen3-4B | Qwen3ForCausalLM | |
| Qwen/Qwen3-8B | Qwen3ForCausalLM | |
| Qwen/Qwen3-14B | Qwen3ForCausalLM | |
| Qwen/Qwen3-14B-AWQ | Qwen3ForCausalLM | |
| Qwen/Qwen3-30B-A3B | Qwen3MoeForCausalLM | |
| Qwen/QwQ-32B-AWQ | Qwen2ForCausalLM | |
| Qwen/Qwen1.5-0.5B-Chat-GPTQ-Int4 | Qwen2ForCausalLM | |
| RedHatAI/QwQ-32B-quantized.w8a8 | Qwen2ForCausalLM | |
| zai-org/glm-4-9b-hf | GLMForCausalLM | |
| google/gemma-7b | GemmaForCausalLM | |
| microsoft/Phi-4-reasoning | Phi3ForCausalLM | |
| TheBloke/Mistral-7B-Instruct-v0.2-AWQ | MistralForCausalLM | |

Multimodal Language Models

| Model | Architecture | Supported |
|---|---|---|
| meta-llama/Llama-4-Scout-17B-16E-Instruct | Llama4ForConditionalGeneration | |
| google/gemma-3-4b-it | Gemma3ForConditionalGeneration | |
| google/gemma-3-12b-it | Gemma3ForConditionalGeneration | |
| google/gemma-4-E4B-it | Gemma4ForConditionalGeneration | |
| google/gemma-4-E2B-it | Gemma4ForConditionalGeneration | |
| google/gemma-4-26B-A4B-it | Gemma4ForConditionalGeneration | |
| microsoft/Phi-4-multimodal-instruct | Phi4MMForCausalLM | |
| Qwen/Qwen2.5-VL-7B-Instruct | Qwen2VLForConditionalGeneration | |
| openai/whisper-large-v3 | WhisperForConditionalGeneration | |

✅ Runs and is optimized.