docs_new/docs/sglang-diffusion/compatibility_matrix.mdx
This page tracks supported SGLang Diffusion model families and their optimization compatibility. It also covers long-tail models that do not yet have dedicated cookbook recipes.
For model-specific usage recipes, start from the Diffusion Cookbook. Cookbook pages cover the primary models with examples; this page keeps the compact support and compatibility inventory.
Pass the Hugging Face model ID to `--model-path` for `sglang generate` or `sglang serve`. Python API users can pass the same ID to SGLang Diffusion model-loading helpers.
Missing checkpoint aliases do not imply that a model family is unsupported. The runtime registry may also accept detector-based aliases or local model directories that match the same family.
Rows are grouped when a family shares the same runtime path or optimization support. Use the detailed matrix below when you need per-optimization compatibility.
<Tabs> <Tab title="Image"> <div className="sgd-model-table-wrap"> <table className="sgd-model-table"> <thead> <tr> <th>Model family</th> <th>Model IDs</th> </tr> </thead> <tbody> <tr> <td>FLUX</td> <td><div className="sgd-id-list"><code>black-forest-labs/FLUX.1-dev</code><code>black-forest-labs/FLUX.2-dev</code><code>black-forest-labs/FLUX.2-dev-NVFP4</code><code>black-forest-labs/FLUX.2-klein-4B</code><code>black-forest-labs/FLUX.2-klein-9B</code><code>black-forest-labs/FLUX.2-klein-base-4B</code><code>black-forest-labs/FLUX.2-klein-base-9B</code></div></td> </tr> <tr> <td>Z-Image</td> <td><div className="sgd-id-list"><code>Tongyi-MAI/Z-Image</code><code>Tongyi-MAI/Z-Image-Turbo</code></div></td> </tr> <tr> <td>Qwen-Image</td> <td><div className="sgd-id-list"><code>Qwen/Qwen-Image</code><code>Qwen/Qwen-Image-2512</code><code>Qwen/Qwen-Image-Edit</code><code>Qwen/Qwen-Image-Edit-2509</code><code>Qwen/Qwen-Image-Edit-2511</code><code>Qwen/Qwen-Image-Layered</code></div></td> </tr> <tr> <td>SD3 / SD3.5</td> <td><div className="sgd-id-list"><code>stabilityai/stable-diffusion-3-medium</code><code>stabilityai/stable-diffusion-3-medium-diffusers</code><code>stabilityai/stable-diffusion-3.5-medium</code><code>stabilityai/stable-diffusion-3.5-medium-diffusers</code><code>stabilityai/stable-diffusion-3.5-large</code><code>stabilityai/stable-diffusion-3.5-large-diffusers</code></div></td> </tr> <tr> <td>SANA</td> <td><div className="sgd-id-list"><code>Efficient-Large-Model/SANA1.5_1.6B_1024px_diffusers</code><code>Efficient-Large-Model/SANA1.5_4.8B_1024px_diffusers</code><code>Efficient-Large-Model/Sana_1600M_1024px_diffusers</code><code>Efficient-Large-Model/Sana_600M_1024px_diffusers</code><code>Efficient-Large-Model/Sana_1600M_512px_diffusers</code><code>Efficient-Large-Model/Sana_600M_512px_diffusers</code></div></td> </tr> <tr> <td>FireRed-Image</td> <td><div 
className="sgd-id-list"><code>FireRedTeam/FireRed-Image-Edit-1.0</code><code>FireRedTeam/FireRed-Image-Edit-1.1</code></div></td> </tr> <tr> <td>JoyAI-Image</td> <td><div className="sgd-id-list"><code>jdopensource/JoyAI-Image-Edit-Diffusers</code></div></td> </tr> <tr> <td>Other image pipelines</td> <td><div className="sgd-id-list"><code>zai-org/GLM-Image</code><code>tencent/Hunyuan3D-2</code><code>baidu/ERNIE-Image</code><code>baidu/ERNIE-Image-Turbo</code><code>ideogram-ai/ideogram-4-fp8</code><code>ideogram-ai/ideogram-4-nf4</code><code>Comfy-Org/Ideogram-4</code></div></td> </tr> </tbody> </table> </div> </Tab> <Tab title="Video"> <div className="sgd-model-table-wrap"> <table className="sgd-model-table"> <thead> <tr> <th>Model family</th> <th>Model IDs</th> <th>Resolution / mode</th> <th>Optimization support</th> </tr> </thead> <tbody> <tr> <td>FastWan</td> <td><div className="sgd-id-list"><code>FastVideo/FastWan2.1-T2V-1.3B-Diffusers</code><code>FastVideo/FastWan2.2-TI2V-5B-FullAttn-Diffusers</code><code>FastVideo/FastWan2.2-TI2V-5B-Diffusers</code></div></td> <td>480p / 720p</td> <td><span className="sgd-chip">VSA</span></td> </tr> <tr> <td>Wan2.2</td> <td><div className="sgd-id-list"><code>Wan-AI/Wan2.2-TI2V-5B-Diffusers</code><code>Wan-AI/Wan2.2-T2V-A14B-Diffusers</code><code>nvidia/Wan2.2-T2V-A14B-Diffusers-NVFP4</code><code>Wan-AI/Wan2.2-I2V-A14B-Diffusers</code></div></td> <td>TI2V / T2V / I2V, 480p / 720p</td> <td><span className="sgd-chip">Sage</span><span className="sgd-chip">Laser</span><span className="sgd-chip">BSA</span><span className="sgd-chip">Rain Fusion</span></td> </tr> <tr> <td>HunyuanVideo</td> <td><div className="sgd-id-list"><code>hunyuanvideo-community/HunyuanVideo</code><code>FastVideo/FastHunyuan-diffusers</code></div></td> <td>720×1280 / 544×960</td> <td><span className="sgd-chip">Tile</span><span className="sgd-chip">Sage</span><span className="sgd-chip">SVG2</span></td> </tr> <tr> <td>Wan2.1</td> <td><div 
className="sgd-id-list"><code>Wan-AI/Wan2.1-T2V-1.3B-Diffusers</code><code>Wan-AI/Wan2.1-T2V-14B-Diffusers</code><code>Wan-AI/Wan2.1-I2V-14B-480P-Diffusers</code><code>Wan-AI/Wan2.1-I2V-14B-720P-Diffusers</code></div></td> <td>T2V / I2V, 480p / 720p</td> <td><span className="sgd-chip">TeaCache</span><span className="sgd-chip">Tile</span><span className="sgd-chip">Sage</span><span className="sgd-chip">SVG2</span><span className="sgd-chip">Laser</span><span className="sgd-chip">BSA</span><span className="sgd-chip">Rain Fusion</span></td> </tr> <tr> <td>TurboWan</td> <td><div className="sgd-id-list"><code>IPostYellow/TurboWan2.1-T2V-1.3B-Diffusers</code><code>IPostYellow/TurboWan2.1-T2V-14B-Diffusers</code><code>IPostYellow/TurboWan2.1-T2V-14B-720P-Diffusers</code><code>IPostYellow/TurboWan2.2-I2V-A14B-Diffusers</code></div></td> <td>480p / 720p</td> <td><span className="sgd-chip">TeaCache</span><span className="sgd-chip">SLA</span><span className="sgd-chip">SageSLA</span></td> </tr> <tr> <td>MOVA</td> <td><div className="sgd-id-list"><code>OpenMOSS-Team/MOVA-360p</code><code>OpenMOSS-Team/MOVA-720p</code></div></td> <td>Video-audio, 360p / 720p; local MOVA detector aliases are also supported.</td> <td><span className="sgd-muted">No dedicated optimization listed</span></td> </tr> <tr> <td>Wan2.1 Fun</td> <td><div className="sgd-id-list"><code>weizhou03/Wan2.1-Fun-1.3B-InP-Diffusers</code></div></td> <td>480p inpainting</td> <td><span className="sgd-chip">TeaCache</span><span className="sgd-chip">Tile</span><span className="sgd-chip">Sage</span><span className="sgd-chip">SVG2</span></td> </tr> <tr> <td>Helios</td> <td><div className="sgd-id-list"><code>BestWishYsh/Helios-Base</code><code>BestWishYsh/Helios-Mid</code><code>BestWishYsh/Helios-Distilled</code></div></td> <td>720p</td> <td><span className="sgd-muted">No dedicated optimization listed</span></td> </tr> <tr> <td>LTX-2</td> <td><div 
className="sgd-id-list"><code>Lightricks/LTX-2</code><code>Lightricks/LTX-2.3</code></div></td> <td>One-stage, two-stage, TI2V, HQ</td> <td><span className="sgd-muted">No dedicated optimization listed</span></td> </tr> <tr> <td>Cosmos3</td> <td><div className="sgd-id-list"><code>nvidia/Cosmos3-Nano</code><code>nvidia/Cosmos3-Super</code><code>nvidia/Cosmos3-Super-Text2Image</code><code>nvidia/Cosmos3-Super-Image2Video</code></div></td> <td>T2V / I2V / T2I</td> <td><span className="sgd-muted">No dedicated optimization listed</span></td> </tr> </tbody> </table> </div> </Tab> <Tab title="Realtime / World"> <div className="sgd-model-table-wrap"> <table className="sgd-model-table"> <thead> <tr> <th>Model family</th> <th>Model IDs / detector</th> <th>Notes</th> </tr> </thead> <tbody> <tr> <td>LingBotWorld</td> <td><div className="sgd-id-list"><code>robbyant/lingbot-world-fast-diffusers</code></div></td> <td>Realtime world model with causal state and control tokens.</td> </tr> <tr> <td>SANA-WM</td> <td><div className="sgd-id-list"><code>Efficient-Large-Model/SANA-WM_bidirectional</code><code>Efficient-Large-Model/SANA-WM_streaming</code></div></td> <td>World-model pipeline with bidirectional and streaming checkpoints.</td> </tr> </tbody> </table> </div> </Tab> </Tabs> <Note> Wan2.2 TI2V 5B currently has known quality issues when used for I2V generation. </Note>The detailed video matrix uses these symbols:
Optimization columns are abbreviated to keep the matrix readable:
- Tea = TeaCache
- Tile = Sliding Tile Attention
- Sage = Sage Attention
- VSA = Video Sparse Attention
- SLA = Sparse Linear Attention
- SageSLA = Sage Sparse Linear Attention
- SVG2 = Sparse Video Gen 2
- LA = Laser Attention
- BSA = Block Sparse Attention
- RF = Rain Fusion Attention

Note:
- `pip install git+https://github.com/thu-ml/SpargeAttn.git --no-build-isolation`
- `--pipeline-class-name LTX2Pipeline`
- `--pipeline-class-name LTX2TwoStagePipeline`
- `--pipeline-class-name LTX2TwoStageHQPipeline` (HQ defaults to 1920×1088; you can still override `--width`/`--height`)
- `--image-path`) on one-stage and two-stage pipelines (including HQ).
- `--spatial-upsampler-path` and `--distilled-lora-path`.
- The Resolutions column uses output video width×height semantics, matching `sglang generate --width ... --height ...`.
- `--ltx2-two-stage-device-mode {original,resident}`:
  - `original` keeps official two-stage semantics without the premerged stage-2 transformer path.
  - `resident` usually provides the best latency/throughput but uses much more VRAM.
  - `resident` on H200/high-memory CUDA GPUs, otherwise `original`.
  - `snapshot` is accepted as an alias for `original` and may be removed after two release cycles.
- `nvidia/Cosmos3-Nano` (8B) and `nvidia/Cosmos3-Super` (32B). Both share the same pipeline; the only difference is transformer depth and width, picked up from `transformer/config.json` at load time. A single checkpoint serves T2V, I2V (`--image-path`), and T2I (`--num-frames 1`).

SGLang Diffusion supports overriding individual pipeline components with `--<component>-path`. The value can be either a Hugging Face repo ID or a local component directory.
The same overrides can also be provided in config files through `component_paths.<component>`.
CLI:
sglang generate \
  --model-path black-forest-labs/FLUX.2-dev \
  --vae-path black-forest-labs/FLUX.2-small-decoder \
  --transformer-path /models/flux2/transformer
Config file:
model_path: black-forest-labs/FLUX.2-dev
component_paths:
  vae: black-forest-labs/FLUX.2-small-decoder
  transformer: /models/flux2/transformer
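The CLI and config forms above carry the same information, so one can be derived from the other. The following is a minimal sketch, not SGLang's own code: `component_flags` is a hypothetical helper, and it assumes that a component key such as `text_encoder_2` maps to a flag by replacing underscores with hyphens and appending `-path`, which matches the flag names listed on this page.

```python
def component_flags(component_paths):
    """Turn a component_paths-style mapping into --<component>-path CLI arguments."""
    args = []
    for name, path in component_paths.items():
        # Assumption: underscores in component keys become hyphens in flag names.
        flag = "--" + name.replace("_", "-") + "-path"
        args.extend([flag, path])
    return args


# Same values as the config-file example above.
overrides = {
    "vae": "black-forest-labs/FLUX.2-small-decoder",
    "transformer": "/models/flux2/transformer",
}

command = ["sglang", "generate", "--model-path", "black-forest-labs/FLUX.2-dev"]
command += component_flags(overrides)
print(" ".join(command))
```

The resulting argument list can be handed to `subprocess.run` or pasted into a shell; keeping the overrides in one mapping makes it easier to keep CLI invocations and config files in sync.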
Use the component name from the pipeline's `model_index.json` or the native pipeline's registered module name.
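To find candidate component names for a Diffusers-format checkpoint, you can read them straight out of `model_index.json`. The sketch below assumes the usual Diffusers layout, where keys beginning with an underscore (such as `_class_name`) are pipeline metadata and every other key names a component. The helper `list_components` and the sample contents are hypothetical and only illustrate the idea; whether a given component can actually be overridden still depends on the pipeline.

```python
import json


def list_components(model_index):
    """Return component names from parsed model_index.json contents."""
    # Keys starting with "_" (e.g. "_class_name") are metadata, not components.
    return sorted(key for key in model_index if not key.startswith("_"))


# Hypothetical model_index.json contents, trimmed for illustration.
sample = json.loads("""
{
  "_class_name": "ExamplePipeline",
  "_diffusers_version": "0.0.0",
  "scheduler": ["diffusers", "ExampleScheduler"],
  "text_encoder": ["transformers", "ExampleTextEncoder"],
  "transformer": ["diffusers", "ExampleTransformer"],
  "vae": ["diffusers", "ExampleVAE"]
}
""")

components = list_components(sample)
print(components)
```

For a real checkpoint, replace the inline sample with `json.load` on the `model_index.json` file in the model directory.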
The table below lists concrete Hugging Face component repos that are already used in SGLang Diffusion docs or tests. It is not an exhaustive catalog of all compatible component repos.
<table style={{width: "100%", borderCollapse: "collapse", tableLayout: "fixed"}}> <colgroup> <col style={{width: "24%"}} /> <col style={{width: "20%"}} /> <col style={{width: "28%"}} /> <col style={{width: "28%"}} /> </colgroup> <thead> <tr style={{borderBottom: "2px solid #d55816"}}> <th style={{textAlign: "left", padding: "10px 12px", fontWeight: 700, whiteSpace: "nowrap", backgroundColor: "rgba(255,255,255,0.02)"}}>Base Model</th> <th style={{textAlign: "left", padding: "10px 12px", fontWeight: 700, whiteSpace: "nowrap", backgroundColor: "rgba(255,255,255,0.05)"}}>Override Key</th> <th style={{textAlign: "left", padding: "10px 12px", fontWeight: 700, whiteSpace: "nowrap", backgroundColor: "rgba(255,255,255,0.02)"}}>Example Repo</th> <th style={{textAlign: "left", padding: "10px 12px", fontWeight: 700, whiteSpace: "nowrap", backgroundColor: "rgba(255,255,255,0.05)"}}>Notes</th> </tr> </thead> <tbody> <tr> <td style={{padding: "9px 12px", fontWeight: 500, backgroundColor: "rgba(255,255,255,0.02)"}}><code>black-forest-labs/FLUX.2-dev</code></td> <td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.05)"}}><code>vae</code></td> <td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.02)"}}><code>black-forest-labs/FLUX.2-small-decoder</code></td> <td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.05)"}}>Decoder-only FLUX.2 VAE override</td> </tr> <tr> <td style={{padding: "9px 12px", fontWeight: 500, backgroundColor: "rgba(255,255,255,0.02)"}}><code>black-forest-labs/FLUX.2-dev</code></td> <td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.05)"}}><code>vae</code></td> <td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.02)"}}><code>fal/FLUX.2-Tiny-AutoEncoder</code></td> <td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.05)"}}>Existing tested custom VAE path</td> </tr> </tbody> </table>

- `--vae-path` is the common image-generation override.
- `--video-vae-path` and `--audio-vae-path` are only relevant for pipelines with separate video or audio VAEs.
- `--transformer-path` is the standard override for the main denoising transformer.
- `--transformer-path` or `--transformer-weights-path`; see quantization.md.
- `--video-dit-path` and `--audio-dit-path` are only for pipelines that split denoisers by modality.
- `--text-encoder-path` and `--text-encoder-2-path` override primary and secondary text encoders.
- `--tokenizer-path`, `--processor-path`, and `--image-processor-path` are useful when the replacement encoder requires matching preprocessing assets.
- `--scheduler-path` is only relevant when the pipeline exposes a scheduler component.
- `--spatial-upsampler-path` is mainly for two-stage pipelines such as `LTX2TwoStagePipeline`.
- `--vocoder-path`, `--connectors-path`, `--dual-tower-bridge-path`, `--image-encoder-path`, and `--vision-language-encoder-path` are only valid for pipelines that expose those components.
- `model_index.json` or the native pipeline's registered module name.

This section lists example LoRAs that have been explicitly tested and verified with each base model in the SGLang Diffusion pipeline.
<Info> LoRAs that are not listed here are not necessarily incompatible. In practice, most standard LoRAs are expected to work, especially those following common Diffusers or SD-style conventions. The entries below simply reflect configurations that have been manually validated by the SGLang team. </Info>