The table below lists every supported model and the optimizations available for each.
The symbols used have the following meanings:
The Hugging Face model ID can be passed directly to `from_pretrained()` methods, and sglang-diffusion will use the optimal default parameters when initializing and generating videos.
Optimization columns are abbreviated to keep the matrix readable:
- Tea = TeaCache
- Tile = Sliding Tile Attention
- Sage = Sage Attention
- VSA = Video Sparse Attention
- SLA = Sparse Linear Attention
- SageSLA = Sage Sparse Linear Attention
- SVG2 = Sparse Video Gen 2

Note:
- `pip install git+https://github.com/thu-ml/SpargeAttn.git --no-build-isolation`
- `--pipeline-class-name LTX2Pipeline`
- `--pipeline-class-name LTX2TwoStagePipeline`
- `--pipeline-class-name LTX2TwoStageHQPipeline` (HQ defaults to 1920×1088; you can still override `--width`/`--height`)
- `--image-path`) on one-stage and two-stage pipelines (including HQ).
- `--spatial-upsampler-path` and `--distilled-lora-path`.
- Resolutions column uses output video width×height semantics, matching `sglang generate --width ... --height ...`.
- `--ltx2-two-stage-device-mode {original,snapshot,resident}`:
  - `snapshot` is the default and recommended mode.
  - `resident` usually provides the best latency/throughput but uses much more VRAM.
  - `original` keeps official two-stage semantics without the premerged stage-2 transformer path.
  - `original` 154.67s, `snapshot` 114.05s, `resident` 75.71s; peak VRAM trend is `original < snapshot < resident`.

SGLang Diffusion supports overriding individual pipeline components with `--<component>-path`. The value can be either a Hugging Face repo ID or a local component directory.
The same overrides can also be provided in config files through `component_paths.<component>`.
CLI:
```bash
sglang generate \
  --model-path black-forest-labs/FLUX.2-dev \
  --vae-path black-forest-labs/FLUX.2-small-decoder \
  --transformer-path /models/flux2/transformer
```
Config file:
```yaml
model_path: black-forest-labs/FLUX.2-dev
component_paths:
  vae: black-forest-labs/FLUX.2-small-decoder
  transformer: /models/flux2/transformer
```
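The correspondence between the config-file keys and the CLI flags can be sketched in a few lines of Python. This is only an illustration, not part of SGLang Diffusion: the helpers `component_flag` and `build_generate_args` are hypothetical names, and the rule that underscores in a component name become hyphens in the flag (for example `text_encoder_2` to `--text-encoder-2-path`) is an assumption inferred from the flag names on this page.

```python
from typing import Dict, List


def component_flag(component: str) -> str:
    """Build the CLI flag for a component name, e.g. 'vae' -> '--vae-path'.

    Assumption: underscores in the component name map to hyphens in the flag.
    """
    return f"--{component.replace('_', '-')}-path"


def build_generate_args(model_path: str, component_paths: Dict[str, str]) -> List[str]:
    """Translate a config-style component_paths mapping into CLI arguments."""
    args = ["sglang", "generate", "--model-path", model_path]
    for component, path in component_paths.items():
        # Each entry becomes one "--<component>-path <value>" pair.
        args += [component_flag(component), path]
    return args


# Same values as the CLI and config-file examples above.
args = build_generate_args(
    "black-forest-labs/FLUX.2-dev",
    {
        "vae": "black-forest-labs/FLUX.2-small-decoder",
        "transformer": "/models/flux2/transformer",
    },
)
print(" ".join(args))
```

The printed command should match the CLI example, which makes a helper like this a convenient way to check that a config file and a command line describe the same overrides.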
Use the component name from the pipeline's model_index.json or the native pipeline's registered module name:
The table below lists concrete Hugging Face component repos that are already used in SGLang Diffusion docs or tests. It is not an exhaustive catalog of all compatible component repos.
<table style={{width: "100%", borderCollapse: "collapse", tableLayout: "fixed"}}> <colgroup> <col style={{width: "24%"}} /> <col style={{width: "20%"}} /> <col style={{width: "28%"}} /> <col style={{width: "28%"}} /> </colgroup> <thead> <tr style={{borderBottom: "2px solid #d55816"}}> <th style={{textAlign: "left", padding: "10px 12px", fontWeight: 700, whiteSpace: "nowrap", backgroundColor: "rgba(255,255,255,0.02)"}}>Base Model</th> <th style={{textAlign: "left", padding: "10px 12px", fontWeight: 700, whiteSpace: "nowrap", backgroundColor: "rgba(255,255,255,0.05)"}}>Override Key</th> <th style={{textAlign: "left", padding: "10px 12px", fontWeight: 700, whiteSpace: "nowrap", backgroundColor: "rgba(255,255,255,0.02)"}}>Example Repo</th> <th style={{textAlign: "left", padding: "10px 12px", fontWeight: 700, whiteSpace: "nowrap", backgroundColor: "rgba(255,255,255,0.05)"}}>Notes</th> </tr> </thead> <tbody> <tr> <td style={{padding: "9px 12px", fontWeight: 500, backgroundColor: "rgba(255,255,255,0.02)"}}><code>black-forest-labs/FLUX.2-dev</code></td> <td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.05)"}}><code>vae</code></td> <td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.02)"}}><code>black-forest-labs/FLUX.2-small-decoder</code></td> <td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.05)"}}>Decoder-only FLUX.2 VAE override</td> </tr> <tr> <td style={{padding: "9px 12px", fontWeight: 500, backgroundColor: "rgba(255,255,255,0.02)"}}><code>black-forest-labs/FLUX.2-dev</code></td> <td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.05)"}}><code>vae</code></td> <td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.02)"}}><code>fal/FLUX.2-Tiny-AutoEncoder</code></td> <td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.05)"}}>Existing tested custom VAE path</td> </tr> </tbody> </table>

- `--vae-path` is the common image-generation override.
- `--video-vae-path` and `--audio-vae-path` are only relevant for pipelines with separate video or audio VAEs.
- `--transformer-path` is the standard override for the main denoising transformer.
- `--transformer-path` or `--transformer-weights-path`; see quantization.md.
- `--video-dit-path` and `--audio-dit-path` are only for pipelines that split denoisers by modality.
- `--text-encoder-path` and `--text-encoder-2-path` override primary and secondary text encoders.
- `--tokenizer-path`, `--processor-path`, and `--image-processor-path` are useful when the replacement encoder requires matching preprocessing assets.
- `--scheduler-path` is only relevant when the pipeline exposes a scheduler component.
- `--spatial-upsampler-path` is mainly for two-stage pipelines such as `LTX2TwoStagePipeline`.
- `--vocoder-path`, `--connectors-path`, `--dual-tower-bridge-path`, `--image-encoder-path`, and `--vision-language-encoder-path` are only valid for pipelines that expose those components.
- `model_index.json` or the native pipeline's registered module name.

This section lists example LoRAs that have been explicitly tested and verified with each base model in the SGLang Diffusion pipeline.
<Info> LoRAs that are not listed here are not necessarily incompatible. In practice, most standard LoRAs are expected to work, especially those following common Diffusers or SD-style conventions. The entries below simply reflect configurations that have been manually validated by the SGLang team. </Info>