SGLang Cookbook

A community-maintained repository of practical guides and recipes for deploying and using SGLang in production environments. Our mission is simple: answer the question "How do I run model X with SGLang on hardware Y for task Z?" with clear, actionable solutions.

🎯 What You'll Find Here

This cookbook aggregates battle-tested SGLang recipes covering:

  • Models: Mainstream LLMs and Vision-Language Models (VLMs)
  • Use Cases: Inference serving, deployment strategies, multimodal applications
  • Hardware: GPU and CPU configurations, optimization for different accelerators
  • Best Practices: Configuration templates, performance tuning, troubleshooting guides

Each recipe provides step-by-step instructions to help you quickly implement SGLang solutions for your specific requirements.

Guides

Autoregressive Models

Qwen

  • Qwen3.5 <span style={{backgroundColor: '#fde8e2', color: '#C5602D', padding: '2px 8px', borderRadius: '4px', fontSize: '12px', fontWeight: 'normal', marginLeft: '8px'}}>NEW</span>
  • Qwen3
  • Qwen3-Next
  • Qwen3-VL
  • Qwen3-Coder
  • Qwen3-Coder-Next <span style={{backgroundColor: '#fde8e2', color: '#C5602D', padding: '2px 8px', borderRadius: '4px', fontSize: '12px', fontWeight: 'normal', marginLeft: '8px'}}>NEW</span>
  • Qwen2.5-VL

DeepSeek

Llama

GLM

  • GLM-Glyph
  • GLM-5 <span style={{backgroundColor: '#fde8e2', color: '#C5602D', padding: '2px 8px', borderRadius: '4px', fontSize: '12px', fontWeight: 'normal', marginLeft: '8px'}}>NEW</span>
  • GLM-OCR <span style={{backgroundColor: '#fde8e2', color: '#C5602D', padding: '2px 8px', borderRadius: '4px', fontSize: '12px', fontWeight: 'normal', marginLeft: '8px'}}>NEW</span>
  • GLM-4.5
  • GLM-4.5V
  • GLM-4.6
  • GLM-4.6V
  • GLM-4.7
  • GLM-4.7-Flash <span style={{backgroundColor: '#fde8e2', color: '#C5602D', padding: '2px 8px', borderRadius: '4px', fontSize: '12px', fontWeight: 'normal', marginLeft: '8px'}}>NEW</span>

OpenAI

Moonshotai

  • Kimi-K2.6 <span style={{backgroundColor: '#fde8e2', color: '#C5602D', padding: '2px 8px', borderRadius: '4px', fontSize: '12px', fontWeight: 'normal', marginLeft: '8px'}}>NEW</span>
  • Kimi-K2.5
  • Kimi-K2
  • Kimi-Linear

MiniMax

  • MiniMax-M2
  • MiniMax-M2.5 <span style={{backgroundColor: '#fde8e2', color: '#C5602D', padding: '2px 8px', borderRadius: '4px', fontSize: '12px', fontWeight: 'normal', marginLeft: '8px'}}>NEW</span>

NVIDIA

Ernie

InternVL

InternLM

Jina AI

Mistral

Xiaomi

FlashLabs

  • Chroma 1.0 <span style={{backgroundColor: '#fde8e2', color: '#C5602D', padding: '2px 8px', borderRadius: '4px', fontSize: '12px', fontWeight: 'normal', marginLeft: '8px'}}>NEW</span>

StepFun

  • Step-3.5-Flash <span style={{backgroundColor: '#fde8e2', color: '#C5602D', padding: '2px 8px', borderRadius: '4px', fontSize: '12px', fontWeight: 'normal', marginLeft: '8px'}}>NEW</span>
  • Step3-VL-10B <span style={{backgroundColor: '#fde8e2', color: '#C5602D', padding: '2px 8px', borderRadius: '4px', fontSize: '12px', fontWeight: 'normal', marginLeft: '8px'}}>NEW</span>

InclusionAI

  • Ling-2.5-1T <span style={{backgroundColor: '#fde8e2', color: '#C5602D', padding: '2px 8px', borderRadius: '4px', fontSize: '12px', fontWeight: 'normal', marginLeft: '8px'}}>NEW</span>
  • Ring-2.5-1T <span style={{backgroundColor: '#fde8e2', color: '#C5602D', padding: '2px 8px', borderRadius: '4px', fontSize: '12px', fontWeight: 'normal', marginLeft: '8px'}}>NEW</span>
  • LLaDA-2.1 <span style={{backgroundColor: '#fde8e2', color: '#C5602D', padding: '2px 8px', borderRadius: '4px', fontSize: '12px', fontWeight: 'normal', marginLeft: '8px'}}>NEW</span>

Diffusion Models

FLUX

Qwen-Image

Wan

Z-Image

Benchmarks

Reference

🚀 Quick Start

  1. Browse the recipe index above to find your model
  2. Follow the step-by-step instructions in each guide
  3. Adapt configurations to your specific hardware and requirements
  4. Join our community to share feedback and improvements
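As a concrete illustration of steps 1–3: most recipes boil down to launching an SGLang server for your model and sending it OpenAI-compatible requests. A minimal sketch (the model name and port here are placeholders; each recipe lists the recommended flags for its model and hardware):

```shell
# Launch an SGLang server (model path and extra flags vary per recipe)
python -m sglang.launch_server \
  --model-path Qwen/Qwen3-8B \
  --port 30000

# In another shell, query the OpenAI-compatible chat endpoint
curl http://localhost:30000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen/Qwen3-8B", "messages": [{"role": "user", "content": "Hello"}]}'
```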

🤝 Contributing

We believe the best documentation comes from practitioners. Whether you've optimized SGLang for a specific model, solved a tricky deployment challenge, or discovered performance improvements, we encourage you to contribute your recipes!

Ways to contribute:

  • Add a new recipe for a model not yet covered
  • Improve existing recipes with additional tips or configurations
  • Report issues or suggest enhancements
  • Share your production deployment experiences

To contribute:

<CodeGroup>
```bash Contribute a Recipe
# Fork the repo and clone locally
git clone https://github.com/YOUR_USERNAME/sglang-cookbook.git
cd sglang-cookbook

# Create a new branch
git checkout -b add-my-recipe

# Add your recipe following the template in DeepSeek-V3.2
# Submit a PR!
```
</CodeGroup>

🛠️ Local Development

Prerequisites

  • Node.js >= 20.0
  • npm or yarn

Setup and Run

Install dependencies and start the development server:

<CodeGroup>
```bash Local Development
# Install dependencies
npm install

# Start development server (hot reload enabled)
npm start
```
</CodeGroup>

The site will automatically open in your browser at http://localhost:3000.

📖 Resources

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.


Let's build this resource together! 🚀 Star the repo and contribute your recipes to help the SGLang community grow.