MindSpore is a high-performance AI framework optimized for Ascend NPUs. This doc describes how to run MindSpore models in SGLang.

MindSpore currently supports only Ascend NPU devices. Before proceeding, install the Ascend CANN software packages, which can be downloaded from the Ascend official website; version 8.3.RC2 is recommended.
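After installing CANN, it is worth confirming that the Ascend tooling is reachable before launching SGLang. The sketch below is illustrative (not from the official install docs): it only checks that the expected executables are on `PATH`.

<CodeGroup>

```python Verify Toolchain
import shutil

# Illustrative sanity check: `npu-smi` ships with the Ascend NPU driver;
# once it is found, run `npu-smi info` manually to list the devices.
# To verify the MindSpore install itself, MindSpore provides:
#   python3 -c "import mindspore; mindspore.run_check()"
for tool in ("npu-smi", "python3"):
    status = "found" if shutil.which(tool) else "MISSING"
    print(f"{tool}: {status}")
```

</CodeGroup>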
SGLang-MindSpore currently supports the Qwen3 and DeepSeek V3/R1 models. This doc uses Qwen3-8B as an example.
### Offline inference

Use the following script for offline inference:

<CodeGroup>

```python Offline Inference
import sglang as sgl

llm = sgl.Engine(
    model_path="/path/to/your/model",  # Local model path
    device="npu",                      # Use the NPU device
    model_impl="mindspore",            # MindSpore model implementation
    attention_backend="ascend",        # Attention backend
    tp_size=1,                         # Tensor parallelism size
    dp_size=1,                         # Data parallelism size
)

prompts = [
    "Hello, my name is",
    "The capital of France is",
    "The future of AI is",
]

sampling_params = {"temperature": 0, "top_p": 0.9}
outputs = llm.generate(prompts, sampling_params)

for prompt, output in zip(prompts, outputs):
    print(f"Prompt: {prompt}")
    print(f"Generated: {output['text']}")
    print("---")
```

</CodeGroup>
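The example above shares one `sampling_params` dict across the whole batch. SGLang's `Engine.generate` also accepts one dict per prompt; the sketch below shows the data shape only (the list-of-dicts form is an assumption based on common SGLang usage, so check it against your installed version).

<CodeGroup>

```python Per-prompt Sampling
# Illustrative only: per-prompt sampling parameters for a batched call.
# Engine.generate(prompts, sampling_params) is assumed to accept a list
# of dicts, one per prompt, in addition to a single shared dict.
prompts = ["Hello, my name is", "The capital of France is"]
per_prompt_params = [
    {"temperature": 0.7, "top_p": 0.9, "max_new_tokens": 64},  # creative
    {"temperature": 0, "max_new_tokens": 8},                   # deterministic
]
for p, sp in zip(prompts, per_prompt_params):
    print(f"{p!r} -> {sp}")
# outputs = llm.generate(prompts, per_prompt_params)
```

</CodeGroup>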
### Start server
Launch a server with the MindSpore backend:
<CodeGroup>
```bash Launch Server
# Basic server startup
python3 -m sglang.launch_server \
--model-path /path/to/your/model \
--host 0.0.0.0 \
--device npu \
--model-impl mindspore \
--attention-backend ascend \
--tp-size 1 \
--dp-size 1
```

</CodeGroup>
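Once the server is up, you can send requests to SGLang's native `/generate` endpoint. The sketch below only builds the request payload; the URL in the comment assumes SGLang's default port 30000, so adjust it to match your `--host`/`--port` flags.

<CodeGroup>

```python Query the Server
import json

# Payload for SGLang's native /generate endpoint. The field names follow
# standard SGLang usage: "text" is the prompt, "sampling_params" mirrors
# the offline-inference dict.
payload = {
    "text": "The capital of France is",
    "sampling_params": {"temperature": 0, "top_p": 0.9, "max_new_tokens": 32},
}
body = json.dumps(payload)
print(body)

# Send it with curl (default port 30000 is an assumption):
#   curl http://127.0.0.1:30000/generate \
#     -H "Content-Type: application/json" -d "$BODY"
```

</CodeGroup>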
For a distributed server spanning multiple nodes:
<CodeGroup>

```bash Multi-node Distributed
# Multi-node distributed server
python3 -m sglang.launch_server \
--model-path /path/to/your/model \
--host 0.0.0.0 \
--device npu \
--model-impl mindspore \
--attention-backend ascend \
--dist-init-addr 127.0.0.1:29500 \
--nnodes 2 \
--node-rank 0 \
--tp-size 4 \
--dp-size 2
```

</CodeGroup>

Enable SGLang debug logging with the `--log-level` argument:
<CodeGroup>

```bash Debug Mode
python3 -m sglang.launch_server \
--model-path /path/to/your/model \
--host 0.0.0.0 \
--device npu \
--model-impl mindspore \
--attention-backend ascend \
--log-level DEBUG
```

</CodeGroup>

Enable MindSpore info and debug logging by setting environment variables:
<CodeGroup>

```bash Set Log Level
export GLOG_v=1   # INFO
export GLOG_v=0   # DEBUG
```

</CodeGroup>

Use the following environment variable to explicitly select the devices to use:
<CodeGroup>

```shell Select Devices
export ASCEND_RT_VISIBLE_DEVICES=4,5,6,7   # restrict SGLang to NPUs 4-7
```

</CodeGroup>

In environments with special communication setups, set the following environment variable:
<CodeGroup>

```shell Disable LCCL
export MS_ENABLE_LCCL=off   # LCCL communication mode is not yet supported in SGLang-MindSpore
```

</CodeGroup>

If your environment has a conflicting protobuf version, set the following environment variable to avoid a binary version mismatch:
<CodeGroup>

```shell Fix Protobuf
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python   # avoid protobuf binary version mismatch
```

</CodeGroup>

For MindSpore-specific issues: