docs/platforms/ascend/ascend_contribution_guide.md
Welcome to SGLang! We appreciate your interest in contributing. This guide provides a concise overview of how to set up your environment, run tests, build documentation, and open a Pull Request (PR). Whether you’re fixing a small bug or developing a major feature, we encourage following these steps for a smooth contribution process.
Before contributing, please ensure that your environment is set up correctly. Follow the steps in the Installation Guide to install the necessary dependencies. We recommend using Docker to build the environment.
Note: New contributors do not have write permission to push to the official SGLang repo. Please fork the repository under your GitHub account, then clone your fork locally.
git clone https://github.com/<your_user_name>/sglang.git
# if you are using docker, the environment is already set up.
cd sglang
export PYTHONPATH=$PWD/python:$PYTHONPATH
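To confirm that the PYTHONPATH export took effect, a quick check along the following lines may help. This is a minimal sketch: the helper name `is_importable` is hypothetical and not part of SGLang, and it only tests whether Python can locate a top-level module.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "sglang" should resolve once $PWD/python is on PYTHONPATH.
    print("sglang importable:", is_importable("sglang"))
```

If the check reports False, the export was most likely run from a directory other than the repository root.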
We use pre-commit to enforce a consistent code style. Before pushing your changes, please run:
pip3 install pre-commit
pre-commit install
pre-commit run --all-files
pre-commit run --all-files manually runs all configured checks, applying fixes if possible. If it fails the first time, re-run it to ensure lint errors are fully resolved. Make sure your code passes all checks before creating a Pull Request.
Do not commit directly to the main branch. Always create a new branch (e.g., feature/my-new-feature), push your changes, and open a PR from that branch.
If you add a new feature or fix a bug, please add corresponding unit tests to ensure coverage and prevent regression. SGLang uses Python's built-in unittest framework. For detailed instructions on running tests and integrating them into CI, refer to test/README.md.
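As an illustration of the unittest style mentioned above, a test file might look roughly like this. It is only a sketch: `clamp` is a hypothetical stand-in for the code under test, and real SGLang tests should follow the conventions described in test/README.md.

```python
import unittest


def clamp(value: int, low: int, high: int) -> int:
    """Hypothetical helper standing in for the code under test."""
    return max(low, min(value, high))


class TestClamp(unittest.TestCase):
    def test_value_inside_range_is_unchanged(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    def test_value_outside_range_is_clipped(self):
        # One case below the range and one above it, to guard both branches.
        self.assertEqual(clamp(-3, 0, 10), 0)
        self.assertEqual(clamp(42, 0, 10), 10)
```

A file like this can be run with python3 -m unittest followed by the path to the file.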
If you need a model that is not listed in python/sglang/test/ascend/test_ascend_utils.py, follow these steps:
modelscope download \
  --model {your_model_repo}/{your_model} \
  --local_dir /data/ascend-ci-share-pkking-sglang/modelscope/hub/models/{your_model_repo}/{your_model}
Note: If you don’t have access to the CI server, please ask the maintainers ([email protected]) to download your model.
Add the model to python/sglang/test/ascend/test_ascend_utils.py (use the Docker path "/root/.cache/modelscope/hub/models/{your_model_repo}/{your_model}").
We recommend that new contributors start by writing documentation, which helps you quickly understand the SGLang codebase. For more details, please refer to docs/README.md.
If your code changes the model output, please run the accuracy tests. A quick sanity check is the few-shot GSM8K.
# Launch a server
python3 -m sglang.launch_server --model Qwen/Qwen2-7B-Instruct
# Evaluate
python3 -m sglang.test.few_shot_gsm8k --num-questions 200
Please note that the above script is primarily a sanity check, not a rigorous accuracy or speed test. This test can have significant variance (1%–5%) in accuracy due to batching and the non-deterministic nature of the inference engine. Also, do not rely on the "Latency/Output throughput" from this script, as it is not a proper speed test.
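To put the variance figure in perspective, the arithmetic can be sketched as follows, assuming the 1% to 5% range is read as percentage points of accuracy. The function name is hypothetical.

```python
def questions_for_swing(num_questions: int, swing_percent: float) -> int:
    """Number of answers that must flip to move accuracy by swing_percent points."""
    return round(num_questions * swing_percent / 100)


# With --num-questions 200, each question is worth 0.5 percentage points.
low = questions_for_swing(200, 1.0)
high = questions_for_swing(200, 5.0)
print(f"A 1%-5% swing on 200 questions corresponds to {low} to {high} answers.")
```

In other words, a difference of a handful of answers between two runs falls within the stated variance and should not by itself be read as a regression.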
GSM8K is too easy for state-of-the-art models nowadays. Please try your own more challenging accuracy tests. You can find additional accuracy eval examples in:
To benchmark speed, refer to Benchmark and Profiling.
You can follow the pull request merge process described in MAINTAINER.md. You will need to work with the Merge Oncall, Codeowner, and other reviewers to get their approvals. Then your PR can be merged.
We have a lot of open PRs but limited CI machines, so only top and trusted contributors have permission to trigger CI tests. Users with permission are listed in CI_PERMISSIONS.json.
For CI to run on a pull request, it must have the "run-ci" label. Authorized users can add the label or rerun failed tests by commenting on the PR with one of these commands:
/tag-run-ci-label: Adds the "run-ci" label. Every future commit will trigger CI.
/rerun-failed-ci: Reruns the failed or flaky tests from the most recent commit.
/tag-and-rerun-ci: A single command that performs both /tag-run-ci-label and /rerun-failed-ci.
/rerun-stage <stage-name>: Reruns a specific test stage without waiting for its dependencies. This is useful when you want to quickly validate a fix for a specific test failure instead of waiting ~30 minutes for preceding stages to complete.
If you have permission, the Slash Command Handler will run your command and react with a 👍 to your comment. It may take up to a few minutes for the reaction to appear. Here’s a usage example.
To avoid spamming a PR with too many /rerun-failed-ci comments, you can also trigger the command by editing an existing comment and adding any suffix (e.g., /rerun-failed-ci try again).
Example of rerunning a single test stage: /rerun-stage unit-test-backend-4-gpu.
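The shape of these commands (a command name, optionally followed by a stage name or a free-form suffix) can be illustrated with a small parser. This is a hypothetical sketch, not the actual Slash Command Handler: the parsing rules, `KNOWN_COMMANDS`, `SlashCommand`, and `parse_slash_command` are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional

# Commands named in this guide; the parsing rules below are an assumption.
KNOWN_COMMANDS = {
    "/tag-run-ci-label",
    "/rerun-failed-ci",
    "/tag-and-rerun-ci",
    "/rerun-stage",
}


class SlashCommand(NamedTuple):
    name: str
    argument: str  # text after the command, e.g. a stage name or a free-form suffix


def parse_slash_command(comment: str) -> Optional[SlashCommand]:
    """Return the command on the first line of a comment, or None if there is none."""
    stripped = comment.strip()
    if not stripped:
        return None
    first_line = stripped.splitlines()[0]
    name, _, rest = first_line.partition(" ")
    if name not in KNOWN_COMMANDS:
        return None
    return SlashCommand(name, rest.strip())


print(parse_slash_command("/rerun-stage unit-test-backend-4-gpu"))
print(parse_slash_command("/rerun-failed-ci try again"))
```

Under this reading, the suffix in /rerun-failed-ci try again is simply carried along as the argument, while for /rerun-stage the argument is the stage name.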
If you don’t have permission, please ask maintainers to trigger CI for you.
Due to CI scheduling and limited resources, higher-priority PRs may preempt running jobs. In such cases, you may need to rerun the tests.
We apply CI rate limits to prevent abuse and ensure fair usage of our CI resources.
Each CI workflow has a default limit defined in its workflow configuration file. For example, in pr-gate.yml, the default cooldown period is 120 minutes, and each workflow can override it via the cool-down-minutes input parameter:
cool-down-minutes:
  description: "Cooldown period in minutes for low-permission users; 0 disables rate limiting"
  type: number
  default: 120
Users listed in CI_PERMISSIONS.json may have a per-user cooldown interval. In practice, we use the minimum of the workflow’s default window and the user-specific interval.
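The rule above can be sketched in a few lines. This is an illustrative reading of the sentence rather than the actual CI code: the function name is hypothetical, and representing a user without a per-user entry as None is an assumption.

```python
from typing import Optional


def effective_cooldown_minutes(
    workflow_default: int, user_interval: Optional[int] = None
) -> int:
    """Pick the smaller of the workflow default and the per-user interval.

    A user_interval of None means the user has no per-user entry (an assumption
    about how a missing entry would be represented).
    """
    if user_interval is None:
        return workflow_default
    return min(workflow_default, user_interval)


print(effective_cooldown_minutes(120))       # 120
print(effective_cooldown_minutes(120, 30))   # 30
print(effective_cooldown_minutes(120, 240))  # 120
```

Taking the minimum means a per-user interval can only shorten the wait relative to the workflow default, never lengthen it.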
Avoid device-to-host synchronization calls such as tensor.item() or tensor.cpu() whenever possible. Use vectorized code.
scheduler.py, scheduler_output_processor_mixin.py)
test_eagle_infer_a.py, test_eagle_infer_b.py).
allocator_npu.py).
Since sglang and sgl-kernel are separate Python packages, our current GitHub CI infrastructure does not support updating a kernel and using it immediately within the same pull request (PR).
To add a new kernel or modify an existing one in the sgl-kernel/ source tree, you must use multiple PRs.
Follow these steps:
Release the sglang-kernel wheel to PyPI.
Update the sglang-kernel version in sglang/python/pyproject.toml to use the modified kernels.
Sgl-kernel-npu is the kernel package for Ascend NPU and is maintained in the sgl-kernel-npu repository. If you want to add a new kernel and use it in sglang, please follow the steps in its Contribution Guide.
If you want to contribute but don’t have a specific idea in mind, pick issues labeled “good first issue” or “help wanted”. These tasks typically have lower complexity and provide an excellent introduction to the codebase. Also check out this code walk-through for a deeper look into SGLang’s workflow.
If you have any questions or want to start a discussion, please feel free to ask in our Slack channel.
Thank you for your interest in SGLang. Happy coding!