docs/contributing/ci/failures.md
What should I do when a CI job fails on my PR, but I don't think my PR caused the failure?
Check the dashboard of current CI test failures:
CI Failures Dashboard: https://github.com/orgs/vllm-project/projects/20
If your failure is already listed, it's likely unrelated to your PR. Help fixing it is always welcome!
If your failure is not listed, you should file an issue.
File a bug report:
New CI Failure Report: https://github.com/vllm-project/vllm/issues/new?template=400-bug-report.yml
Use this title format:
[CI Failure]: failing-test-job - regex/matching/failing:test
For the environment field:
Still failing on main as of commit abcdef123
In the description, include failing tests:
FAILED failing/test.py:failing_test1 - Failure description
FAILED failing/test.py:failing_test2 - Failure description
FAILED failing/test.py:failing_test3 - Failure description
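One way to collect the failing-test lines for the description is to filter them out of a log file. The sketch below is only an illustration: `list_failed_tests` is a hypothetical helper, not part of the repository, and it assumes the log has no timestamp prefixes, so each pytest summary line starts with `FAILED `.

```shell
# Hypothetical helper (not part of the vLLM repository).
# Prints the unique pytest "FAILED ..." summary lines found in a log file.
# Assumption: timestamps are already stripped, so summary lines start
# with the literal text "FAILED ".
list_failed_tests() {
    grep -E '^FAILED ' "$1" | sort -u
}
```

Calling `list_failed_tests ci.log` prints one line per distinct failing test, ready to paste into the issue description.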
Attach logs (collapsible section example):
<details>
<summary>Logs:</summary>
ERROR 05-20 03:26:38 [dump_input.py:68] Dumping input data
--- Logging error ---
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 203, in execute_model
    return self.model_executor.execute_model(scheduler_output)
...
FAILED failing/test.py:failing_test1 - Failure description
FAILED failing/test.py:failing_test2 - Failure description
FAILED failing/test.py:failing_test3 - Failure description
</details>
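A collapsible block like the example above can also be generated from a log file rather than assembled by hand. This is a minimal sketch: `log_to_details` is a hypothetical helper, the default of 50 lines is an arbitrary assumption, and the inner fence uses tildes so the log renders as preformatted text.

```shell
# Hypothetical helper (not part of the vLLM repository).
# Wraps the last N lines of a log file in a collapsible <details> block.
# Usage: log_to_details <log-file> [line-count]   (line-count defaults to 50)
log_to_details() {
    local log_file="$1"
    local lines="${2:-50}"
    printf '<details>\n<summary>Logs:</summary>\n\n'
    printf '~~~text\n'
    tail -n "$lines" "$log_file"
    printf '~~~\n\n</details>\n'
}
```

The output can be pasted directly into the issue body, or piped to a clipboard tool.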
Logs are public; no Buildkite login needed. The .buildkite/scripts/ci-fetch-log.sh script
saves each log as ci-<build>-<job-name>.log, stripped of timestamps and
ANSI codes:
# All failed jobs in a PR's latest build (current branch's PR if omitted):
.buildkite/scripts/ci-fetch-log.sh --pr <PR>
# All failed jobs in a build (--soft also includes soft-failed jobs;
# --all fetches every finished job):
.buildkite/scripts/ci-fetch-log.sh "https://buildkite.com/vllm/ci/builds/<N>"
# One job: `gh pr checks` URLs (#<job_uuid>) and web UI URLs (?sid=) both
# work; pass "-" as a second argument to stream to stdout:
.buildkite/scripts/ci-fetch-log.sh "https://buildkite.com/vllm/ci/builds/<N>#<job_uuid>"
To clean an already-downloaded log:
.buildkite/scripts/ci-clean-log.sh
./ci-clean-log.sh ci.log
Use a tool such as wl-clipboard for quick copy-pasting:
tail -n 525 ci_build.log | wl-copy
CI test failures may be flaky. Use a bash loop to run repeatedly:
.buildkite/scripts/rerun-test.sh
./rerun-test.sh "tests/v1/engine/test_engine_core_client.py::test_kv_cache_events[True-tcp]"
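The idea of rerunning a flaky test in a loop can be sketched as follows. This is not the contents of rerun-test.sh: `rerun_until_fail` is a hypothetical function that runs any command up to a given number of times and stops at the first failure.

```shell
# Hypothetical helper (NOT the contents of rerun-test.sh).
# Runs a command up to N times and stops at the first failing attempt.
# Usage: rerun_until_fail <max-runs> <command> [args...]
rerun_until_fail() {
    local max_runs="$1"
    shift
    local i=1
    while [ "$i" -le "$max_runs" ]; do
        echo "=== run ${i}/${max_runs} ==="
        if ! "$@"; then
            echo "failed on run ${i}"
            return 1
        fi
        i=$((i + 1))
    done
    echo "passed ${max_runs} consecutive runs"
}
```

For example, `rerun_until_fail 20 pytest "tests/v1/engine/test_engine_core_client.py::test_kv_cache_events[True-tcp]"` would report the first attempt on which the test fails. The test ID is quoted so the shell does not treat the square brackets as a glob pattern.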
If you submit a PR to fix a CI failure:
Add "Closes #12345" to the PR description.
Add the ci-failure label:
This helps track it in the CI Failures GitHub Project.
Use Buildkite analytics (2-day view) to identify recent test failures on main.
Compare to the CI Failures Dashboard.