Headless mode allows you to run Qwen Code programmatically from command line scripts and automation tools without any interactive UI. This is ideal for scripting, automation, CI/CD pipelines, and building AI-powered tools.
Headless mode accepts prompts as command-line arguments or on stdin, returns text or JSON output, and supports file redirection and piping.
Use the --prompt (or -p) flag to run in headless mode:
qwen --prompt "What is machine learning?"
Pipe input to Qwen Code from your terminal:
echo "Explain this code" | qwen
Read from files and process with Qwen Code:
cat README.md | qwen --prompt "Summarize this documentation"
Reuse conversation context from the current project in headless scripts:
# Continue the most recent session for this project and run a new prompt
qwen --continue -p "Run the tests again and summarize failures"
# Resume a specific session ID directly (no UI)
qwen --resume 123e4567-e89b-12d3-a456-426614174000 -p "Apply the follow-up refactor"
[!note]
- Session data is project-scoped JSONL under ~/.qwen/projects/<sanitized-cwd>/chats.
- Restores conversation history, tool outputs, and chat-compression checkpoints before sending the new prompt.
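A script that wants to resume the exact session it started earlier needs the session ID. The sketch below pulls it from saved --output-format json output, whose messages carry a session_id field as shown later in this document. The helper name session_id_from_json is hypothetical, and it is an assumption that this session_id is the value --resume expects.

```shell
# Hypothetical helper: reads a JSON array (as produced by --output-format json)
# on stdin and prints the first string-valued session_id it finds.
session_id_from_json() {
  jq -r '[.[] | .session_id | strings] | first // empty'
}

# Possible usage (not executed here):
#   qwen -p "Run the tests" --output-format json > run.json
#   qwen --resume "$(session_id_from_json < run.json)" -p "Summarize failures"
```

If no message carries a session_id, the helper prints nothing, so callers can test for an empty string before resuming.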
You can change the main session system prompt for a single CLI run without editing shared memory files.
Use --system-prompt to replace Qwen Code's built-in main-session prompt for the current run:
qwen -p "Review this patch" --system-prompt "You are a terse release reviewer. Report only blocking issues."
Use --append-system-prompt to keep the built-in prompt and add extra instructions for this run:
qwen -p "Review this patch" --append-system-prompt "Be terse and focus on concrete findings."
You can combine both flags when you want a custom base prompt plus an extra run-specific instruction:
qwen -p "Summarize this repository" \
--system-prompt "You are a migration planner." \
--append-system-prompt "Return exactly three bullets."
[!note]
- --system-prompt applies only to the current run's main session.
- Loaded memory and context files such as QWEN.md are still appended after --system-prompt.
- --append-system-prompt is applied after the built-in prompt and loaded memory, and can be used together with --system-prompt.
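Longer system prompts are easier to maintain in a file than inline. The following is a minimal sketch of a wrapper that reads the prompt from a file and passes it through --system-prompt. The function name qwen_with_prompt_file and the QWEN_BIN override variable are hypothetical conveniences, not part of Qwen Code.

```shell
# Hypothetical wrapper: $1 is a file containing the system prompt,
# remaining arguments are passed to qwen unchanged.
# QWEN_BIN is an assumed override used only to make the sketch testable.
qwen_with_prompt_file() {
  prompt_file=$1
  shift
  if [ ! -r "$prompt_file" ]; then
    echo "cannot read system prompt file: $prompt_file" >&2
    return 1
  fi
  "${QWEN_BIN:-qwen}" --system-prompt "$(cat "$prompt_file")" "$@"
}

# Possible usage (not executed here):
#   qwen_with_prompt_file prompts/reviewer.txt -p "Review this patch"
```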
Qwen Code supports multiple output formats for different use cases:
Standard human-readable output:
qwen -p "What is the capital of France?"
Response format:
The capital of France is Paris.
Returns structured data as a JSON array. All messages are buffered and output together when the session completes. This format is ideal for programmatic processing and automation scripts.
The array contains several message types: system messages (session initialization), assistant messages (AI responses), and result messages (execution summary).
qwen -p "What is the capital of France?" --output-format json
Output (at end of execution):
[
{
"type": "system",
"subtype": "session_start",
"uuid": "...",
"session_id": "...",
"model": "qwen3-coder-plus",
...
},
{
"type": "assistant",
"uuid": "...",
"session_id": "...",
"message": {
"id": "...",
"type": "message",
"role": "assistant",
"model": "qwen3-coder-plus",
"content": [
{
"type": "text",
"text": "The capital of France is Paris."
}
],
"usage": {...}
},
"parent_tool_use_id": null
},
{
"type": "result",
"subtype": "success",
"uuid": "...",
"session_id": "...",
"is_error": false,
"duration_ms": 1234,
"result": "The capital of France is Paris.",
"usage": {...}
}
]
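Given the array structure above, a script can pick out the final answer and the success flag from the result message. This is a sketch that assumes jq is installed; the helper names final_result and run_succeeded are hypothetical, and only fields shown in the sample output are used.

```shell
# Reads the JSON array on stdin and prints the "result" text of the last
# message whose type is "result". Prints nothing if there is none.
final_result() {
  jq -r 'map(select(.type == "result")) | last | .result // empty'
}

# Succeeds (exit 0) only if a result message reports is_error == false.
run_succeeded() {
  jq -e 'any(.[]; .type == "result" and .is_error == false)' > /dev/null
}

# Possible usage (not executed here):
#   qwen -p "What is the capital of France?" --output-format json > out.json
#   run_succeeded < out.json && final_result < out.json
```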
Stream-JSON format emits JSON messages immediately as they occur during execution, enabling real-time monitoring. This format uses line-delimited JSON where each message is a complete JSON object on a single line.
qwen -p "Explain TypeScript" --output-format stream-json
Output (streaming as events occur):
{"type":"system","subtype":"session_start","uuid":"...","session_id":"..."}
{"type":"assistant","uuid":"...","session_id":"...","message":{...}}
{"type":"result","subtype":"success","uuid":"...","session_id":"..."}
When combined with --include-partial-messages, additional stream events (message_start, content_block_delta, etc.) are emitted as they occur, which supports real-time UI updates.
qwen -p "Write a Python script" --output-format stream-json --include-partial-messages
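Because each stream-json message is one JSON object per line, a consumer can process the stream with a plain read loop. The sketch below assumes jq is installed and relies only on the type and subtype fields shown above; the function name summarize_stream and the printed labels are hypothetical.

```shell
# Reads line-delimited JSON on stdin and prints one status line per message.
summarize_stream() {
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue
    msg_type=$(printf '%s\n' "$line" | jq -r '.type // "unknown"')
    case "$msg_type" in
      system)    echo "session started" ;;
      assistant) echo "assistant message received" ;;
      result)    echo "run finished: $(printf '%s\n' "$line" | jq -r '.subtype // "unknown"')" ;;
      *)         echo "other event: $msg_type" ;;
    esac
  done
}

# Possible usage (not executed here):
#   qwen -p "Explain TypeScript" --output-format stream-json | summarize_stream
```

Any message type the loop does not recognize, such as the extra events emitted with --include-partial-messages, falls through to the catch-all branch.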
The --input-format parameter controls how Qwen Code consumes input from standard input:
- text (default): Standard text input from stdin or command-line arguments
- stream-json: JSON message protocol via stdin for bidirectional communication

Note: Stream-json input mode is currently under construction and is intended for SDK integration. It requires --output-format stream-json to be set.
Save output to files or pipe to other commands:
# Save to file
qwen -p "Explain Docker" > docker-explanation.txt
qwen -p "Explain Docker" --output-format json > docker-explanation.json
# Append to file
qwen -p "Add more details" >> docker-explanation.txt
# Pipe to other tools
qwen -p "What is Kubernetes?" --output-format json | jq -r '.[] | select(.type == "result") | .result'
qwen -p "Explain microservices" | wc -w
qwen -p "List programming languages" | grep -i "python"
# Stream-JSON output for real-time processing
qwen -p "Explain Docker" --output-format stream-json | jq '.type'
qwen -p "Write code" --output-format stream-json --include-partial-messages | jq '.event.type'
Key command-line options for headless usage:
| Option | Description | Example |
|---|---|---|
| --prompt, -p | Run in headless mode | qwen -p "query" |
| --output-format, -o | Specify output format (text, json, stream-json) | qwen -p "query" --output-format json |
| --input-format | Specify input format (text, stream-json) | qwen --input-format text --output-format stream-json |
| --include-partial-messages | Include partial messages in stream-json output | qwen -p "query" --output-format stream-json --include-partial-messages |
| --system-prompt | Override the main session system prompt for this run | qwen -p "query" --system-prompt "You are a terse reviewer." |
| --append-system-prompt | Append extra instructions to the main session system prompt for this run | qwen -p "query" --append-system-prompt "Focus on concrete findings." |
| --debug, -d | Enable debug mode | qwen -p "query" --debug |
| --all-files, -a | Include all files in context | qwen -p "query" --all-files |
| --include-directories | Include additional directories | qwen -p "query" --include-directories src,docs |
| --yolo, -y | Auto-approve all actions | qwen -p "query" --yolo |
| --approval-mode | Set approval mode | qwen -p "query" --approval-mode auto_edit |
| --continue | Resume the most recent session for this project | qwen --continue -p "Pick up where we left off" |
| --resume [sessionId] | Resume a specific session (or choose interactively) | qwen --resume 123e... -p "Finish the refactor" |
| --max-session-turns | Cap the number of user/model/tool turns in the run | qwen -p "..." --max-session-turns 30 |
| --max-wall-time | Wall-clock budget; accepts 90 (s), 30s, 5m, 1h, 1.5h | qwen -p "..." --max-wall-time 10m |
| --max-tool-calls | Cumulative tool-call budget for the run | qwen -p "..." --max-tool-calls 50 |
For complete details on all available configuration options, settings files, and environment variables, see the Configuration Guide.
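The duration forms listed for --max-wall-time (90, 30s, 5m, 1h, 1.5h) can be illustrated with a small converter. This is a hypothetical re-implementation for scripts that want to validate a value before passing it on; it is not Qwen Code's own parser, and its truncation of fractional seconds is an assumption of the sketch.

```shell
# Hypothetical helper: converts a duration such as 90, 30s, 5m, 1h, or 1.5h
# to whole seconds. Exits non-zero on malformed input or values under 1s.
duration_to_seconds() {
  printf '%s\n' "$1" | awk '
    # Accept a non-negative decimal with an optional s/m/h suffix.
    !/^[0-9]+(\.[0-9]+)?[smh]?$/ { exit 2 }
    {
      unit = substr($0, length($0))
      factor = 1                      # bare numbers and "s" are seconds
      if (unit == "m") factor = 60
      if (unit == "h") factor = 3600
      total = ($0 + 0) * factor       # awk reads the leading numeric part
      if (total < 1) exit 3           # mirror the documented 1s minimum
      printf "%d\n", total            # fractional seconds are truncated
    }'
}

# Possible usage (not executed here):
#   duration_to_seconds "$WALL_TIME" >/dev/null && qwen -p "..." --max-wall-time "$WALL_TIME"
```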
Headless / CI runs combined with --yolo (or --approval-mode=yolo) auto-approve every tool call, including shell, write, and edit. --yolo does not enable a sandbox — those tools run at the host process's privilege level. When Qwen Code detects this combination with no sandbox configured, it prints a one-line warning to stderr at startup. Suppress the warning with QWEN_CODE_SUPPRESS_YOLO_WARNING=1 once you've reviewed the trade-off.
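Teams that would rather fail fast than rely on the startup warning could add a pre-flight check to their CI scripts. The sketch below is a hypothetical policy wrapper, not a Qwen Code feature: it inspects the arguments for the yolo flags named above and refuses to proceed unless --sandbox is present or QWEN_SANDBOX is set. Treating any non-empty QWEN_SANDBOX value other than 0 or false as "sandbox enabled" is a simplifying assumption.

```shell
# Hypothetical pre-flight check: returns 1 when the given qwen arguments
# request yolo approval and no sandbox appears to be configured.
yolo_preflight() {
  wants_yolo=0
  has_sandbox=0
  case "${QWEN_SANDBOX:-}" in
    ''|0|false) ;;                 # treated as "no sandbox" in this sketch
    *) has_sandbox=1 ;;
  esac
  prev=
  for arg in "$@"; do
    case "$arg" in
      --yolo|-y|--approval-mode=yolo) wants_yolo=1 ;;
      yolo) [ "$prev" = "--approval-mode" ] && wants_yolo=1 ;;
      --sandbox) has_sandbox=1 ;;
    esac
    prev=$arg
  done
  if [ "$wants_yolo" -eq 1 ] && [ "$has_sandbox" -eq 0 ]; then
    echo "refusing to run: yolo approval requested without a sandbox" >&2
    return 1
  fi
  return 0
}

# Possible usage (not executed here):
#   yolo_preflight -p "Fix lint errors" --yolo && qwen -p "Fix lint errors" --yolo
```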
Qwen Code can abort an unattended run when it crosses one of the following thresholds. Each is -1 (unlimited) by default; setting any one is enough to bound runaway behavior. They are enforced cooperatively against the same AbortController that already carries SIGINT, so a budget abort emits a structured FatalBudgetExceededError (exit code 55) — distinct from the turn-cap exit code 53 and SIGINT's 130 so CI scripts can branch on the reason.
| Flag | Settings key | What it bounds |
|---|---|---|
| --max-wall-time | model.maxWallTimeSeconds | Wall-clock duration of the whole run. Flag accepts 90 (s), 30s, 5m, 1h, 1.5h (fractional units supported). Minimum 1s; sub-second values are rejected as typos. The settings value is in seconds. |
| --max-tool-calls | model.maxToolCalls | Cumulative top-level tool calls dispatched by the main run loop (counts successes and failures, since the model still consumes tokens on errors). See "Scope" below for subagent / structured-output exemptions. |
| --max-session-turns | model.maxSessionTurns | Number of user/model/tool turns; pre-existing. Exits with code 53 on overrun (distinct from budget exit 55). |
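Since the exit codes distinguish the abort reason, a CI script can branch on them. The following is a minimal sketch using only the codes named above (53, 55, 130); the function name describe_qwen_exit and the message wording are hypothetical, and treating 0 as success follows the usual shell convention.

```shell
# Hypothetical helper: maps a qwen exit code to a short reason string.
describe_qwen_exit() {
  case "$1" in
    0)   echo "success" ;;
    53)  echo "turn cap exceeded (--max-session-turns)" ;;
    55)  echo "budget exceeded (--max-wall-time or --max-tool-calls)" ;;
    130) echo "interrupted (SIGINT)" ;;
    *)   echo "other failure (exit $1)" ;;
  esac
}

# Possible usage (not executed here):
#   qwen -p "..." --max-wall-time 10m --max-tool-calls 50
#   describe_qwen_exit "$?"
```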
Scope:

- --max-tool-calls counts top-level dispatches only. When the model calls the agent tool, the dispatch counts as 1; inner tool calls performed by the spawned subagent are not counted. A model that funnels work through subagents can do unbounded inner work under a small top-level budget. Combine with --exclude-tools agent if you need a tighter cap.
- structured_output is exempt from --max-tool-calls. Under --json-schema, the model's terminal structured_output call is the "I'm done" contract, not real work, so it doesn't count against --max-tool-calls and a budget-edge completion isn't aborted as a false positive. The exemption is unconditional (including failed Ajv validations), so a model stuck in a malformed-output retry loop is NOT bounded by --max-tool-calls; combine with --max-session-turns or --max-wall-time to cap retries.
- structured_output is NOT exempt from --max-session-turns. That counter is pre-existing and bumps for every turn including the terminal contract. Size --max-session-turns to N+1 if you want to allow N real-work turns under --json-schema.
- --input-format stream-json: in stream-json input mode the daemon resets the budget counters at the start of every user message; the budget is per-message, not per-process.
- qwen serve / ACP sessions: the daemon ACP session path does NOT currently consult --max-wall-time / --max-tool-calls from settings.json. These budgets only apply to single-shot qwen -p runs and to --input-format stream-json sessions. (qwen serve does emit the YOLO-no-sandbox warning at boot if tools.approvalMode: 'yolo' is set in settings.)

Suggested combinations:

- qwen -p "..." --yolo --max-session-turns N --max-wall-time 10m --output-format json. Pin a turn budget and a wall-clock budget so a stuck agent can't burn through your CI minutes, and capture --output-format json for post-run usage / tool-call auditing.
- Add --sandbox (or set QWEN_SANDBOX=1) so shell / write / edit tools run inside the sandbox image.
- Pair QWEN_CODE_UNATTENDED_RETRY=1 with --max-wall-time. The retry env keeps the run alive past transient 429 / 529 responses; the wall-clock budget ensures a persistently-failing provider can't extend the job indefinitely.
- --max-tool-calls 25 caps how aggressively the model can grep / read. Combine with --exclude-tools shell,write,edit to make the bound meaningful.

cat src/auth.py | qwen -p "Review this authentication code for security issues" > security-review.txt
result=$(git diff --cached | qwen -p "Write a concise commit message for these changes" --output-format json)
echo "$result" | jq -r '.[] | select(.type == "result") | .result'
result=$(cat api/routes.js | qwen -p "Generate OpenAPI spec for these routes" --output-format json)
echo "$result" | jq -r '.[] | select(.type == "result") | .result' > openapi.json
for file in src/*.py; do
  echo "Analyzing $file..."
  result=$(cat "$file" | qwen -p "Find potential bugs and suggest improvements" --output-format json)
  echo "$result" | jq -r '.[] | select(.type == "result") | .result' > "reports/$(basename "$file").analysis"
  echo "Completed analysis for $(basename "$file")" >> reports/progress.log
done
result=$(git diff origin/main...HEAD | qwen -p "Review these changes for bugs, security issues, and code quality" --output-format json)
echo "$result" | jq -r '.[] | select(.type == "result") | .result' > pr-review.json
grep "ERROR" /var/log/app.log | tail -20 | qwen -p "Analyze these errors and suggest root cause and fixes" > error-analysis.txt
result=$(git log --oneline v1.0.0..HEAD | qwen -p "Generate release notes from these commits" --output-format json)
response=$(echo "$result" | jq -r '.[] | select(.type == "result") | .result')
echo "$response"
echo "$response" >> CHANGELOG.md
result=$(qwen -p "Explain this database schema" --include-directories db --output-format json)
total_tokens=$(echo "$result" | jq -r '.stats.models // {} | to_entries | map(.value.tokens.total) | add // 0')
models_used=$(echo "$result" | jq -r '.stats.models // {} | keys | join(", ") | if . == "" then "none" else . end')
tool_calls=$(echo "$result" | jq -r '.stats.tools.totalCalls // 0')
tools_used=$(echo "$result" | jq -r '.stats.tools.byName // {} | keys | join(", ") | if . == "" then "none" else . end')
echo "$(date): $total_tokens tokens, $tool_calls tool calls ($tools_used) used with models: $models_used" >> usage.log
echo "$result" | jq -r '.[] | select(.type == "result") | .result' > schema-docs.md
echo "Recent usage trends:"
tail -5 usage.log
When Qwen Code runs in CI/CD pipelines or as a background daemon, a brief API outage (rate limiting or overload) should not kill a multi-hour task. Persistent retry mode makes Qwen Code retry transient API errors indefinitely until the service recovers.
Set the QWEN_CODE_UNATTENDED_RETRY environment variable to true or 1 (strict match, case-sensitive):
export QWEN_CODE_UNATTENDED_RETRY=1
[!important] Persistent retry requires an explicit opt-in.
CI=true alone does not activate it, because silently turning a fast-fail CI job into an infinite-wait job would be dangerous. Always set QWEN_CODE_UNATTENDED_RETRY explicitly in your pipeline configuration.
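The strict, case-sensitive match described above can be mirrored in a wrapper script, for example to log whether persistent retry will be active before a long job starts. This is a sketch of the documented rule, not Qwen Code's own check; the function name unattended_retry_enabled is hypothetical.

```shell
# Hypothetical helper: succeeds only when QWEN_CODE_UNATTENDED_RETRY is
# exactly "true" or "1". Values such as TRUE, yes, or CI=true do not count.
unattended_retry_enabled() {
  case "${QWEN_CODE_UNATTENDED_RETRY:-}" in
    true|1) return 0 ;;
    *)      return 1 ;;
  esac
}

# Possible usage (not executed here):
#   unattended_retry_enabled && echo "persistent retry is ON" >&2
```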
- name: Automated code review
env:
QWEN_CODE_UNATTENDED_RETRY: '1'
run: |
qwen -p "Review all files in src/ for security issues" \
--output-format json \
--yolo > review.json
export QWEN_CODE_UNATTENDED_RETRY=1
qwen -p "Migrate all callback-style functions to async/await in src/" --yolo
QWEN_CODE_UNATTENDED_RETRY=1 nohup qwen -p "Audit all dependencies for known CVEs" \
--output-format json > audit.json 2> audit.log &
During persistent retry, heartbeat messages are printed to stderr:
[qwen-code] Waiting for API capacity... attempt 3, retry in 45s
[qwen-code] Waiting for API capacity... attempt 3, retry in 15s
These messages keep CI runners alive and let you monitor progress. They do not appear in stdout, so JSON output piped to other tools remains clean.
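Because heartbeats go to stderr only, a job can capture stderr to a log file and inspect it separately from the JSON on stdout. The sketch below assumes heartbeat lines look exactly like the samples above; the helper names count_retry_waits and last_retry_attempt are hypothetical.

```shell
# $1: path to a captured stderr log (for example audit.log).
# Prints how many heartbeat lines the log contains.
count_retry_waits() {
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -c '^\[qwen-code\] Waiting for API capacity' "$1" || true
}

# Prints the attempt number from the most recent heartbeat line, if any.
last_retry_attempt() {
  sed -n 's/^\[qwen-code\] Waiting for API capacity\.\.\. attempt \([0-9][0-9]*\), retry in [0-9][0-9]*s$/\1/p' "$1" | tail -n 1
}

# Possible usage (not executed here):
#   qwen -p "..." --output-format json > audit.json 2> audit.log
#   echo "heartbeats: $(count_retry_waits audit.log), last attempt: $(last_retry_attempt audit.log)"
```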