docs/users/qwen-serve-deploy-local.md
qwen serve (v0.16-alpha)
Reference templates for running qwen serve as a long-lived background process on a developer workstation. Pairs with the v0.16-alpha known limits: local-only, single-user, BYO bearer token. Containerized / multi-host / TLS-fronted deployments defer to v0.16.x.
Audience: dogfooding developers who want the daemon up across reboots, with logs going somewhere durable, and a clean restart-on-failure story. If you only need the daemon for the duration of a single shell session, plain `qwen serve` (foreground, Ctrl-C to stop) is fine.
openssl rand -hex 32 > ~/.qwen-serve-token # user-managed, NOT a built-in path
chmod 600 ~/.qwen-serve-token
export QWEN_SERVER_TOKEN="$(cat ~/.qwen-serve-token)"
The path / filename is yours to choose; v0.16-alpha does not auto-generate or auto-locate a token file (deferred to v0.16.x). See the Authentication section of the user guide for the canonical BYO setup.
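The create-then-chmod sequence above leaves the file briefly readable by others under a default umask, and re-running it overwrites an existing token. A minimal sketch of a guarded variant follows. The function name `ensure_token_file` and the `/dev/urandom` fallback are illustrative assumptions, not something `qwen serve` provides.

```shell
# Hypothetical helper: create the token file only if it does not exist yet,
# with owner-only permissions from the moment of creation.
ensure_token_file() {
  token_file="${1:-$HOME/.qwen-serve-token}"   # same user-chosen path as above
  if [ -s "$token_file" ]; then
    return 0                                   # keep the existing token
  fi
  (
    umask 077                                  # files created here get mode 600
    if command -v openssl >/dev/null 2>&1; then
      openssl rand -hex 32 > "$token_file"
    else
      # Assumed fallback: 32 random bytes rendered as 64 hex characters.
      od -An -N32 -tx1 /dev/urandom | tr -d ' \n' > "$token_file"
      echo >> "$token_file"
    fi
    chmod 600 "$token_file"                    # also covers a pre-existing empty file
  )
}
```

Usage: `ensure_token_file && export QWEN_SERVER_TOKEN="$(cat ~/.qwen-serve-token)"`. Because an existing non-empty file is left alone, rotation still needs an explicit overwrite.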
Scope this `export` to the current shell session only. Don't add it to `~/.bashrc` / `~/.zshrc`: a profile-level export exposes the bearer token to every process spawned from that shell (IDE subprocesses, browser debuggers, `npm` scripts from unrelated projects). For long-running setups, use the systemd `EnvironmentFile=` / launchd `EnvironmentVariables` mechanisms below; both scope the token to just the daemon process.
The daemon reads the bearer token from either --token <value> on the CLI or the QWEN_SERVER_TOKEN env var (whitespace stripped from both). The TypeScript SDK's DaemonClient constructor falls back to QWEN_SERVER_TOKEN when no token option is passed (PR 27 fallback — clients with the env var set never need to thread the value through their script).
One shell-level export covers both server boot and SDK client construction (just keep it scoped to the session, per the note above).
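If even a session-wide export is broader than you want, the token can be set for a single child process only. A minimal sketch, assuming the raw token file from the setup step; `with_qwen_token` and `QWEN_TOKEN_FILE` are hypothetical names introduced here for illustration.

```shell
# Hypothetical wrapper: run one command with QWEN_SERVER_TOKEN set, without
# exporting the token into the calling shell.
with_qwen_token() {
  (
    # Read the token inside a subshell so the assignment never leaks out.
    QWEN_SERVER_TOKEN="$(cat "${QWEN_TOKEN_FILE:-$HOME/.qwen-serve-token}")" || exit 1
    export QWEN_SERVER_TOKEN
    exec "$@"                                  # replace the subshell with the command
  )
}
```

Usage: `with_qwen_token qwen serve --hostname 127.0.0.1 --port 4170`. The wrapper only works for external commands, since `exec` cannot run shell functions or aliases.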
Find your `qwen` binary first. The unit file's `ExecStart=` must hold an absolute path, because service managers don't read your shell's `PATH`. Run `which qwen` to discover it. Common locations: `/usr/local/bin/qwen` (manual installs), `/home/linuxbrew/.linuxbrew/bin/qwen` (Linuxbrew's default prefix), `~/.nvm/versions/node/vX.Y.Z/bin/qwen` (nvm), `~/.fnm/aliases/default/bin/qwen` (fnm), `~/.volta/bin/qwen` (Volta). Substitute the actual path everywhere the templates below show `/PATH/TO/qwen`.
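When a version manager is involved, a name can resolve to a shell alias or function instead of a file, which a service manager cannot execute. The following sketch rejects those cases; `resolve_abs_bin` is a hypothetical helper name, not part of any tool mentioned here.

```shell
# Hypothetical helper: print the absolute path of a command, or fail if the
# name resolves to an alias, function, or builtin instead of a file on disk.
resolve_abs_bin() {
  bin_path="$(command -v "$1")" || { echo "$1: not found on PATH" >&2; return 1; }
  case "$bin_path" in
    /*) [ -x "$bin_path" ] && printf '%s\n' "$bin_path" ;;
    *)  echo "$1 resolves to '$bin_path', not an absolute path" >&2; return 1 ;;
  esac
}
```

Usage: `resolve_abs_bin qwen`, then paste the printed path into the template. With nvm the path embeds the Node version, so re-run it after switching versions.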
~/.config/systemd/user/qwen-serve.service:
[Unit]
Description=Qwen Code daemon (loopback HTTP + SSE)
After=network.target
[Service]
Type=simple
# Replace with your project; %h expands to $HOME under user units.
WorkingDirectory=%h/your-project
# Run `which qwen` to find the absolute path. systemd does NOT read $PATH.
ExecStart=/PATH/TO/qwen serve --hostname 127.0.0.1 --port 4170
# Read the bearer token from a chmod 600 file rather than inlining it
# in the unit. `Environment=` would expose the token in the unit file
# (typically 644 = world-readable). EnvironmentFile keeps the token in
# the user-owned secret file you already created with `chmod 600`.
EnvironmentFile=%h/.qwen-serve-token-env
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=default.target
Build the env file once (the token file from the setup step holds the raw value; this wraps it in KEY=value form so systemd reads it as an env assignment):
echo "QWEN_SERVER_TOKEN=$(cat ~/.qwen-serve-token)" > ~/.qwen-serve-token-env
chmod 600 ~/.qwen-serve-token-env
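Because the env file is a derived copy of the raw token, the two can drift apart if only one is regenerated. A minimal sketch of a consistency check follows; the function name `env_file_in_sync` is an assumption made for this example.

```shell
# Hypothetical check: succeed only if the env file holds exactly
# QWEN_SERVER_TOKEN=<contents of the raw token file>.
env_file_in_sync() {
  token_file="${1:-$HOME/.qwen-serve-token}"
  env_file="${2:-$HOME/.qwen-serve-token-env}"
  [ -r "$token_file" ] && [ -r "$env_file" ] || return 1
  [ "$(cat "$env_file")" = "QWEN_SERVER_TOKEN=$(cat "$token_file")" ]
}
```

Usage: `env_file_in_sync || echo "rebuild ~/.qwen-serve-token-env"` before restarting the unit.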
Manage:
systemctl --user daemon-reload
systemctl --user enable --now qwen-serve.service
loginctl enable-linger "$(whoami)" # keep the user manager running after logout / across reboot
journalctl --user -u qwen-serve -f # tail logs
systemctl --user restart qwen-serve.service # after token rotation
systemctl --user disable --now qwen-serve.service
Without loginctl enable-linger, the user-level systemd instance shuts down when the user logs out and only restarts on next login — on a headless dev box the daemon would not survive an SSH session ending. enable-linger is what makes "across reboots" actually work.
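Whether lingering is actually on can be checked before relying on it. This sketch assumes `loginctl show-user` prints a `Linger=yes` or `Linger=no` line for the requested property; `linger_enabled` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if lingering is enabled for the given user
# (defaults to the current user).
linger_enabled() {
  user="${1:-$(id -un)}"
  loginctl show-user "$user" --property=Linger 2>/dev/null | grep -qx 'Linger=yes'
}
```

Usage: `linger_enabled || loginctl enable-linger "$(whoami)"`. A lookup error from `loginctl` is treated the same as lingering being off.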
System-wide alternative (shared dev hosts, less common): drop the unit at /etc/systemd/system/qwen-serve@.service with User=%i, and manage it via sudo systemctl enable --now qwen-serve@<username>.service. The [Service] body is otherwise the same, except that %h resolves to root's home under the system manager, so spell out the user's home directory as an absolute path. World-readable Environment= exposure is even more problematic at this level, so always use EnvironmentFile= pointing at the user's chmod 600 file. Pick user-level + linger for single-user workstations.
Find your `qwen` binary first. Same constraint as systemd: `ProgramArguments` must hold an absolute path. Run `which qwen` to discover it. Common locations on macOS: `/opt/homebrew/bin/qwen` (Homebrew on Apple Silicon), `/usr/local/bin/qwen` (Homebrew on Intel, manual installs), `~/.nvm/versions/node/vX.Y.Z/bin/qwen` (nvm), `~/.volta/bin/qwen` (Volta). Substitute below where the template shows `/PATH/TO/qwen`.
~/Library/LaunchAgents/com.qwenlm.qwen-serve.plist:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.qwenlm.qwen-serve</string>
<key>ProgramArguments</key>
<array>
<!-- Run `which qwen` to find the absolute path; launchd does NOT read $PATH. -->
<string>/PATH/TO/qwen</string>
<string>serve</string>
<string>--hostname</string>
<string>127.0.0.1</string>
<string>--port</string>
<string>4170</string>
</array>
<!-- launchd does NOT expand `~` or `$HOME` — use absolute paths. -->
<key>WorkingDirectory</key>
<string>/Users/YOUR-USERNAME/your-project</string>
<key>EnvironmentVariables</key>
<dict>
<!-- DO NOT COMMIT this file with a real token. Also chmod 600 the
plist itself so the inlined token is not world-readable. -->
<key>QWEN_SERVER_TOKEN</key>
<string>PASTE-YOUR-TOKEN-HERE</string>
</dict>
<key>RunAtLoad</key>
<true/>
<!-- Restart only on non-zero exits (matches systemd Restart=on-failure).
A bare `<true/>` would respawn even after a clean SIGTERM, making
`kill <pid>` impossible to use as a stop signal — operator would
have to `launchctl unload`. SuccessfulExit=false fixes that. -->
<key>KeepAlive</key>
<dict>
<key>SuccessfulExit</key>
<false/>
</dict>
<!-- Throttle restart storms on persistent failures (the counterpart of
systemd RestartSec=5). 10 seconds is also launchd's documented default;
setting it explicitly keeps the interval visible. -->
<key>ThrottleInterval</key>
<integer>10</integer>
<!-- Log into the user's Library, not /tmp. /tmp is world-writable
(symlink-attack risk on shared workstations) and gets cleaned by
periodic-daily after 3 days; `~/Library/Logs/qwen-serve/` is
user-scoped and survives. launchd truncates these on every
`load`, so the unload→load token-rotation cycle wipes prior
diagnostic logs — back them up if you need post-incident
inspection. -->
<key>StandardOutPath</key>
<string>/Users/YOUR-USERNAME/Library/Logs/qwen-serve/out.log</string>
<key>StandardErrorPath</key>
<string>/Users/YOUR-USERNAME/Library/Logs/qwen-serve/err.log</string>
</dict>
</plist>
Manage:
mkdir -p ~/Library/Logs/qwen-serve # first time only
chmod 600 ~/Library/LaunchAgents/com.qwenlm.qwen-serve.plist # plist holds the inline token
launchctl load ~/Library/LaunchAgents/com.qwenlm.qwen-serve.plist
launchctl unload ~/Library/LaunchAgents/com.qwenlm.qwen-serve.plist # to stop
tail -f ~/Library/Logs/qwen-serve/out.log ~/Library/Logs/qwen-serve/err.log
After editing the plist (e.g., rotating the token) you must unload and then load it again; launchd does not pick up plist edits on its own. Note: each load truncates the log files, so save them off before rotating if you're investigating an incident.
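Given the truncation described above, copying the logs aside before each unload / load cycle keeps earlier diagnostics available. A minimal sketch, with `backup_qwen_logs` as a hypothetical helper name and the log directory taken from the plist template:

```shell
# Hypothetical helper: copy non-empty out.log / err.log to timestamped files
# in the same directory so a later `launchctl load` cannot wipe them.
backup_qwen_logs() {
  log_dir="${1:-$HOME/Library/Logs/qwen-serve}"
  stamp="$(date +%Y%m%dT%H%M%S)"
  for name in out err; do
    if [ -s "$log_dir/$name.log" ]; then
      cp -p "$log_dir/$name.log" "$log_dir/$name.$stamp.log"
    fi
  done
}
```

Usage: run `backup_qwen_logs` first, then the `launchctl unload` and `launchctl load` commands shown in the manage block. The copies accumulate, so prune them by hand.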
Assumes QWEN_SERVER_TOKEN is already exported in your shell (see the setup section above):
tmux new -d -s qwen-serve "cd ~/your-project && qwen serve --hostname 127.0.0.1"
tmux attach -t qwen-serve # see live logs; Ctrl-b d to detach
tmux kill-session -t qwen-serve
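tmux new -d inherits the parent shell's environment when it starts the tmux server, so QWEN_SERVER_TOKEN flows through automatically. If a tmux server is already running, new sessions take that server's environment instead; check with tmux show-environment -g if the daemon comes up without a token. Best when you want to occasionally watch the daemon's stdout (auth warnings, MCP discovery progress, slow-client warnings) without committing to a service unit. Survives terminal close but not host reboot.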
Assumes QWEN_SERVER_TOKEN is already exported in your shell:
nohup bash -c 'cd ~/your-project && qwen serve --hostname 127.0.0.1' > qwen-serve.log 2>&1 &
echo $! # daemon PID; capture if you want to `kill` cleanly later
The wrapping bash -c '...' ensures the daemon binds to ~/your-project rather than wherever you happened to run the command. Without that cd, qwen serve defaults to process.cwd() and a POST /session from a client expecting your project workspace returns 400 workspace_mismatch — silent foot-gun.
OK for one-off "let me run this in the background while I poke at the API" workflows. Not recommended for anything beyond a single session — no restart-on-crash, log file grows unbounded, no clean way to find the daemon if you forget the PID. Prefer tmux for interactive supervision or systemd / launchd for anything you want to outlast a reboot.
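The "forgot the PID" problem can be softened by writing the PID to a file at launch. The sketch below is one way to do that; `start_bg` and `stop_bg` are hypothetical helper names, and it still provides no restart-on-crash or log rotation.

```shell
# Hypothetical helpers: start a command under nohup from a given directory and
# record its PID; stop it later by reading that PID back.
start_bg() {
  workdir="$1"; log_file="$2"; pid_file="$3"; shift 3
  # The subshell changes directory, then exec replaces it with the command,
  # so $! below is the PID of the command itself.
  ( cd "$workdir" && exec nohup "$@" ) > "$log_file" 2>&1 &
  echo $! > "$pid_file"
}

stop_bg() {
  pid_file="$1"
  pid="$(cat "$pid_file")" || return 1
  kill "$pid" && rm -f "$pid_file"             # SIGTERM, then drop the stale PID file
}
```

Usage: `start_bg ~/your-project ~/qwen-serve.log ~/qwen-serve.pid qwen serve --hostname 127.0.0.1`, and later `stop_bg ~/qwen-serve.pid`.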
curl http://127.0.0.1:4170/health # → {"status":"ok"}
curl -H "Authorization: Bearer $QWEN_SERVER_TOKEN" \
http://127.0.0.1:4170/capabilities | jq .protocolVersions # daemon's feature set
When auth is configured (i.e., the daemon was started with --token / QWEN_SERVER_TOKEN set, OR --require-auth=true), every route except /health on loopback binds requires Authorization: Bearer <token>. If you started the daemon without a token on the loopback default (the qwen serve zero-config path), neither call requires a header. The templates above all configure a token, so the Authorization header is needed in practice. If /capabilities returns 401, the unit / plist token doesn't match the env-exported token your curl is using.
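Right after a service start the daemon may not be listening yet, so a single curl can fail spuriously. A minimal polling sketch follows, assuming the `/health` body contains `"status":"ok"` exactly as shown above; `wait_for_health` and its argument order are hypothetical.

```shell
# Hypothetical helper: poll /health until it reports ok or attempts run out.
wait_for_health() {
  url="${1:-http://127.0.0.1:4170/health}"
  attempts="${2:-10}"
  delay="${3:-1}"                              # seconds between attempts
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -fsS --max-time 2 "$url" 2>/dev/null | grep -q '"status":"ok"'; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "daemon not healthy after $attempts attempts: $url" >&2
  return 1
}
```

Usage: `systemctl --user restart qwen-serve.service && wait_for_health`. The string match is deliberately strict and would need adjusting if the response body is formatted differently.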
openssl rand -hex 32 > ~/.qwen-serve-token
chmod 600 ~/.qwen-serve-token
echo "QWEN_SERVER_TOKEN=$(cat ~/.qwen-serve-token)" > ~/.qwen-serve-token-env
chmod 600 ~/.qwen-serve-token-env
(launchd / tmux / nohup: update the plist's `<string>` value or re-export `QWEN_SERVER_TOKEN`. Don't forget `chmod 600` on the plist if you regenerate it.)
Then restart the daemon:
- systemd: `systemctl --user restart qwen-serve.service`
- launchd: `launchctl unload ~/Library/LaunchAgents/com.qwenlm.qwen-serve.plist && launchctl load ~/Library/LaunchAgents/com.qwenlm.qwen-serve.plist`
- tmux / nohup: `kill <pid>`, then re-run with the new token in env

`DaemonClient` reads `QWEN_SERVER_TOKEN` automatically (PR 27 fallback); re-export the new value in any client shell and reconstruct the client.
Service-manager restart semantics differ across the templates:
- systemd: `Restart=on-failure` restarts only on non-zero exit / signal. A clean SIGTERM (`systemctl stop`) does not trigger a restart loop.
- launchd: `KeepAlive` with `SuccessfulExit=false` (the template above) matches systemd behavior. A bare `<true/>` would have respawned even after a clean exit. `ThrottleInterval=10` rate-limits restart storms on persistent failures, mirroring systemd's `RestartSec=5`.

Within a single daemon process lifetime, client disconnects recover via SSE `Last-Event-ID` resume per the Durability model section of the user guide; the replay ring is in-memory.
A daemon restart drops all in-memory sessions; clients reconnect and start fresh. Cross-restart durability of session content (prompts, tool calls, conversation history) is NOT in v0.16-alpha.
- 1 daemon = 1 workspace × N sessions is enforced. Instance-path token keying + stale-token cleanup defer to v0.16.x.
- Windows service wrappers (nssm, Service Control Manager wrapper): for now use WSL2 and follow the systemd section above.

See the v0.16-alpha known limits callout in the main user guide for the full deferred-features list, and #4175 for the v0.16-alpha rollout tracking issue.