.qwen/skills/tmux-real-user-testing/SKILL.md
Run Qwen Code in a real tmux TUI session as a user would: navigate dialogs, trigger slash commands, exercise workflows, and save a readable log that maintainers can review. Prefer this workflow when the goal is not just a pass/fail assertion, but a narrative artifact showing what happened on screen.
Use tmux as a real-use harness. Drive the TUI with realistic keyboard actions,
then save a step-by-step readable transcript with tmux capture-pane -p after
each meaningful state change.
Avoid relying on tmux pipe-pane as the primary report. pipe-pane captures raw
ANSI/control streams from React Ink TUI output and often looks like garbled text
when opened as plain text. Use pipe-pane only as an optional forensic artifact. Make
tmux-readable-full.log the main deliverable.
Use this workflow for:
/auth, /model, /manage-models, MCP setup,
permissions, onboarding, or interactive error recovery.Use headless JSON E2E instead when only tool execution or model API behavior needs structured assertions.
Create a timestamped directory under project tmp/:
tmp/<scenario>-tmux-YYYYMMDD-HHMMSS/
├── tmux-readable-full.log # primary report: step-by-step readable snapshots
├── tmux-final-capture.log # final screen only
├── current-pane.txt # latest poll/snapshot scratch file
└── report.md # short summary with result and artifact pointers
Do not overwrite previous runs. Preserve complete logs unless the user explicitly asks to sanitize or trim them.
Use scripts/tmux-real-user-log.sh to avoid rewriting shell glue. The script can
start a session, append labeled snapshots, send keys, wait for text, and finish.
The start command outputs export statements — use eval to set the variables
directly in your shell:
eval "$(bash .qwen/skills/tmux-real-user-testing/scripts/tmux-real-user-log.sh \
start <scenario> . npm run dev -- --approval-mode yolo)"
# → $SESSION, $OUTDIR, $LOG are now available
Show the full usage before running a new scenario:
bash .qwen/skills/tmux-real-user-testing/scripts/tmux-real-user-log.sh help
Use a large tmux viewport so dialogs render fully. Wait for the TUI to render before interacting — poll for a known startup string rather than blind sleeping:
TS=$(date +%Y%m%d-%H%M%S)
PROJECT_ROOT="$(pwd)"
OUT="$PROJECT_ROOT/tmp/<scenario>-$TS"
SESSION="<scenario>-$TS"
mkdir -p "$OUT"
tmux new-session -d -s "$SESSION" -x 200 -y 50 \
-c "$PROJECT_ROOT" \
"npm run dev -- --approval-mode yolo"
# Poll until TUI is ready (adjust regex to match your app's startup line)
for i in $(seq 1 30); do
sleep 1
if tmux capture-pane -t "$SESSION" -p -S -100 | grep -q "Ready\|>"; then
break
fi
done
Use node dist/cli.js instead of npm run dev only when verifying a built
bundle. Use the globally installed qwen only when reproducing a user-reported
installed-version bug.
After each meaningful action, append a section header plus capture-pane -p to
the full log:
LOG="$OUT/tmux-readable-full.log"
{
printf '\n===== 01 /auth dialog =====\n'
tmux capture-pane -t "$SESSION" -p -S -240
} >> "$LOG"
Increase -S as the session grows (add ~100 lines per section). The important
part is that each section is a rendered frame, not raw ANSI output.
Split typing and Enter to avoid swallowed submissions:
tmux send-keys -t "$SESSION" "/auth"
sleep 0.5
tmux send-keys -t "$SESSION" Enter
sleep 2
For navigation:
tmux send-keys -t "$SESSION" Down
tmux send-keys -t "$SESSION" Space
tmux send-keys -t "$SESSION" Escape
For text input into Ink fields, prefer one key at a time if bulk text is ignored:
tmux send-keys -t "$SESSION" e n a b l e d
Use text on the screen as the completion condition. On timeout, dump the current pane so the log shows what was on screen when the wait expired:
for i in $(seq 1 60); do
sleep 2
tmux capture-pane -t "$SESSION" -p -S -400 > "$OUT/current-pane.txt"
if grep -q "Successfully configured\|Error\|failed" \
"$OUT/current-pane.txt"; then
break
fi
done
# Always append the final poll result (match or timeout) to the log
{
printf '\n===== 04 auth result =====\n'
cat "$OUT/current-pane.txt"
} >> "$LOG"
Capture the final screen, append it, then kill the session:
tmux capture-pane -t "$SESSION" -p -S -10000 > "$OUT/tmux-final-capture.log"
{
printf '\n===== final capture before cleanup =====\n'
cat "$OUT/tmux-final-capture.log"
} >> "$LOG"
tmux kill-session -t "$SESSION"
Write report.md with:
tmux-readable-full.log marked as the primary log.Keep assertions tied to evidence in the log. Prefer phrases like “log section
07 toggle model on shows 16 enabled” over unsupported summaries.
A good scenario is a linear sequence of observable state transitions. Design it as a series of steps where each step produces visible TUI output you can capture:
wait-for polling.For each step, define:
/auth, Down, Enter, etc.)Successfully configured, Error, Saved)HELPER=.qwen/skills/tmux-real-user-testing/scripts/tmux-real-user-log.sh
# Start
eval "$(bash "$HELPER" start auth-test . npm run dev -- --approval-mode yolo)"
# → prints SESSION=... OUTDIR=...
# Trigger /auth, navigate to OAuth provider
bash "$HELPER" type-submit "$SESSION" /auth
bash "$HELPER" snapshot "$SESSION" "$OUTDIR" "01 auth menu"
bash "$HELPER" send "$SESSION" Down Down Enter
bash "$HELPER" snapshot "$SESSION" "$OUTDIR" "02 provider selected"
# Wait for OAuth flow to complete (may involve browser interaction)
bash "$HELPER" wait-for "$SESSION" "$OUTDIR" "Successfully configured|Error|failed"
bash "$HELPER" snapshot "$SESSION" "$OUTDIR" "03 auth result"
# Finish
bash "$HELPER" finish "$SESSION" "$OUTDIR"
For flows involving browser OAuth callbacks, the wait-for poll will catch the
result after the user completes the browser step. If the flow requires the LLM to
open the browser itself, note that side effect in the scenario design.
Ask before deleting logs or reverting settings. Do not sanitize by default if the user explicitly requests complete logs. If logs may be shared externally, offer a separate sanitized copy rather than modifying the original.
Mention likely side effects before starting: OAuth may open a browser, write Qwen settings, set API key config, and update model provider entries.
open <file> on macOS producing no terminal output is normal; it launches the
file in the associated app.tmux-final-capture.log contains only the last screen; it is not the full
journey.tmux-readable-full.log is the report-grade artifact.tmux pipe-pane raw logs can contain ANSI control sequences and look garbled.capture-pane records current rendered state, not transient flicker.