A brief (1-2 sentence) "where did I leave off" summary surfaced when the user returns to an idle session, either on demand (`/recap`) or after the terminal has been blurred for 5+ minutes.
When a user /resumes an old session days later, scrolling back through
pages of history to remember what they were doing and what came next
is a real friction point. Just reloading messages does not solve this
UX problem.
The goal is to proactively surface a brief 1-2 sentence recap when the user returns:
| Trigger | Conditions | Implementation |
|---|---|---|
| Manual | User runs /recap | recapCommand.ts calls the same underlying service |
| Auto | Terminal blurred (DECSET 1004 focus protocol) for ≥ 5 min + focus returns + stream is Idle | useAwaySummary.ts — 5min blur timer + useFocus event listener |
| Daemon HTTP | Remote client calls POST /session/:id/recap | server.ts route → bridge.generateSessionRecap (ext-method roundtrip) → acpAgent.ts calls generateSessionRecap(session.getConfig(), signal) |
All three paths funnel into the same generateSessionRecap() function
in core/services/sessionRecap.ts to guarantee identical behavior. The
auto-trigger is gated by general.showSessionRecap (default: off —
explicit opt-in, so ambient LLM calls are never silently added to a
user's bill); the manual command and daemon HTTP route ignore that
setting (the caller is making an explicit request).
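
The gating rule above can be pictured as a small predicate. This is a minimal sketch: `RecapTrigger` and `shouldRunRecapSketch` are hypothetical names introduced for illustration, and only the `general.showSessionRecap` semantics come from this document.

```typescript
// Hypothetical names; illustrates the gating rule only, not the real call sites.
type RecapTrigger = "manual" | "auto" | "daemon-http";

function shouldRunRecapSketch(
  trigger: RecapTrigger,
  showSessionRecap: boolean, // value of general.showSessionRecap (default: false)
): boolean {
  // Only the ambient auto-trigger is opt-in; explicit requests always run.
  return trigger === "auto" ? showSessionRecap : true;
}
```

With the default setting, `shouldRunRecapSketch("auto", false)` is `false`, while the manual and daemon paths still return `true`.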
The daemon route is non-strict-gated (mirrors /session/:id/prompt's
posture — recap costs tokens but mutates no state). Capability tag
session_recap advertises the route on /capabilities.features. SDK
helpers: DaemonClient.recapSession(sessionId, opts) and
DaemonSessionClient.recap(opts). See
docs/developers/qwen-serve-protocol.md § POST /session/:id/recap
for the wire contract and error envelope.
Cancellation is absent in v1. The route does not listen for HTTP
client disconnect, no AbortSignal is threaded into
bridge.generateSessionRecap, and the ACP child handler passes a
never-aborting AbortController().signal to the core helper (no
cross-process abort plumbing yet). The only ceilings are the bridge's
60s SESSION_RECAP_TIMEOUT_MS backstop and the transport-closed race
against ACP channel death. Wiring an HTTP-side AbortController in
isolation would be cosmetic — the child-side LLM call would still run
to completion, so e2e cancel is not achievable without the cross-
process abort piece. This is acceptable for v1 because recap is short
(single-attempt side-query, maxOutputTokens: 300, ~1–5s typical).
A future request-id-based cancel ext-method can plumb full end-to-end
cancellation if/when the bandwidth cost justifies it.
┌────────────────────────────────────────────────────────────────────────┐
│ AppContainer.tsx │
│ isFocused = useFocus() │
│ isIdle = streamingState === Idle │
│ │ │
│ ├─→ useAwaySummary({enabled, config, isFocused, isIdle, │
│ │ │ addItem}) │
│ │ └─→ 5 min blur timer + idle/dedupe gates │
│ │ │ │
│ │ ↓ │
│ └─→ recapCommand (slash) ─→ generateSessionRecap(config, signal) │
│ │ │
│ ↓ │
│ ┌─────────────────────────┐ │
│ │ packages/core/services/ │ │
│ │ sessionRecap.ts │ │
│ └─────────────────────────┘ │
│ │ │
│ ↓ │
│ GeminiClient.generateContent │
│ (fastModel + tools:[]) │
│ │
│ addItem({type: 'away_recap', text}) ─→ HistoryItemDisplay │
│ └─ AwayRecapMessage rendered inline like any other history │
│ item (※ + bold "recap: " + italic content, all dim); │
│ scrolls naturally with the conversation. Mirrors Claude │
│ Code's away_summary system message. │
└────────────────────────────────────────────────────────────────────────┘
| File | Responsibility |
|---|---|
| packages/core/src/services/sessionRecap.ts | One-shot LLM call + history filter + tag extraction |
| packages/cli/src/ui/hooks/useAwaySummary.ts | Auto-trigger React hook |
| packages/cli/src/ui/commands/recapCommand.ts | /recap manual entry point |
| packages/cli/src/ui/components/messages/StatusMessages.tsx | AwayRecapMessage renderer (※ + bold recap: + italic content, all dim) |
| packages/cli/src/ui/types.ts | HistoryItemAwayRecap type |
| packages/cli/src/ui/components/HistoryItemDisplay.tsx | Dispatches away_recap history items to the renderer |
| packages/cli/src/config/settingsSchema.ts | general.showSessionRecap + general.sessionRecapAwayThresholdMinutes settings |
generationConfig.systemInstruction replaces the main agent's system
prompt for this single call, so the model behaves only as a recap
generator and not as a coding assistant.
Note that GeminiClient.generateContent() internally runs the prompt
through getCustomSystemPrompt(), which appends the user's memory
(QWEN.md / managed auto-memory) as a suffix. The final system prompt is
therefore recap prompt + user memory — useful project context for the
recap, not a leak.
Bullets below correspond 1:1 with RECAP_SYSTEM_PROMPT:

- Wrap the output in `<recap>...</recap>`; nothing outside the tags.

The model is instructed to wrap its answer in `<recap>...</recap>`:

`<recap>Refactoring loopDetectionService.ts to address long-session OOM. Next step is to implement option B.</recap>`

Why: some models (GLM family, reasoning models) write a "thinking" paragraph before the final answer. Returning the raw text would leak that reasoning into the UI.
extractRecap() has three fallback tiers:

1. Both tags present: take the text inside `<recap>...</recap>` (preferred).
2. Open tag only (e.g. maxOutputTokens truncated the close tag): take everything after the open tag.
3. No open tag: return null → UI renders nothing.

The third tier is "skip rather than show the wrong thing": surfacing the model's reasoning preamble is worse than showing no recap at all.
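
The three tiers can be sketched as follows. This is an illustration under assumptions, not the actual `extractRecap()` body: the function name `extractRecapSketch`, the trimming, and treating an empty result as `null` are choices made here for the example.

```typescript
// Sketch of the three fallback tiers; the real extractRecap() may differ in detail.
const OPEN_TAG = "<recap>";
const CLOSE_TAG = "</recap>";

function extractRecapSketch(raw: string): string | null {
  const open = raw.indexOf(OPEN_TAG);
  if (open === -1) {
    // Tier 3: no open tag, so skip rather than surface a reasoning preamble.
    return null;
  }
  const start = open + OPEN_TAG.length;
  const close = raw.indexOf(CLOSE_TAG, start);
  // Tier 1: both tags present. Tier 2: close tag missing (likely truncated output).
  const body = close === -1 ? raw.slice(start) : raw.slice(start, close);
  const text = body.trim();
  // Assumption: an empty recap is treated the same as "nothing to show".
  return text.length > 0 ? text : null;
}
```

Any text before the open tag, such as a model's thinking paragraph, is discarded in tiers 1 and 2.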
| Parameter | Value | Reason |
|---|---|---|
| model | getFastModel() ?? getModel() | Recap doesn't need a frontier model |
| tools | [] | One-shot query, no tool use |
| maxOutputTokens | 300 | Headroom for 1-2 short sentences + tags |
| temperature | 0.3 | Mostly deterministic, with a bit of natural variation |
| systemInstruction | The recap-only prompt above | Replaces the main agent's role definition |
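
As a rough illustration, the parameters in the table could be assembled like this. The object shape, the `RecapConfigLike` interface, and `buildRecapRequestSketch` are assumptions made for the example; they are not the real `GeminiClient.generateContent` signature. Only the values come from the table.

```typescript
// Hypothetical request shape; only the parameter values are taken from the table.
interface RecapConfigLike {
  getFastModel(): string | undefined;
  getModel(): string;
}

function buildRecapRequestSketch(config: RecapConfigLike, recapSystemPrompt: string) {
  return {
    // Prefer the fast model; fall back to the main model when none is set.
    model: config.getFastModel() ?? config.getModel(),
    config: {
      tools: [] as unknown[], // one-shot query, no tool use
      maxOutputTokens: 300, // headroom for 1-2 short sentences + tags
      temperature: 0.3,
      systemInstruction: recapSystemPrompt, // replaces the main agent's prompt
    },
  };
}
```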
geminiClient.getChat().getHistory() returns a Content[] that includes:

- user / model text messages
- model functionCall parts
- user functionResponse parts (which can hold full file contents)
- model thought parts (part.thought / part.thoughtSignature, the model's hidden reasoning)

filterToDialog() keeps only user / model parts that have non-empty text and are not thoughts. Two reasons:

- A functionResponse can be 10K+ tokens. 30 such messages would drown the recap LLM in irrelevant detail, both wasting tokens and biasing the recap toward implementation noise like "called X tool to read Y file".

After dropping empty messages, takeRecentDialog slices to the last 30 messages and refuses to start the slice on a dangling model/tool response.
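
The filtering and windowing described above might look roughly like this. The `Part` and `Content` interfaces are simplified local stand-ins for the SDK types, the `Sketch` function names are hypothetical, and two behaviors are assumptions: a part carrying `thoughtSignature` is treated as a thought, and a "dangling" start is resolved by advancing to the next user message.

```typescript
// Simplified stand-ins for the SDK's Content/Part types (assumed shapes).
interface Part {
  text?: string;
  thought?: boolean;
  thoughtSignature?: string;
  functionCall?: unknown;
  functionResponse?: unknown;
}
interface Content {
  role: string;
  parts: Part[];
}

const MAX_DIALOG_MESSAGES = 30; // "last 30 messages" per this document

function filterToDialogSketch(history: Content[]): Content[] {
  return history
    .filter((m) => m.role === "user" || m.role === "model")
    .map((m) => ({
      role: m.role,
      // Keep only visible text: drop thoughts, tool calls, and tool responses.
      parts: m.parts.filter(
        (p) =>
          typeof p.text === "string" &&
          p.text.trim().length > 0 &&
          !p.thought &&
          !p.thoughtSignature, // assumption: a signature marks a thought part
      ),
    }))
    .filter((m) => m.parts.length > 0); // drop messages left empty
}

function takeRecentDialogSketch(
  dialog: Content[],
  max: number = MAX_DIALOG_MESSAGES,
): Content[] {
  let start = Math.max(0, dialog.length - max);
  // Assumption: never open the window on a model message whose user prompt
  // was cut off; advance to the next user message instead.
  while (start < dialog.length && dialog[start]?.role !== "user") start++;
  return dialog.slice(start);
}
```

Filtering before slicing means the 30-message budget is spent on dialog turns rather than on tool traffic.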
useAwaySummary keeps three refs:
| Ref | Meaning |
|---|---|
| blurredAtRef | Blur start time (not cleared until focus returns) |
| recapPendingRef | Whether an LLM call is in flight |
| inFlightRef | The current in-flight AbortController |
useEffect deps: [enabled, config, isFocused, isIdle, addItem, thresholdMs].
| Event | Action |
|---|---|
| !enabled \|\| !config | Abort in-flight call + clear inFlightRef + clear blurredAtRef |
| !isFocused and blurredAtRef === null | Set blurredAtRef = Date.now() |
| isFocused and blurredAtRef === null | Return early (no blur cycle to handle: first render or right after a brief-blur reset) |
| isFocused and blur duration < 5 min | Clear blurredAtRef, wait for next blur cycle |
| isFocused and blur ≥ 5 min and recapPendingRef | Return (dedupe) |
| isFocused and blur ≥ 5 min and !isIdle | Preserve blurredAtRef and wait for the turn to finish (isIdle is in the deps, so the effect re-fires when streaming completes) |
| isFocused and blur ≥ 5 min and shouldFireRecap returns false | Clear blurredAtRef and return; conversation hasn't moved enough since the last recap (≥ 2 user turns required, mirrors Claude Code) |
| isFocused and all conditions met | Clear blurredAtRef, set recapPendingRef = true, create AbortController, send the LLM request |
The .then callback re-checks isIdleRef.current: if the user has
started a new turn while the LLM was running, the late-arriving recap
is dropped to avoid inserting it mid-turn.
The .finally clears recapPendingRef, and clears inFlightRef only
if inFlightRef.current === controller (so it doesn't overwrite a
newer controller).
A second useEffect aborts the in-flight controller on unmount.
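
The event table above reads naturally as a pure decision function evaluated on each effect run. The sketch below is not the hook's code: `RecapGateState`, `RecapAction`, and `decideRecapAction` are hypothetical names, the refs are flattened into plain fields, and `enoughNewTurns` stands in for the result of `shouldFireRecap`.

```typescript
// Hypothetical model of the hook's per-render decision; names are illustrative.
type RecapAction =
  | "reset" // abort in-flight call, clear inFlightRef and blurredAtRef
  | "start-blur" // record the blur start time
  | "noop" // nothing to do (already blurred, first render, or dedupe)
  | "clear-blur" // brief blur, or conversation has not moved enough
  | "wait-for-idle" // keep blurredAtRef; the effect re-fires when idle
  | "fire"; // clear blurredAtRef, mark pending, send the LLM request

interface RecapGateState {
  enabled: boolean;
  hasConfig: boolean;
  isFocused: boolean;
  isIdle: boolean;
  blurredAt: number | null; // blurredAtRef.current
  recapPending: boolean; // recapPendingRef.current
  enoughNewTurns: boolean; // stands in for shouldFireRecap()
  thresholdMs: number;
  now: number;
}

function decideRecapAction(s: RecapGateState): RecapAction {
  if (!s.enabled || !s.hasConfig) return "reset";
  if (!s.isFocused) return s.blurredAt === null ? "start-blur" : "noop";
  if (s.blurredAt === null) return "noop";
  if (s.now - s.blurredAt < s.thresholdMs) return "clear-blur";
  if (s.recapPending) return "noop";
  if (!s.isIdle) return "wait-for-idle";
  if (!s.enoughNewTurns) return "clear-blur";
  return "fire";
}
```

Keeping the order of checks identical to the table rows makes it easy to see why a long blur that ends mid-turn waits rather than resets.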
/recap gating

CommandContext.ui.isIdleRef exposes the current stream state (mirroring the existing btwAbortControllerRef pattern). In interactive mode, recapCommand refuses when !isIdleRef.current or pendingItem !== null. pendingItem alone is insufficient because a normal model reply runs with streamingState === Responding and a null pendingItem.
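
The manual-command gate amounts to a two-condition check. A minimal sketch, assuming a hypothetical `canRunRecapSketch` helper; the real command reads these values from its command context rather than taking them as arguments.

```typescript
// Hypothetical predicate; parameter names mirror the fields named in this document.
function canRunRecapSketch(isIdle: boolean, pendingItem: object | null): boolean {
  // Checking pendingItem alone would wrongly allow /recap while the model is
  // streaming a plain reply (Responding state with pendingItem === null).
  return isIdle && pendingItem === null;
}
```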
| Setting | Default | Notes |
|---|---|---|
| general.showSessionRecap | false | Auto-trigger only. Manual /recap ignores this. |
| general.sessionRecapAwayThresholdMinutes | 5 | Minutes blurred before auto-recap fires on focus-in. Matches Claude Code's default. |
| fastModel | unset | Recommended (e.g. qwen3-coder-flash) for fast and cheap recaps. |
The model is chosen by config.getFastModel() ?? config.getModel():

- fastModel set and valid for the current auth type → use fastModel.
- Otherwise → fall back to the main model (config.getModel()).

createDebugLogger('SESSION_RECAP') emits diagnostics, with failures logged via debugLogger.warn.

All failures are silent from the user's point of view: recap is an auxiliary feature and never throws into the UI. Developers can grep for the [SESSION_RECAP] tag in the debug log file, written by default to `~/.qwen/debug/<sessionId>.txt` (latest.txt symlinks to the current session); disable via QWEN_DEBUG_LOG_FILE=0.
| Item | Why not |
|---|---|
| Progress UI for /recap (spinner / pendingItem) | 3-5 second wait is tolerable; adds complexity. |
| Automated tests | Service is small (~150 lines), end-to-end tested manually first; unit tests can land in a separate PR. |
| Localized prompts | The system prompt is for the model; English is the most reliable substrate. The model selects the output language from the conversation. |
| QWEN_CODE_ENABLE_AWAY_SUMMARY env var | Claude Code uses it to keep the feature on when telemetry is disabled; Qwen Code's current telemetry model doesn't need this. |
| Auto-recap on /resume completion | A natural follow-up but needs a hook point in useResumeCommand; out of scope for this PR. |