Back to Qwen Code

Session Recap Design

docs/design/session-recap/session-recap-design.md

0.15.615.5 KB
Original Source

Session Recap Design

A brief (1-2 sentence) "where did I leave off" summary surfaced when the user returns to an idle session, either on demand (/recap) or after the terminal has been blurred for 5+ minutes.

Overview

When a user /resumes an old session days later, scrolling back through pages of history to remember what they were doing and what came next is a real friction point. Just reloading messages does not solve this UX problem.

The goal is to proactively surface a brief 1-2 sentence recap when the user returns:

  • High-level task (what they are doing) → next step (what to do next).
  • Visually distinct from real assistant replies, so it is never mistaken for new model output.
  • Best-effort: failures must be silent and never break the main flow.

Triggers

TriggerConditionsImplementation
ManualUser runs /recaprecapCommand.ts calls the same underlying service
AutoTerminal blurred (DECSET 1004 focus protocol) for ≥ 5 min + focus returns + stream is IdleuseAwaySummary.ts — 5min blur timer + useFocus event listener

Both paths funnel into a single function — generateSessionRecap() — to guarantee identical behavior. The auto-trigger is gated by general.showSessionRecap (default: off — explicit opt-in, so ambient LLM calls are never silently added to a user's bill); the manual command ignores that setting.

Architecture

┌────────────────────────────────────────────────────────────────────────┐
│                          AppContainer.tsx                              │
│   isFocused = useFocus()                                               │
│   isIdle = streamingState === Idle                                     │
│       │                                                                │
│       ├─→ useAwaySummary({enabled, config, isFocused, isIdle,          │
│       │       │             addItem})                                  │
│       │       └─→ 5 min blur timer + idle/dedupe gates                 │
│       │              │                                                 │
│       │              ↓                                                 │
│       └─→ recapCommand (slash) ─→ generateSessionRecap(config, signal) │
│                                          │                             │
│                                          ↓                             │
│                              ┌─────────────────────────┐               │
│                              │ packages/core/services/ │               │
│                              │   sessionRecap.ts       │               │
│                              └─────────────────────────┘               │
│                                          │                             │
│                                          ↓                             │
│                              GeminiClient.generateContent              │
│                              (fastModel + tools:[])                    │
│                                                                        │
│   addItem({type: 'away_recap', text}) ─→ HistoryItemDisplay            │
│       └─ AwayRecapMessage rendered inline like any other history       │
│         item (※ + bold "recap: " + italic content, all dim);           │
│         scrolls naturally with the conversation. Mirrors Claude        │
│         Code's away_summary system message.                            │
└────────────────────────────────────────────────────────────────────────┘

Files

FileResponsibility
packages/core/src/services/sessionRecap.tsOne-shot LLM call + history filter + tag extraction
packages/cli/src/ui/hooks/useAwaySummary.tsAuto-trigger React hook
packages/cli/src/ui/commands/recapCommand.ts/recap manual entry point
packages/cli/src/ui/components/messages/StatusMessages.tsxAwayRecapMessage renderer ( + bold recap: + italic content, all dim)
packages/cli/src/ui/types.tsHistoryItemAwayRecap type
packages/cli/src/ui/components/HistoryItemDisplay.tsxDispatches away_recap history items to the renderer
packages/cli/src/config/settingsSchema.tsgeneral.showSessionRecap + general.sessionRecapAwayThresholdMinutes settings

Prompt Design

System Prompt

generationConfig.systemInstruction replaces the main agent's system prompt for this single call, so the model behaves only as a recap generator and not as a coding assistant.

Note that GeminiClient.generateContent() internally runs the prompt through getCustomSystemPrompt(), which appends the user's memory (QWEN.md / managed auto-memory) as a suffix. The final system prompt is therefore recap prompt + user memory — useful project context for the recap, not a leak.

Bullets below correspond 1:1 with RECAP_SYSTEM_PROMPT:

  • Under 40 words, 1-2 plain sentences (no markdown / lists / headings). For Chinese, treat the budget as roughly 80 characters total.
  • First sentence: the high-level task. Then: the concrete next step.
  • Explicitly forbid: listing what was done, reciting tool calls, status reports.
  • Match the dominant language of the conversation (English or Chinese).
  • Wrap output in <recap>...</recap>; nothing outside the tags.

Structured Output + Extraction

The model is instructed to wrap its answer in <recap>...</recap>:

<recap>Refactoring loopDetectionService.ts to address long-session OOM. Next step is to implement option B.</recap>

Why: some models (GLM family, reasoning models) write a "thinking" paragraph before the final answer. Returning the raw text would leak that reasoning into the UI.

extractRecap() has three fallback tiers:

  1. Both tags present: take what is between <recap>...</recap> (preferred).
  2. Only the open tag (e.g. maxOutputTokens truncated the close tag): take everything after the open tag.
  3. Tag missing entirely: return empty string → service returns null → UI renders nothing.

The third tier is "skip rather than show the wrong thing" — surfacing the model's reasoning preamble is worse than showing no recap at all.

Call Parameters

ParameterValueReason
modelgetFastModel() ?? getModel()Recap doesn't need a frontier model
tools[]One-shot query, no tool use
maxOutputTokens300Headroom for 1-2 short sentences + tags
temperature0.3Mostly deterministic, with a bit of natural variation
systemInstructionThe recap-only prompt aboveReplaces the main agent's role definition

History Filtering

geminiClient.getChat().getHistory() returns a Content[] that includes:

  • user / model text messages
  • model functionCall parts
  • user functionResponse parts (which can hold full file contents)
  • model thought parts (part.thought / part.thoughtSignature, the model's hidden reasoning)

filterToDialog() keeps only user / model parts that have non-empty text and are not thoughts. Two reasons:

  • Tool calls / responses: a single functionResponse can be 10K+ tokens. 30 such messages would drown the recap LLM in irrelevant detail, both wasting tokens and biasing the recap toward implementation noise like "called X tool to read Y file".
  • Thought parts: carry the model's internal reasoning. Including them risks treating hidden chain-of-thought as dialogue and surfacing it in the recap text.

After dropping empty messages, takeRecentDialog slices to the last 30 messages and refuses to start the slice on a dangling model/tool response.

Concurrency and Edge Cases

Auto-trigger hook state machine

useAwaySummary keeps three refs:

RefMeaning
blurredAtRefBlur start time (not cleared until focus returns)
recapPendingRefWhether an LLM call is in flight
inFlightRefThe current in-flight AbortController

useEffect deps: [enabled, config, isFocused, isIdle, addItem, thresholdMs].

EventAction
!enabled || !configAbort in-flight call + clear inFlightRef + clear blurredAtRef
!isFocused and blurredAtRef === nullSet blurredAtRef = Date.now()
isFocused and blurredAtRef === nullReturn early (no blur cycle to handle — first render or right after a brief-blur reset)
isFocused and blur duration < 5 minClear blurredAtRef, wait for next blur cycle
isFocused and blur ≥ 5 min and recapPendingRefReturn (dedupe)
isFocused and blur ≥ 5 min and !isIdlePreserve blurredAtRef and wait for the turn to finish (isIdle is in the deps, so the effect re-fires when streaming completes)
isFocused and blur ≥ 5 min and shouldFireRecap returns falseClear blurredAtRef and return — conversation hasn't moved enough since the last recap (≥ 2 user turns required, mirrors Claude Code)
isFocused and all conditions metClear blurredAtRef, set recapPendingRef = true, create AbortController, send the LLM request

The .then callback re-checks isIdleRef.current: if the user has started a new turn while the LLM was running, the late-arriving recap is dropped to avoid inserting it mid-turn.

The .finally clears recapPendingRef, and clears inFlightRef only if inFlightRef.current === controller (so it doesn't overwrite a newer controller).

A second useEffect aborts the in-flight controller on unmount.

/recap gating

CommandContext.ui.isIdleRef exposes the current stream state (mirroring the existing btwAbortControllerRef pattern). In interactive mode, recapCommand refuses when !isIdleRef.current or pendingItem !== null. pendingItem alone is insufficient because a normal model reply runs with streamingState === Responding and a null pendingItem.

Configuration and Model Selection

User-facing knobs

SettingDefaultNotes
general.showSessionRecapfalseAuto-trigger only. Manual /recap ignores this.
general.sessionRecapAwayThresholdMinutes5Minutes blurred before auto-recap fires on focus-in. Matches Claude Code's default.
fastModelunsetRecommended (e.g. qwen3-coder-flash) for fast and cheap recaps.

Model fallback

config.getFastModel() ?? config.getModel():

  • User has a fastModel set and it is valid for the current auth type → use fastModel.
  • Otherwise → fall back to the main session model (works, just costlier and slower).

Observability

createDebugLogger('SESSION_RECAP') emits:

  • caught exceptions from the recap path (debugLogger.warn).

All failures are fully transparent to the user — recap is an auxiliary feature and never throws into the UI. Developers can grep for the [SESSION_RECAP] tag in the debug log file: written by default to ~/.qwen/debug/<sessionId>.txt (latest.txt symlinks to the current session); disable via QWEN_DEBUG_LOG_FILE=0.

Out of Scope

ItemWhy not
Progress UI for /recap (spinner / pendingItem)3-5 second wait is tolerable; adds complexity.
Automated testsService is small (~150 lines), end-to-end tested manually first; unit tests can land in a separate PR.
Localized promptsThe system prompt is for the model; English is the most reliable substrate. The model selects the output language from the conversation.
QWEN_CODE_ENABLE_AWAY_SUMMARY env varClaude Code uses it to keep the feature on when telemetry is disabled; Qwen Code's current telemetry model doesn't need this.
Auto-recap on /resume completionA natural follow-up but needs a hook point in useResumeCommand; out of scope for this PR.