plans/2026-06-20-todo-ignore-unparseable-observer-output.md
Status: Not started. Spec for a follow-up cleanup pass. Owner principle (from @thedotmack): The observer parser should IGNORE any output it can't parse. That is the whole job. It must NOT inspect the content of unparseable output and act on it.
When the observer SDK returns something that isn't parseable <observation>/<summary> XML, the code is supposed to drop it. Instead, a substring classifier labels some of it poisoned and kills + respawns the SDK session. "Poisoned" is decided by a case-insensitive includes() against 11 hardcoded English phrases (session exhausted, context window, prompt is too long, this session has ended, …) — see src/sdk/output-classifier.ts:20-32.
This is a brittle keyword match standing in for "ignore what you can't parse," and it over-fires badly.
poisoned aborts as a share of all compression turns, by model:
| Model | poisoned % of turns | context |
|---|---|---|
| haiku (4.5) | 41.1% | 200K |
| claude-haiku-4-5-20251001 | 33.9% (9.0M turns) | 200K |
| claude-sonnet-4-5 | 17.7% | 200K–1M |
| claude-sonnet-4-6 | 9.6% | 200K–1M |
| claude-opus-4-7 | 5.9% | 200K |
| gemini-2.5-flash-lite | 0.1% | 1M |
| openai/gpt-oss-120b:free | 0.1% | — |
| gemini-3-flash-preview / xiaomi mimo-flash:free | ~0% | small |
Total ≈ 13.2M poisoned aborts / 30d ≈ 22% of all turns.
This is the inverse of real context exhaustion. If "poisoned" meant running out of room, the small-context models would top the list. Instead they're ~0% and the large-context Claude models are 6–41%. The classifier is matching Claude's closure-phrase wording (e.g. "I cannot continue this session"), not session health. Net effect: claude-mem kills+respawns ~1/3 of Haiku sessions on a keyword false positive, throwing away work and re-spending tokens.
(Note: the word "poisoned" is also just wrong — it implies tainted/corrupted input, not a wedged session. The whole concept should go.)
src/services/worker/agents/ResponseProcessor.ts
:56-120 — the !parsed.valid branch. The correct behavior already exists at :115-120 ("Plain-text skip responses are intentionally ignored" → confirmClaimedMessages drops the batch, returns).:75-77 — the offending trigger:
const mustRespawn =
outputClass === 'poisoned' || // ← remove
session.consecutiveInvalidOutputs >= INVALID_OUTPUT_RESPAWN_THRESHOLD; // ← see decision
:79-113 — the respawn + telemetry block. :26 — INVALID_OUTPUT_RESPAWN_THRESHOLD = 3.src/sdk/output-classifier.ts — POISONED_MARKERS (:20-32), poisoned in the ObserverOutputClass union (:13), the precedence check (:67-73).src/services/worker/SessionManager.ts — respawnPoisonedSession (:252), session.abortReason = 'poisoned' (:273).src/services/worker/http/routes/SessionRoutes.ts:37,44 — abort_reason enum mapping incl. poisoned.src/services/telemetry/scrub.ts:90-99 — enum doc comments + whitelisted keys invalid_output_class, abort_reason, respawn_triggered, consecutive_invalid_outputs.src/npx-cli/commands/telemetry.ts:71,74 — field docs listing poisoned.output-classifier tests + any ResponseProcessor respawn tests.output-classifier.ts — delete POISONED_MARKERS and the poisoned class. Classifier collapses to xml (parseable) vs not. Keep previewOutput() for log visibility only — it must never drive behavior.ResponseProcessor.ts — remove the whole mustRespawn/respawn block (:71-113). Unparseable output always falls through to the existing drop path (:115-120). Keep the warn-log + previewOutput so drops stay visible.SessionManager.respawnPoisonedSession — remove if it becomes dead code (confirm no other callers; respawnPoisonedSession is referenced from the Phase-2 buffer-flush guards — do NOT flush a rollup for it; just delete the call path cleanly).scrub.ts / SessionRoutes.ts / CLI docs — drop the poisoned enum value and, if respawn goes entirely, the now-unused respawn_triggered / invalid_output_class / abort_reason keys (or keep them whitelisted but unused — safer to keep keys, remove only the poisoned value from docs).Why pure-ignore is loop-safe: the drop path calls confirmClaimedMessages — the batch is NOT re-queued, so a genuinely-stuck session can't loop "until quota exhausted." The 3-strikes respawn was guarding a loop that the drop path already prevents.
Keep only the content-agnostic consecutiveInvalidOutputs >= 3 structural respawn; remove just the outputClass === 'poisoned' || clause and the POISONED_MARKERS list. This kills the 33–41% false positives but retains a structural circuit-breaker. (Owner leans toward pure-ignore above.)
bun test (output-classifier + ResponseProcessor + telemetry suites) green; tsc --noEmit clean.grep -rin "poison" src returns nothing behavioral (only history/comments if intentionally kept).invalid_output_class='poisoned' / abort_reason='poisoned' volume goes to zero on updated installs; Haiku turn throughput should rise (fewer needless respawns).The "Session abort reasons" and "Compression invalid-output classes" tiles on dashboard 1739781 will lose the poisoned slice as installs update — expected. No tile change needed; the decay itself is the confirmation signal.