v3/@claude-flow/guidance/docs/adrs/ADR-G024-continue-gate.md
Status: Accepted
Date: 2026-02-01
Author: Guidance Control Plane Team
Existing gates are tool-centric: PreToolUse, PreCommand, PreEdit. They evaluate individual actions. But long-run failures are rarely a single bad tool call. They are internally generated loops where the agent keeps going — redoing work, burning tokens, drifting from the goal — without any individual step being obviously wrong. There is no gate for "should this agent continue at all?"
The CoherenceScheduler (G015) throttles privilege based on accumulated violations, but it does not evaluate next-step intent. The EconomicGovernor tracks budget consumption but does not detect acceleration. Neither checks whether the agent is stuck in a rework loop.
Introduce ContinueGate — a step-level gate that evaluates whether a long-running agent should proceed to its next step.
Decision Types (priority order):
| Decision | Trigger | Effect |
|---|---|---|
| stop | Coherence below threshold, step limit, budget exhausted | Halt immediately |
| pause | Rework ratio > 30%, uncertainty > 80% | Stop, await human review |
| throttle | Budget slope accelerating > 2%/step | Slow down, insert delays |
| checkpoint | N steps since last checkpoint | Save state before continuing |
| continue | All checks pass | Proceed normally |
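
The priority order in the table can be read as a chain of checks where the first match wins. The following is a minimal sketch of that reading, not the gate's actual implementation. The names `decide` and `GateConfig` are hypothetical, as are several details the ADR does not spell out: the rework ratio is taken to be `reworkCount / stepNumber`, the budget counts as exhausted when any dimension reaches zero, the step limit compares `stepNumber` against `maxConsecutiveSteps`, and the budget slope is assumed to be computed elsewhere and passed in.

```typescript
type Decision = "stop" | "pause" | "throttle" | "checkpoint" | "continue";

// Field names follow the ADR's StepContext; the exact shapes are assumptions.
interface StepContext {
  stepNumber: number;
  reworkCount: number;
  coherenceScore: number;   // 0..1
  uncertaintyScore: number; // 0..1
  budgetRemaining: { tokens: number; toolCalls: number; timeMs: number };
  lastCheckpointStep: number;
}

// Hypothetical config shape; the thresholds are supplied by the caller.
interface GateConfig {
  maxConsecutiveSteps: number;
  checkpointIntervalSteps: number;
  minCoherenceForContinue: number;
  maxUncertaintyForContinue: number;
  maxReworkRatio: number;
  maxBudgetSlopePerStep: number;
}

// Checks run in priority order; the first triggered decision is returned.
function decide(ctx: StepContext, cfg: GateConfig, budgetSlope: number): Decision {
  const b = ctx.budgetRemaining;
  // Assumption: running out of any single budget dimension counts as exhausted.
  const budgetExhausted = b.tokens <= 0 || b.toolCalls <= 0 || b.timeMs <= 0;
  if (
    ctx.coherenceScore < cfg.minCoherenceForContinue ||
    ctx.stepNumber >= cfg.maxConsecutiveSteps ||
    budgetExhausted
  ) {
    return "stop";
  }

  // Assumption: rework ratio is rework steps divided by total steps so far.
  const reworkRatio = ctx.stepNumber > 0 ? ctx.reworkCount / ctx.stepNumber : 0;
  if (reworkRatio > cfg.maxReworkRatio || ctx.uncertaintyScore > cfg.maxUncertaintyForContinue) {
    return "pause";
  }

  if (budgetSlope > cfg.maxBudgetSlopePerStep) {
    return "throttle";
  }

  if (ctx.stepNumber - ctx.lastCheckpointStep >= cfg.checkpointIntervalSteps) {
    return "checkpoint";
  }

  return "continue";
}
```

Because the checks short-circuit, a step that would trigger both `stop` and `pause` reports only `stop`, which matches the stated priority order.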
Evaluation Inputs (StepContext):
- stepNumber, totalTokensUsed, totalToolCalls
- reworkCount: steps that redo previous work
- coherenceScore: from CoherenceScheduler (0–1)
- uncertaintyScore: from UncertaintyAggregator (0–1)
- budgetRemaining: tokens, tool calls, time
- lastCheckpointStep: step number of last checkpoint

Budget Slope Detection:
Linear regression over the last 10 steps of token usage. If the slope is positive and exceeds maxBudgetSlopePerStep (default 0.02), the agent is accelerating spend — typically a sign of retry loops or expanding context.
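
The slope check described above can be sketched with an ordinary least-squares fit over the most recent window. This is an illustration under assumptions, not the gate's code: the function names `slopePerStep` and `isAcceleratingSpend` are hypothetical, and the input is assumed to be per-step token usage already normalized to a fraction of the total budget, so that a slope of 0.02 corresponds to 2% per step. The ADR does not state the unit, so that normalization is a guess.

```typescript
// Ordinary least-squares slope of the values against x = 0, 1, ..., n-1.
function slopePerStep(values: number[]): number {
  const n = values.length;
  if (n < 2) return 0; // a slope needs at least two points
  const meanX = (n - 1) / 2;
  const meanY = values.reduce((sum, y) => sum + y, 0) / n;
  let num = 0;
  let den = 0;
  values.forEach((y, x) => {
    num += (x - meanX) * (y - meanY);
    den += (x - meanX) ** 2;
  });
  return num / den;
}

// Assumption: usagePerStep holds per-step usage as a fraction of the budget.
function isAcceleratingSpend(
  usagePerStep: number[],
  maxBudgetSlopePerStep = 0.02,
  windowSize = 10,
): boolean {
  const recent = usagePerStep.slice(-windowSize); // last 10 steps by default
  return slopePerStep(recent) > maxBudgetSlopePerStep;
}
```

A flat series yields a slope of zero and never trips the check, while a series that grows by more than the threshold each step does. With fewer than two data points the sketch returns a slope of zero rather than guessing.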
Defaults:
- maxConsecutiveSteps: 100
- checkpointIntervalSteps: 25
- minCoherenceForContinue: 0.4
- maxUncertaintyForContinue: 0.8
- maxReworkRatio: 0.3
- cooldownMs: 5000

The gate maintains an evaluation history (max 10,000 entries) and provides aggregate statistics for monitoring.
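
One way to keep a capped evaluation history with aggregate statistics is a simple bounded buffer. The sketch below is illustrative only: `EvaluationHistory`, `EvaluationRecord`, and the per-decision counts are hypothetical names and shapes. The ADR states the 10,000-entry cap but not the eviction policy or which statistics are exposed, so this sketch assumes the oldest entry is dropped first.

```typescript
type Decision = "stop" | "pause" | "throttle" | "checkpoint" | "continue";

// Hypothetical record shape; the ADR does not define what each entry stores.
interface EvaluationRecord {
  stepNumber: number;
  decision: Decision;
}

class EvaluationHistory {
  private readonly records: EvaluationRecord[] = [];
  private readonly maxEntries: number;

  constructor(maxEntries: number = 10_000) {
    this.maxEntries = maxEntries;
  }

  // Append a record; assumption: the oldest entry is evicted once the cap is hit.
  record(entry: EvaluationRecord): void {
    this.records.push(entry);
    if (this.records.length > this.maxEntries) {
      this.records.shift();
    }
  }

  get size(): number {
    return this.records.length;
  }

  // Count how often each decision appears in the retained history.
  stats(): Record<Decision, number> {
    const counts: Record<Decision, number> = {
      stop: 0,
      pause: 0,
      throttle: 0,
      checkpoint: 0,
      continue: 0,
    };
    for (const r of this.records) {
      counts[r.decision] += 1;
    }
    return counts;
  }
}
```

`Array.prototype.shift` is linear in the array length, which is acceptable at this size. A ring buffer would avoid the copy if evaluation frequency made it matter.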