v3/@claude-flow/guidance/docs/adrs/ADR-G004-four-enforcement-gates.md
Status: Accepted
Date: 2026-02-01
Claude Code can execute arbitrary tool calls: bash commands, file edits, file writes, MCP tool invocations, and task spawns. When operating autonomously (in swarms, long daemon sessions, or headless mode), the model may:
- Execute destructive commands (rm -rf /, git push --force origin main, DROP DATABASE)

The model's adherence to CLAUDE.md rules is probabilistic. Rules in the context window are suggestions -- the model can and does ignore them, especially in long sessions where attention degrades. We need gates that are synchronous, mandatory, and non-bypassable by the model.
The gates must be configurable (teams have different risk tolerances), produce structured results (for ledger logging), and reference active guidance rules (for traceability).
Implement exactly four enforcement gates in the EnforcementGates class (src/gates.ts), each covering a distinct high-risk category:
Destructive operations gate (evaluateDestructiveOps)

Trigger: Regex match against destructivePatterns in the GateConfig. Default patterns:
/\brm\s+-rf?\b/i,
/\bdrop\s+(database|table|schema|index)\b/i,
/\btruncate\s+table\b/i,
/\bgit\s+push\s+.*--force\b/i,
/\bgit\s+reset\s+--hard\b/i,
/\bgit\s+clean\s+-fd?\b/i,
/\bformat\s+[a-z]:/i,
/\bdel\s+\/[sf]\b/i,
/\b(?:kubectl|helm)\s+delete\s+(?:--all|namespace)\b/i,
/\bDROP\s+(?:DATABASE|TABLE|SCHEMA)\b/i,
/\bDELETE\s+FROM\s+\w+\s*$/i,
/\bALTER\s+TABLE\s+\w+\s+DROP\b/i,
Decision: require-confirmation. The operation is not blocked outright but requires explicit human confirmation and a documented rollback plan.
Remediation: The gate response includes three-step remediation: confirm intention, document rollback plan, ensure migration has a down step.
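
The trigger, decision, and remediation described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/gates.ts: the GateResult shape shown here is a hypothetical simplification, only a subset of the default patterns is included, and returning null when nothing matches is an assumption.

```typescript
// Hypothetical, simplified shapes. The real types live in src/types.ts and may differ.
type GateDecision = "allow" | "warn" | "require-confirmation" | "block";

interface GateResult {
  gateName: string;
  decision: GateDecision;
  reason: string;
  remediation?: string[];
}

// A subset of the default patterns listed above.
const destructivePatterns: RegExp[] = [
  /\brm\s+-rf?\b/i,
  /\bdrop\s+(database|table|schema|index)\b/i,
  /\bgit\s+push\s+.*--force\b/i,
  /\bgit\s+reset\s+--hard\b/i,
];

// Returns a result when a pattern matches, or null when the gate does not fire
// (the null convention is an assumption made for this sketch).
function evaluateDestructiveOps(
  command: string,
  patterns: RegExp[] = destructivePatterns,
): GateResult | null {
  const hit = patterns.find((p) => p.test(command));
  if (!hit) return null;
  return {
    gateName: "destructive-ops",
    decision: "require-confirmation",
    reason: `Command matches destructive pattern ${hit.source}`,
    remediation: [
      "Confirm the operation is intentional",
      "Document a rollback plan",
      "Ensure any migration has a down step",
    ],
  };
}
```

Because the match is purely lexical, a command such as rm -rf ./tmp/cache triggers the gate just as rm -rf / does, which is why the decision is require-confirmation rather than block.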
Tool allowlist gate (evaluateToolAllowlist)

Trigger: Tool name not found in the allowedTools array. Supports exact match, wildcard prefix (mcp_*), and universal wildcard (*).
Decision: block. Unapproved tools are blocked entirely.
Default state: Disabled (toolAllowlist: false). When enabled with an explicit allow list, only listed tools can be used. This is intended for high-security environments.
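
The three matching modes (exact, wildcard prefix, universal wildcard) could be implemented along these lines. The function name isToolAllowed and the tool names in the usage note are hypothetical; the actual matching logic in EnforcementGates may be structured differently.

```typescript
// Hypothetical helper: returns true when toolName is permitted by the allow list.
function isToolAllowed(toolName: string, allowedTools: string[]): boolean {
  return allowedTools.some((entry) => {
    // Universal wildcard: everything is allowed.
    if (entry === "*") return true;
    // Wildcard prefix such as "mcp_*": compare against the text before the asterisk.
    if (entry.endsWith("*")) return toolName.startsWith(entry.slice(0, -1));
    // Otherwise require an exact match.
    return entry === toolName;
  });
}
```

With an allow list of ["Read", "mcp_*"], a tool named mcp_search would pass and a tool named Bash would be blocked. An empty allow list blocks every tool in this sketch.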
Diff size gate (evaluateDiffSize)

Trigger: diffLines > diffSizeThreshold (default: 300 lines).
Decision: warn. The operation proceeds but the model is instructed to create a plan, stage changes incrementally, run tests after each stage, and consider splitting into multiple PRs.
Rationale for warn vs. block: Large diffs are not inherently dangerous; they are a code smell. Blocking would prevent legitimate refactoring. The warning ensures the model is aware and plans accordingly.
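
A minimal sketch of the threshold check follows, assuming a simplified result shape. The DiffSizeResult type is hypothetical, and returning an explicit allow result below the threshold is an assumption of this sketch.

```typescript
// Hypothetical, simplified result shape for this gate only.
interface DiffSizeResult {
  gateName: "diff-size";
  decision: "allow" | "warn";
  remediation: string[];
}

function evaluateDiffSize(diffLines: number, diffSizeThreshold: number = 300): DiffSizeResult {
  // The gate fires only when the diff is strictly larger than the threshold.
  if (diffLines <= diffSizeThreshold) {
    return { gateName: "diff-size", decision: "allow", remediation: [] };
  }
  return {
    gateName: "diff-size",
    decision: "warn",
    remediation: [
      "Create a plan before editing",
      "Stage changes incrementally",
      "Run tests after each stage",
      "Consider splitting into multiple PRs",
    ],
  };
}
```

Note that a diff of exactly 300 lines does not trigger the warning under the default threshold, since the comparison is strict.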
Secrets gate (evaluateSecrets)

Trigger: Regex match against secretPatterns in content. Default patterns cover:

- API keys (api_key=, apikey=)
- Credentials (password=, secret=)
- Provider tokens: sk-* (Anthropic/OpenAI), ghp_* (GitHub), npm_* (npm), AKIA* (AWS)

Decision: block. Secrets must never be committed or exposed.
Redaction: Detected secrets are partially redacted in the gate result (first 4 chars + asterisks + last 4 chars) to aid debugging without exposing the full value.
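
The redaction rule can be illustrated with a small helper. The name redactSecret is hypothetical, and two details are assumptions not stated above: the number of asterisks equals the number of hidden characters, and values of eight characters or fewer are masked entirely so that nothing leaks.

```typescript
// Hypothetical helper: keep the first 4 and last 4 characters, mask the rest.
function redactSecret(secret: string): string {
  // Assumption: values too short to have a hidden middle are fully masked.
  if (secret.length <= 8) return "*".repeat(secret.length);
  const hidden = secret.length - 8;
  return secret.slice(0, 4) + "*".repeat(hidden) + secret.slice(-4);
}
```

For a made-up token such as ghp_abcdefghijklmnop1234, this keeps ghp_ and 1234 visible and masks the middle, so the prefix still identifies the provider without exposing the value.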
The aggregateDecision() method returns the most restrictive decision across all gate results using a severity hierarchy: block (3) > require-confirmation (2) > warn (1) > allow (0).
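
The severity hierarchy lends itself to a lookup table plus a reduction. The sketch below is an illustration of that idea, not the method's actual body. Returning allow for an empty result list is an assumption.

```typescript
type GateDecision = "allow" | "warn" | "require-confirmation" | "block";

// Severity ranks as stated above: block (3) > require-confirmation (2) > warn (1) > allow (0).
const SEVERITY: Record<GateDecision, number> = {
  allow: 0,
  warn: 1,
  "require-confirmation": 2,
  block: 3,
};

// Returns the most restrictive decision. An empty list yields "allow" (assumption).
function aggregateDecision(results: { decision: GateDecision }[]): GateDecision {
  return results.reduce<GateDecision>(
    (worst, r) => (SEVERITY[r.decision] > SEVERITY[worst] ? r.decision : worst),
    "allow",
  );
}
```

For example, a warn from diff-size combined with a block from secrets aggregates to block, so a single strict gate is enough to stop the operation.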
Three entry points invoke gates in the appropriate combination:
| Entry Point | Gates Invoked |
|---|---|
| evaluateCommand(command) | destructive-ops, secrets |
| evaluateToolUse(toolName, params) | tool-allowlist, secrets |
| evaluateEdit(filePath, content, diffLines) | diff-size, secrets |
Each entry point returns an array of GateResult objects, allowing callers to inspect individual gate decisions.
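
As a rough sketch of how an entry point might combine its gates, the following evaluateCommand runs a destructive-ops check and a secrets check and collects the results. The GateResult shape is simplified, the secret regex is a hypothetical stand-in for the real secretPatterns, and including only triggered gates in the array is an assumption.

```typescript
type GateDecision = "allow" | "warn" | "require-confirmation" | "block";

interface GateResult {
  gateName: string;
  decision: GateDecision;
  reason: string;
}

// One of the default destructive patterns listed earlier in this document.
const destructivePattern = /\brm\s+-rf?\b/i;
// Hypothetical pattern for illustration only; the real secretPatterns are not shown here.
const secretLikePattern = /\b(?:api_?key|password|secret)\s*=\s*\S+/i;

// Runs both gates and returns one result per gate that fired (assumption).
function evaluateCommand(command: string): GateResult[] {
  const results: GateResult[] = [];
  if (destructivePattern.test(command)) {
    results.push({
      gateName: "destructive-ops",
      decision: "require-confirmation",
      reason: "Command matches a destructive pattern",
    });
  }
  if (secretLikePattern.test(command)) {
    results.push({
      gateName: "secrets",
      decision: "block",
      reason: "Command appears to contain a secret",
    });
  }
  return results;
}
```

A caller can then inspect each entry individually or reduce the array to a single decision using the severity hierarchy.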
- [...] block decision.
- Each GateResult includes gateName, reason, triggeredRules, remediation, and metadata. This feeds directly into the run ledger.
- Teams can disable gates (destructiveOps: false), adjust thresholds (diffSizeThreshold: 500), add custom patterns, or define allowlists -- all via GateConfig.
- [...] secretPatterns and add exclusion patterns.
- Regex matching cannot distinguish rm -rf ./tmp/cache (safe) from rm -rf / (catastrophic). Both match the pattern. Mitigation: the gate returns require-confirmation rather than block, allowing the human to approve safe operations.

Add gates for code style, naming conventions, import ordering. Rejected because soft-preference rules do not warrant synchronous blocking. They belong in the retrieval layer (shards) or post-hoc evaluation (ledger evaluators), not in gates.
Only block secrets, warn for everything else. Rejected because destructive operations have irreversible consequences. A warning cannot undo an rm -rf after the fact.
Send the command to a model and ask "is this dangerous?" Rejected because it adds latency (200-500ms per gate check), is non-deterministic, and could itself be manipulated by prompt injection in the command being evaluated.
Assign a risk score and let the model decide whether to proceed. Rejected because the entire point of gates is to remove the model's agency over high-risk decisions. A probability threshold that the model can reason about is no better than a rule in the context window.
- v3/@claude-flow/guidance/src/gates.ts -- EnforcementGates class, GateConfig, default patterns
- v3/@claude-flow/guidance/src/types.ts -- GateDecision, GateResult, GateConfig
- v3/@claude-flow/guidance/src/index.ts -- GuidanceControlPlane.evaluateCommand(), evaluateToolUse(), evaluateEdit()