Back to Mcpproxy Go

Output Sanitisation (Spec 054 Track B)

docs/features/output-sanitisation.md

0.38.12.8 KB
Original Source

Output Sanitisation (Spec 054 Track B)

mcpproxy contains untrusted tool output before it reaches your agent. It builds on the existing content-trust classification (Spec 035, derived from each tool's openWorldHint) and the secret detector (Spec 026), enforcing them at the single response chokepoint instead of only logging.

What it does

BehaviourDefaultTrigger
Spotlight untrusted text in source-identifying delimiters (lossless)offspotlight_untrusted: true
Redact detected secrets → [REDACTED:<category>]offresponse_action: redact
Strip control sequences (ANSI / C0-C1 / zero-width / bidi) on untrusted textoffstrip_control_chars: true
Block the whole response on a critical detectionoffresponse_action: block

The feature is fully opt-in: with no output_sanitisation block (or all options at their defaults) mcpproxy forwards every tool response byte-for-byte — identical to pre-feature behaviour. Each behaviour above is enabled independently. Non-text blocks (image / audio / embedded resource) are never modified.

Spotlighting wraps untrusted text as:

«untrusted:server/tool»
<tool output, with « and » escaped so it cannot forge the delimiter>
«/untrusted:server/tool»

Configuration

json
{
  "output_sanitisation": {
    "spotlight_untrusted": true,             // opt into spotlighting untrusted output
    "response_action": "spotlight",          // "spotlight" | "redact" | "block"
    "strip_control_chars": false,
    "strip_classes": ["ansi", "c0c1", "bidi", "zero_width"],
    "max_redactions": 100
  }
}

Omit the block entirely (or leave spotlight_untrusted: false) and mcpproxy does nothing — fully backward-compatible. Set spotlight_untrusted: true for the lossless wrapper, response_action: redact to mask secrets, or response_action: block to drop critical responses.

Auditing

Redact / strip / block actions emit a policy_decision activity record with a descriptive reason (visible via mcpproxy activity list and the Web UI Activity Log). Pure spotlighting is logged at debug level only. The activity log stores the sanitised response, so redacted secrets are not persisted in the audit trail.

Redaction, stripping, and blocking run before truncation and the read_cache write, so a large response paginated through read_cache carries the same sanitisation as the agent-facing copy (a blocked response is never cached at all). The lossless spotlight wrapper is applied after truncation and is not cached (the cache holds raw, paginable records).

See also: sensitive-data-detection.md (the detector reused for redaction/block) and Spec 056 (output-schema validation, Track A) which shares the same chokepoint.