
ADR-G019: First-Class Uncertainty

v3/@claude-flow/guidance/docs/adrs/ADR-G019-first-class-uncertainty.md



Status: Accepted
Date: 2026-02-01
Author: Guidance Control Plane Team

Context

The existing memory and gate systems treat every value as equally certain. A memory entry written from a reliable API response has the same standing as one inferred from a single ambiguous log line. When agents act on low-confidence data, they produce confident-looking outputs that may be wrong. There is no way to express "I think this is true but I'm not sure" or "two pieces of evidence disagree."

Decision

Introduce UncertaintyLedger and UncertaintyAggregator:

Belief Tracking:

  • Each belief carries a claim, namespace, evidence array, and confidence interval (lower, point, upper)
  • Evidence is directional: supporting or opposing, each with a weight (0-1) and source
  • Status is derived from evidence ratios and confidence:
| Status    | Condition                                 |
|-----------|-------------------------------------------|
| confirmed | confidence >= 0.95, no opposing evidence  |
| probable  | confidence >= 0.8, opposing ratio < 0.3   |
| uncertain | confidence >= 0.5, opposing ratio < 0.3   |
| contested | opposing evidence ratio >= 0.3            |
| refuted   | opposing evidence ratio >= 0.7            |
| unknown   | no evidence                               |
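
The belief shape and status rules above can be sketched as follows. This is a minimal illustration, not the actual UncertaintyLedger code: the interface and function names (`Belief`, `opposingRatio`, `deriveStatus`) are hypothetical, the opposing ratio is assumed to be measured by weight rather than by count, and the rows are assumed to be checked most severe first so that `refuted` takes precedence over `contested` where the two conditions overlap.

```typescript
// Hypothetical shapes; only claim, namespace, evidence, and the
// (lower, point, upper) interval come from the ADR text.
interface Evidence {
  direction: "supporting" | "opposing";
  weight: number; // expected in [0, 1]
  source: string;
}

interface ConfidenceInterval {
  lower: number;
  point: number;
  upper: number;
}

type BeliefStatus =
  | "confirmed"
  | "probable"
  | "uncertain"
  | "contested"
  | "refuted"
  | "unknown";

interface Belief {
  id: string;
  claim: string;
  namespace: string;
  evidence: Evidence[];
  confidence: ConfidenceInterval;
}

// Assumption: the opposing ratio is opposing weight over total weight.
function opposingRatio(evidence: Evidence[]): number {
  let opposing = 0;
  let total = 0;
  for (const e of evidence) {
    total += e.weight;
    if (e.direction === "opposing") opposing += e.weight;
  }
  return total === 0 ? 0 : opposing / total;
}

// Assumption: conditions are evaluated in this order, most severe first.
function deriveStatus(belief: Belief): BeliefStatus {
  if (belief.evidence.length === 0) return "unknown";
  const ratio = opposingRatio(belief.evidence);
  if (ratio >= 0.7) return "refuted";
  if (ratio >= 0.3) return "contested";
  const hasOpposing = belief.evidence.some((e) => e.direction === "opposing");
  const point = belief.confidence.point;
  if (point >= 0.95 && !hasOpposing) return "confirmed";
  if (point >= 0.8) return "probable";
  // The table does not cover confidence below 0.5 with a low opposing
  // ratio; falling back to "uncertain" is an assumption of this sketch.
  return "uncertain";
}
```

Checking `refuted` before `contested` matters because any belief with an opposing ratio of 0.7 or more also satisfies the `contested` condition.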

Confidence Mechanics:

  • recomputeConfidence(): point = supportingWeight / totalWeight, spread = 0.3 / sqrt(evidenceCount)
  • addEvidence() recomputes confidence and re-derives status automatically
  • decayAll(timestamp): confidence decays linearly over time at a configurable rate
  • isActionable(id): returns false if confidence.point < minConfidenceForAction threshold
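
The confidence formulas above can be made concrete with a short sketch. Several details are assumptions rather than facts from the ADR: the interval is taken to be point ± spread clamped to [0, 1], decay is taken to shift the whole interval down by `ratePerHour * elapsedHours`, and the functions operate on plain values instead of belief ids. The names `decayConfidence`, `ratePerHour`, and `elapsedHours` are hypothetical.

```typescript
interface Evidence {
  direction: "supporting" | "opposing";
  weight: number; // expected in [0, 1]
  source: string;
}

interface ConfidenceInterval {
  lower: number;
  point: number;
  upper: number;
}

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// point = supportingWeight / totalWeight, spread = 0.3 / sqrt(evidenceCount)
function recomputeConfidence(evidence: Evidence[]): ConfidenceInterval {
  let supportingWeight = 0;
  let totalWeight = 0;
  for (const e of evidence) {
    totalWeight += e.weight;
    if (e.direction === "supporting") supportingWeight += e.weight;
  }
  // Assumption: with no usable evidence, report a zero point inside the
  // widest possible interval.
  if (totalWeight === 0) return { lower: 0, point: 0, upper: 1 };

  const point = supportingWeight / totalWeight;
  const spread = 0.3 / Math.sqrt(evidence.length);
  // Assumption: the interval is point +/- spread, clamped to [0, 1].
  return { lower: clamp01(point - spread), point, upper: clamp01(point + spread) };
}

// Assumption: linear decay subtracts the same amount from all three bounds.
function decayConfidence(
  c: ConfidenceInterval,
  elapsedHours: number,
  ratePerHour: number,
): ConfidenceInterval {
  const drop = ratePerHour * elapsedHours;
  return {
    lower: clamp01(c.lower - drop),
    point: clamp01(c.point - drop),
    upper: clamp01(c.upper - drop),
  };
}

function isActionable(c: ConfidenceInterval, minConfidenceForAction: number): boolean {
  return c.point >= minConfidenceForAction;
}

// Three supporting and one opposing item, all with weight 1: point is 0.75.
const sampleEvidence: Evidence[] = [
  { direction: "supporting", weight: 1, source: "api" },
  { direction: "supporting", weight: 1, source: "api" },
  { direction: "supporting", weight: 1, source: "log" },
  { direction: "opposing", weight: 1, source: "log" },
];
const sampleConfidence = recomputeConfidence(sampleEvidence);
```

Because the spread scales with 1 / sqrt(evidenceCount), quadrupling the number of evidence items halves the width of the interval.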

Aggregation:

  • aggregate(ids): geometric mean of confidence points (penalizes low-confidence beliefs heavily)
  • worstCase(ids): minimum confidence across a set
  • bestCase(ids): maximum confidence across a set
  • anyContested(ids) / allConfirmed(ids): status-based queries
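
A small sketch shows why the geometric mean penalizes a weak belief more than an arithmetic mean would. For simplicity the functions below take confidence points and statuses directly instead of belief ids, and the handling of empty input (returning 0 or false) is an assumption of this sketch.

```typescript
// Geometric mean computed in log space to avoid underflow on long lists.
function geometricMean(points: number[]): number {
  if (points.length === 0) return 0; // assumption: empty set has no confidence
  let logSum = 0;
  for (const p of points) {
    if (p <= 0) return 0; // a zero-confidence belief zeroes the aggregate
    logSum += Math.log(p);
  }
  return Math.exp(logSum / points.length);
}

function worstCase(points: number[]): number {
  return points.length === 0 ? 0 : Math.min(...points);
}

function bestCase(points: number[]): number {
  return points.length === 0 ? 0 : Math.max(...points);
}

function anyContested(statuses: string[]): boolean {
  return statuses.some((s) => s === "contested");
}

// Assumption: an empty set is not considered "all confirmed".
function allConfirmed(statuses: string[]): boolean {
  return statuses.length > 0 && statuses.every((s) => s === "confirmed");
}

// Two strong beliefs and one weak one.
const samplePoints = [0.9, 0.9, 0.1];
const arithmetic = samplePoints.reduce((a, b) => a + b, 0) / samplePoints.length;
const geometric = geometricMean(samplePoints);
console.log({ arithmetic, geometric });
```

For the sample set, the arithmetic mean is roughly 0.63 while the geometric mean is roughly 0.43, so a single weak belief pulls the aggregate well below the typical actionability range.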

Inference Chains:

  • Beliefs can depend on other beliefs via dependsOn arrays
  • propagateUncertainty(id): propagates confidence drops through dependency chains
  • getInferenceChain(id): returns the full dependency graph for audit
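
The ADR does not specify the propagation rule, so the sketch below assumes one plausible choice: a belief can never be more confident than any belief it depends on, so a drop upstream caps every downstream belief. The `BeliefNode` shape, the use of a `Map` as the store, and the return values are all hypothetical.

```typescript
interface BeliefNode {
  id: string;
  point: number; // confidence point
  dependsOn: string[];
}

// Returns the belief plus everything it transitively depends on,
// in depth-first order. The visited set guards against cycles.
function getInferenceChain(beliefs: Map<string, BeliefNode>, id: string): string[] {
  const visited = new Set<string>();
  const chain: string[] = [];
  const visit = (current: string): void => {
    if (visited.has(current)) return;
    visited.add(current);
    const node = beliefs.get(current);
    if (!node) return; // unknown ids are skipped
    chain.push(current);
    for (const dep of node.dependsOn) visit(dep);
  };
  visit(id);
  return chain;
}

// Assumed rule: cap each dependent at the confidence of the belief it
// depends on, then repeat for that dependent's own dependents.
// Returns the ids whose confidence was lowered.
function propagateUncertainty(beliefs: Map<string, BeliefNode>, id: string): string[] {
  const lowered = new Set<string>();
  const queue: string[] = [id];
  while (queue.length > 0) {
    const sourceId = queue.shift() as string;
    const source = beliefs.get(sourceId);
    if (!source) continue;
    beliefs.forEach((node) => {
      if (!node.dependsOn.includes(sourceId)) return;
      if (node.point > source.point) {
        node.point = source.point;
        lowered.add(node.id);
        queue.push(node.id); // only re-visit beliefs that actually changed
      }
    });
  }
  return Array.from(lowered);
}

// leaf -> mid -> root, plus one unrelated belief.
const store = new Map<string, BeliefNode>([
  ["root", { id: "root", point: 0.4, dependsOn: [] }],
  ["mid", { id: "mid", point: 0.9, dependsOn: ["root"] }],
  ["leaf", { id: "leaf", point: 0.8, dependsOn: ["mid"] }],
  ["other", { id: "other", point: 0.95, dependsOn: [] }],
]);
```

Because a belief is re-queued only when its confidence strictly decreases, the loop terminates even if the dependency graph contains a cycle.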

Consequences

  • Agents can express and reason about uncertainty instead of treating everything as certain
  • Contested beliefs are surfaced automatically before they cause damage
  • Actionability gating prevents decisions on low-confidence data
  • Geometric mean aggregation ensures one weak belief drags down the whole set
  • Inference chains make it possible to trace why a belief is uncertain
  • 74 tests validate status transitions, evidence tracking, decay, aggregation, and inference chains

Alternatives Considered

  • Probability distributions per entry: Too heavy for the common case; confidence intervals are sufficient
  • Bayesian networks: Correct in principle, but they require a full probabilistic programming runtime
  • Simple confidence score (single float): Loses the interval and evidence trail; insufficient for contested detection