.. -*- coding: utf-8 -*-
Agentic Signals are lightweight, model-free behavioral indicators computed from live interaction trajectories and attached to your existing OpenTelemetry traces. They are the instrumentation layer of a closed-loop improvement flywheel for agents — turning raw production traffic into prioritized data that can drive prompt, routing, and model updates without running an LLM-as-judge on every session.
The framework implemented here follows the taxonomy and detector design in
*Signals: Trajectory Sampling and Triage for Agentic Interactions*
(`Chen et al., 2026 <https://arxiv.org/abs/2604.00356>`_). All detectors
are computed without model calls; the entire pipeline attaches structured
attributes and span events to existing spans, so your dashboards and alerts
work unmodified.
Agentic applications are increasingly deployed at scale, yet improving them after deployment remains difficult. Production trajectories are long, numerous, and non-deterministic, making exhaustive human review infeasible and auxiliary LLM evaluation expensive. As a result, teams face a bottleneck: they cannot score every response, inspect every trace, or reliably identify which failures and successes should inform the next model update. Without a low-cost triage layer, the feedback loop from production behavior to model improvement remains incomplete.
Signals close this loop by cheaply identifying which interactions among
millions are worth inspecting: in the paper's evaluation, signal-based
sampling reaches 82% informativeness on :math:`\tau`-bench, compared with 54%
for random sampling, yielding a 1.52× efficiency gain per informative
trajectory.

This loop depends on the first step being nearly free. The framework is
therefore designed around fixed-taxonomy, model-free detectors with
:math:`O(\text{messages})` cost, no online behavior change, and no
dependence on expensive evaluator models. By making production traces
searchable and sampleable at scale, signals turn raw agent telemetry into a
practical model-optimization flywheel.
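To make the "model-free detector" idea concrete, here is a minimal sketch of a phrase-based escalation detector that scans each message once (:math:`O(\text{messages})`). The phrase list, function name, and output shape are illustrative assumptions; the real detector set lives in the reference implementation.

```python
import re

# Illustrative phrase list for one leaf type (not the published contract).
ESCALATION_PATTERNS = re.compile(
    r"\b(talk to a human|speak to (an? )?(agent|person|human)|get me a human)\b",
    re.IGNORECASE,
)

def detect_escalation(messages):
    """Scan user messages in a single pass and return signal instances."""
    hits = []
    for i, msg in enumerate(messages):
        if msg["role"] != "user":
            continue
        m = ESCALATION_PATTERNS.search(msg["content"])
        if m:
            hits.append({
                "type": "interaction.disengagement.escalation",
                "message_index": i,
                "confidence": 1.0,  # deterministic pattern match
                "snippet": m.group(0),
            })
    return hits
```

Because every detector is a fixed scan like this, the whole pipeline stays cheap enough to run on every request.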
Behavioral signals are canaries in the coal mine — early, objective indicators that something may have gone wrong (or gone exceptionally well). They don't explain why an agent failed, but they reliably signal where attention is needed.
These signals emerge naturally from the rhythm of the interaction itself.
Individually, these clues are shallow; together, they form a fingerprint of agent performance. Embedded directly into traces, they make it easy to spot friction as it happens: where users struggle, where agents loop, where tool failures cluster, and where escalations occur.
Signals are organized into three top-level layers, each with its own
intent. Every detected signal belongs to exactly one leaf type under one of
seven categories. The per-category summaries and leaf-type descriptions
below are borrowed verbatim from the reference implementation at
`katanemo/signals <https://github.com/katanemo/signals>`_ to keep the
documentation and the detector contract in sync.
Misalignment — Misalignment signals capture semantic or intent mismatch between the user and the agent, such as rephrasing, corrections, clarifications, and restated constraints. These signals do not assert that either party is "wrong"; they only indicate that shared understanding has not yet been established.
.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Leaf type
     - Description
   * - ``misalignment.correction``
     -
   * - ``misalignment.rephrase``
     -
   * - ``misalignment.clarification``
     -

Stagnation — Stagnation signals capture cases where the discourse continues but fails to make visible progress. This includes near-duplicate assistant responses, circular explanations, repeated scaffolding, and other forms of linguistic degeneration.
.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Leaf type
     - Description
   * - ``stagnation.dragging``
     -
   * - ``stagnation.repetition``
     -

Disengagement — Disengagement signals mark the withdrawal of cooperative intent from the interaction. These include explicit requests to exit the agent flow (e.g., "talk to a human"), strong negative stances, and abandonment markers.
.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Leaf type
     - Description
   * - ``disengagement.escalation``
     -
   * - ``disengagement.quit``
     -
   * - ``disengagement.negative_stance``
     -

Satisfaction — Satisfaction signals indicate explicit stabilization and completion of the interaction. These include expressions of gratitude, success confirmations, and closing utterances. We use these signals to sample exemplar traces rather than to assign quality scores.
.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Leaf type
     - Description
   * - ``satisfaction.gratitude``
     -
   * - ``satisfaction.confirmation``
     -
   * - ``satisfaction.success``
     -

Failure — Detects agent-caused failures in tool/function usage. These
are issues the agent is responsible for (as opposed to environment failures,
which are external system issues). Requires tool-call traces
(``function_call`` / ``observation``) to fire.
.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Leaf type
     - Description
   * - ``execution.failure.invalid_args``
     -
   * - ``execution.failure.bad_query``
     -
   * - ``execution.failure.tool_not_found``
     -
   * - ``execution.failure.auth_misuse``
     -
   * - ``execution.failure.state_error``
     -

Loops — Detects behavioral patterns where the agent gets stuck
repeating tool calls. These are distinct from
``interaction.stagnation`` (conversation text repetition) and
``execution.failure`` (single tool errors) — these detect tool-level
behavioral loops.
.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Leaf type
     - Description
   * - ``execution.loops.retry``
     -
   * - ``execution.loops.parameter_drift``
     -
   * - ``execution.loops.oscillation``
     -

Exhaustion — Detects failures and constraints arising from the surrounding system rather than the agent's internal policy or reasoning. These are external issues the agent cannot control.
.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Leaf type
     - Description
   * - ``environment.exhaustion.api_error``
     -
   * - ``environment.exhaustion.timeout``
     -
   * - ``environment.exhaustion.rate_limit``
     -
   * - ``environment.exhaustion.network``
     -
   * - ``environment.exhaustion.malformed_response``
     -
   * - ``environment.exhaustion.context_overflow``
     -

Signals are computed automatically by the gateway after each assistant response and emitted as OpenTelemetry trace attributes and span events on your existing spans. No additional libraries or instrumentation are required — just configure your OTEL collector endpoint as usual.
Each conversation trace is enriched with layered signal attributes (category-level counts and severities) plus one span event per detected signal instance (with confidence, snippet, and per-detector metadata).
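To make the emission shape concrete, here is a hypothetical sketch of how detected instances could be folded into layered category counts and per-instance events. The helper name and the dict-based span representation are illustrative, not the gateway's actual internals, and severity bucketing is omitted for brevity.

```python
def export_signals(span_attributes, span_events, instances):
    """Fold detected signal instances into layered attributes + span events.

    Each instance carries a dotted leaf type such as
    "interaction.disengagement.escalation" (layer.category.leaf).
    """
    counts = {}
    for inst in instances:
        layer, category, _leaf = inst["type"].split(".", 2)
        key = f"signals.{layer}.{category}"
        counts[key] = counts.get(key, 0) + 1
        # One span event per detected instance, with its evidence.
        span_events.append({
            "name": f"signal.{inst['type']}",
            "attributes": {
                "signal.type": inst["type"],
                "signal.message_index": inst["message_index"],
                "signal.confidence": inst["confidence"],
                "signal.snippet": inst["snippet"],
            },
        })
    # Category-level counts become layered span attributes.
    for key, n in counts.items():
        span_attributes[f"{key}.count"] = n
    return span_attributes, span_events
```

The key point is the two-tier shape: aggregates on the span, evidence on the events.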
.. note::

   Signal analysis is enabled by default and runs on the request path. It
   does not affect the response sent to the client. Set
   ``overrides.disable_signals: true`` in your Plano config to skip this
   CPU-heavy analysis (see the configuration reference).
Signal data is exported as structured OTel attributes. There are two tiers: top-level attributes (always emitted on spans that carry signal analysis) and layered attributes (emitted only when the corresponding category has at least one signal instance).
Always emitted once signals are computed.
.. list-table::
   :header-rows: 1
   :widths: 40 15 45

   * - Attribute
     - Type
     - Description
   * - ``signals.quality``
     - string
     - One of ``excellent``, ``good``, ``neutral``, ``poor``, ``severe``.
   * - ``signals.quality_score``
     - float
     - Raw numeric quality score (0–100).
   * - ``signals.turn_count``
     - int
     - Number of turns in the conversation.
   * - ``signals.efficiency_score``
     - float
     - Computed as ``1 / (1 + 0.3 * (turns - baseline))``.

Emitted per category, only when ``count > 0``. One ``.count`` and one
``.severity`` attribute per category. Severity is a 0–3 bucket (see
Severity levels below).
.. list-table::
   :header-rows: 1
   :widths: 50 50

   * - Attribute
     - Description
   * - ``signals.interaction.misalignment.count``
     - Number of ``misalignment.*`` leaf-type instances.
   * - ``signals.interaction.misalignment.severity``
     - Severity bucket (0–3) for the category.
   * - ``signals.interaction.stagnation.count``
     - Number of ``stagnation.*`` leaf-type instances.
   * - ``signals.interaction.stagnation.severity``
     - Severity bucket (0–3) for the category.
   * - ``signals.interaction.disengagement.count``
     - Number of ``disengagement.*`` leaf-type instances.
   * - ``signals.interaction.disengagement.severity``
     - Severity bucket (0–3) for the category.
   * - ``signals.interaction.satisfaction.count``
     - Number of ``satisfaction.*`` leaf-type instances.
   * - ``signals.interaction.satisfaction.severity``
     - Severity bucket (0–3) for the category.
   * - ``signals.execution.failure.count``
     - Number of ``failure.*`` leaf-type instances.
   * - ``signals.execution.failure.severity``
     - Severity bucket (0–3) for the category.
   * - ``signals.execution.loops.count``
     - Number of ``loops.*`` leaf-type instances.
   * - ``signals.execution.loops.severity``
     - Severity bucket (0–3) for the category.
   * - ``signals.environment.exhaustion.count``
     - Number of ``exhaustion.*`` leaf-type instances.
   * - ``signals.environment.exhaustion.severity``
     - Severity bucket (0–3) for the category.

The following aggregate keys pre-date the paper taxonomy and are still emitted for one release so existing dashboards keep working. They are derived from the layered counts above and will be removed in a future release. Migrate to the layered keys when convenient.
.. list-table::
   :header-rows: 1
   :widths: 50 50

   * - Legacy key
     - Derived from
   * - ``signals.follow_up.repair.count``
     - ``signals.interaction.misalignment.count``
   * - ``signals.follow_up.repair.ratio``
     - ``misalignment.count / max(user_turns, 1)``
   * - ``signals.frustration.count``
     - ``disengagement.negative_stance`` instances
   * - ``signals.frustration.severity``
     -
   * - ``signals.repetition.count``
     - ``signals.interaction.stagnation.count``
   * - ``signals.escalation.requested``
     - ``true`` if ``disengagement.escalation`` or ``disengagement.quit`` fired
   * - ``signals.positive_feedback.count``
     - ``signals.interaction.satisfaction.count``

In addition to span attributes, every detected signal instance is emitted as
a span event named ``signal.<dotted-type>`` (e.g.
``signal.interaction.satisfaction.gratitude``). Each event carries:
.. list-table::
   :header-rows: 1
   :widths: 30 15 55

   * - Event attribute
     - Type
     - Description
   * - ``signal.type``
     - string
     - Dotted leaf type (e.g. ``interaction.disengagement.escalation``).
   * - ``signal.message_index``
     - int
     - Index of the message that triggered the detector.
   * - ``signal.confidence``
     - float
     - Detector confidence for this instance.
   * - ``signal.snippet``
     - string
     - Evidence excerpt from the triggering message.
   * - ``signal.metadata``
     - string
     - Per-detector metadata, JSON-encoded.

Span events are the right surface for drill-down: attribute filters narrow traces, then events tell you which messages fired which signals with what evidence.
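A consumer-side sketch of that drill-down pattern, assuming spans have already been exported as plain dicts with ``attributes`` and ``events`` keys (the span shape here is illustrative, not a specific exporter's format):

```python
def flagged_evidence(spans):
    """Attribute filter first, then per-event evidence extraction."""
    out = []
    for span in spans:
        # Step 1: attribute filters narrow the candidate traces.
        if span["attributes"].get("signals.quality") not in ("poor", "severe"):
            continue
        # Step 2: span events say which messages fired which signals.
        for ev in span["events"]:
            if ev["name"].startswith("signal."):
                a = ev["attributes"]
                out.append((a["signal.type"],
                            a["signal.message_index"],
                            a["signal.snippet"]))
    return out
```

The same two-step shape (filter on attributes, read events for evidence) applies whether you query in code or in your observability platform's UI.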
When concerning signals are detected (disengagement present, stagnation
``count > 2``, any execution failure or loop, or overall quality ``poor`` or
``severe``), the marker 🚩 (U+1F6A9) is appended to the span's operation
name. This makes flagged sessions immediately visible in trace UIs without
requiring attribute filtering.
Example queries against the layered keys::

    signals.quality = "severe"
    signals.turn_count > 10
    signals.efficiency_score < 0.5
    signals.interaction.disengagement.severity >= 2
    signals.interaction.misalignment.count > 3
    signals.interaction.satisfaction.count > 0 AND signals.quality = "good"
    signals.execution.failure.count > 0
    signals.environment.exhaustion.count > 0
For flagged sessions, search for 🚩 in span names.
.. image:: /_static/img/signals_trace.png
   :width: 100%
   :align: center
Every category aggregates its leaf signal counts into a severity bucket used
by both the layered ``.severity`` attribute and the overall quality score.
Severity is always computed per-category. For example, three instances of
``misalignment.rephrase`` plus two of ``misalignment.correction`` yield
``signals.interaction.misalignment.severity = 3`` (5 instances total).
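The exact cut-offs between buckets are an implementation detail of the detector contract. As a sketch, here is a bucketing function with assumed thresholds that are consistent with the worked example above (5 instances map to severity 3); the specific boundaries are an assumption, not the published contract:

```python
def severity_bucket(instance_count: int) -> int:
    """Map a category's instance count to a 0-3 severity bucket.

    Thresholds below are illustrative assumptions; they agree with the
    documented example (5 instances -> severity 3).
    """
    if instance_count == 0:
        return 0          # no signal in this category
    if instance_count <= 2:
        return 1          # mild
    if instance_count <= 4:
        return 2          # moderate
    return 3              # severe
```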
Signals are aggregated into an overall interaction quality on a 5-point scale. The scoring model starts at 50.0 (neutral), adds positive weight for satisfaction, and subtracts weight for disengagement, misalignment (when ratio > 30% of user turns), stagnation (when count > 2), execution failures, execution loops, and environment exhaustion.
The resulting numeric score maps to the bucket emitted in ``signals.quality``:
Excellent (75–100)
   Strong positive signals, efficient resolution, low friction.

Good (60–74)
   Mostly positive with minor clarifications; some back-and-forth but successful.

Neutral (40–59)
   Mixed signals; neither clearly good nor bad.

Poor (25–39)
   Concerning negative patterns (high friction, multiple misalignments, moderate disengagement, tool failures). High abandonment risk.

Severe (0–24)
   Critical issues — escalation requested, severe disengagement, severe stagnation, or compounding failures. Requires immediate attention.
The raw numeric score is available under ``signals.quality_score``.
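The bucket boundaries above translate directly into code; a minimal mapping sketch:

```python
def quality_bucket(score: float) -> str:
    """Map the raw 0-100 quality score to the signals.quality bucket,
    using the ranges documented above."""
    if score >= 75:
        return "excellent"
    if score >= 60:
        return "good"
    if score >= 40:
        return "neutral"
    if score >= 25:
        return "poor"
    return "severe"
```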
In production, trace data is overwhelming. Signals provide a lightweight first layer of triage to select the small fraction of trajectories that are most likely to be informative. Per the paper, signal-based sampling reaches 82% informativeness on τ-bench versus 54% for random sampling — a 1.52× efficiency gain per informative trajectory.
Workflow:

#. Filter production traces on the signal attributes (quality buckets, category counts, severities) to select candidate trajectories.
#. Inspect the sampled trajectories to diagnose failure modes and collect exemplars.
#. Feed what you learn back into prompt, routing, and model updates.
This creates a reinforcement loop where traces become both diagnostic data and training signal for prompt engineering, routing policies, and preference-data construction.
.. note:: An in-gateway triage sampler that selects informative trajectories inline — with configurable per-category weights and budgets — is planned as a follow-up to this release. Today, sampling is consumer-side: your observability platform filters on the signal attributes described above.
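Until the in-gateway sampler ships, consumer-side triage can be sketched as a weighted ranking over the layered counts. The weights, dict-based span shape, and function name here are illustrative choices, not a shipped API:

```python
def triage_sample(spans, weights, budget):
    """Rank exported spans by weighted signal counts; keep the top `budget`.

    `weights` maps "<layer>.<category>" to a per-category weight, e.g.
    {"interaction.disengagement": 2.0, "execution.failure": 1.0}.
    """
    def score(span):
        attrs = span.get("attributes", {})
        return sum(w * attrs.get(f"signals.{cat}.count", 0)
                   for cat, w in weights.items())
    # Highest-scoring (most signal-dense) trajectories first.
    return sorted(spans, key=score, reverse=True)[:budget]
```

Per-category weights let you bias the sample toward the failure modes you currently care about, mirroring the configurable weights planned for the in-gateway sampler.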
A concerning session, showing both layered attributes and a per-instance event::
    # Span name: "POST /v1/chat/completions gpt-5.2 🚩"

    # Top-level
    signals.quality = "severe"
    signals.quality_score = 0.0
    signals.turn_count = 4
    signals.efficiency_score = 1.0

    # Layered (only non-zero categories are emitted)
    signals.interaction.disengagement.count = 6
    signals.interaction.disengagement.severity = 3

    # Legacy (deprecated, emitted while dual-emit is on)
    signals.frustration.count = 4
    signals.frustration.severity = 2
    signals.escalation.requested = true

    # Per-instance span events
    event: signal.interaction.disengagement.escalation
      signal.type = "interaction.disengagement.escalation"
      signal.message_index = 6
      signal.confidence = 1.0
      signal.snippet = "get me a human"
      signal.metadata = {"pattern_type":"escalation"}
Use signal attributes to build monitoring dashboards in Grafana, Honeycomb, Datadog, etc. Prefer the layered keys — they align with the paper taxonomy and will outlive the legacy keys.
Useful dashboard panels include:

- ``signals.quality``
- ``signals.turn_count``
- ``signals.efficiency_score``
- ``signals.interaction.misalignment.count > 3``
- ``signals.interaction.disengagement.severity >= 2``
- ``signals.interaction.satisfaction.count >= 1``
- ``disengagement.escalation`` or ``disengagement.quit`` event fired (via span-event filter)
- ``signals.execution.failure.count > 0``
- ``signals.environment.exhaustion.count > 0``

Set up alerts based on signal thresholds:
- ``signals.quality = "severe"`` count exceeds threshold in a 1-hour window
- ``signals.interaction.disengagement.severity >= 2`` (>2× baseline)
- ``signals.execution.failure.count > 0`` — agent-caused tool issues
- ``signals.environment.exhaustion.count`` — external system degradation
- ``signals.turn_count`` up > 50%

Start simple:
- Alert on ``severe`` sessions (or on spikes in the ``severe`` rate).
- Review ``poor`` sessions within 24 hours.
- Sample ``excellent`` sessions as exemplars.

Combine multiple signals to infer failure modes:
- ``signals.interaction.stagnation.severity >= 2`` + ``signals.turn_count`` above baseline
- ``signals.interaction.disengagement.severity >= 2`` + any escalation event
- ``signals.interaction.misalignment.count / user_turns > 0.3``
- ``signals.execution.failure.count > 0`` + ``signals.interaction.misalignment.count > 0``
- ``signals.environment.exhaustion.count > 0`` while ``signals.execution.failure.count = 0``
- ``signals.interaction.satisfaction.count >= 1`` + ``signals.efficiency_score > 0.8`` + no disengagement

Signals don't capture:
Mitigation strategies:
.. note:: Behavioral signals complement — but do not replace — domain-specific response quality evaluation. Use signals to prioritize which traces to inspect, then apply domain expertise and outcome checks to diagnose root causes.
.. tip::

   The 🚩 marker in the span name provides instant visual feedback in
   trace UIs, while the structured attributes (``signals.quality``,
   ``signals.interaction.disengagement.severity``, etc.) and per-instance
   span events enable powerful querying and drill-down in your observability
   platform.
- `Signals: Trajectory Sampling and Triage for Agentic Interactions <https://arxiv.org/abs/2604.00356>`_ — the paper this framework implements.
- :doc:`../guides/observability/tracing` — Distributed tracing for agent systems.
- :doc:`../guides/observability/monitoring` — Metrics and dashboards.
- :doc:`../guides/observability/access_logging` — Request / response logging.
- :doc:`../guides/observability/observability` — Complete observability guide.