docs/rfc/0002-ai-gateway-contextual-tools.md
This RFC proposes a layered design for letting the Jaeger UI attach a set of contextual (UI-driven) tools to a chat turn against the AI gateway, have the LLM invoke them through the existing Agent Client Protocol (ACP) sidecar, and receive the results back. It avoids new server processes and avoids polluting the shared Jaeger MCP server with conversation-scoped state by riding a single ACP extension method between the sidecar and the gateway, with a per-turn store inside the gateway as the correlation point.
The Jaeger AI gateway already lets a user chat with an LLM-backed sidecar (Gemini reference implementation) about traces. The sidecar uses Jaeger's MCP server to query backend data: search_traces, get_critical_path, get_service_dependencies, etc. These tools are process-scoped — they make sense to any MCP client that points at Jaeger.
Beyond these, an interactive Jaeger UI wants to expose UI-driven tools to the same LLM:
- show_flamegraph(trace_id): navigate the browser to the flamegraph view.
- highlight_span(span_id): focus a span on the timeline.
- set_filter(...): apply a filter the user is composing.

These tools are fundamentally different from the backend ones:
We need a clean way to plumb these tools into the LLM's tool universe for a single turn, route the LLM's tool calls back to the originating browser, and tear everything down at turn end — without leaking conversation state into the shared MCP server, and without forking ACP or MCP.
In scope:
Out of scope:
Under consideration:
The Jaeger AI gateway is an HTTP handler inside the jaegerquery extension at POST /api/ai/chat. It:
- Reads {prompt} from the request body.
- Connects to the sidecar over WebSocket (agent_url, e.g. ws://localhost:16688).
- Runs Initialize → NewSession → Prompt.
- Streams session/update events back to the HTTP response as text/plain.

The sidecar (Gemini reference) connects to Jaeger's MCP server over HTTP, discovers the built-in tools, registers them with Gemini, runs an agentic loop, and streams text + tool-call notifications back over ACP.
- tools/list, tools/call.

The sidecar ties them together by translating MCP tool definitions into Gemini FunctionDeclarations, dispatching Gemini-emitted function_calls over MCP, and reflecting the agentic loop's output back via ACP session/updates.
ACP defines:
- Fs.ReadTextFile, Fs.WriteTextFile, Terminal. Each maps to a specific protocol method (e.g. fs/read_text_file). The client declares which ones it supports in InitializeRequest.ClientCapabilities, and the agent calls those exact methods.
- NewSessionRequest.McpServers. The agent then dials those servers and discovers tools through MCP.

ACP does not define:
- ClientCapabilities.CustomTools slot.
- Initialize or NewSession.

So if we want the gateway to advertise UI tools to the agent, the only protocol-native mechanisms are:
- NewSessionRequest.McpServers is the supported tool-injection slot), or
- _meta/... JSON-RPC methods between client and agent. The Coder ACP SDK's acp.NewConnection(dispatcher, ...) lets us register a custom dispatcher that handles such methods. This is how acp.SendRequest[T] already supports arbitrary method dispatch.

We need to answer four questions:
- FunctionDeclarations in its chat config. Built-in MCP tools and UI tools both need to land in that list.
- jaegermcp should remain a pure backend-data MCP server usable by any MCP client (Claude Code, Cline, etc.) without seeing UI concerns.

list_contextual_tools MCP tool inside jaegermcp

Initial design (now superseded). The jaegermcp extension exposes a new MCP tool list_contextual_tools(session_id) that returns the UI tools snapshot for the requested ACP session. The gateway pre-populates a ContextualToolsStore keyed by session id; the sidecar calls the MCP tool to discover UI tools.
Rejected because:
- jaegermcp is meant as a backend-data MCP server usable by any MCP client (e.g. Claude Code with no browser at all). Surfacing a UI-only tool there is dead weight and confusing to non-Jaeger-UI consumers.
- tools/list.

Spin up a small MCP server inside the jaegerquery extension that exposes only the contextual UI tools for a given turn. Mint a per-turn URL like /api/ai/mcp/{contextual_id}. The gateway adds this URL to NewSessionRequest.McpServers so the sidecar discovers UI tools the standard way.
Rejected because:
- tools/call would still need a side channel back to the browser; we end up plumbing the result over the open HTTP chat-response stream anyway, so introducing an MCP server in between doesn't reduce moving parts, it adds them.

Fs or Terminal capabilities

Encode UI actions as fake filenames or terminal commands, declare those capabilities in InitializeRequest, and intercept the ACP method calls server-side.
Rejected because:
The gateway defines a custom ACP method _meta/jaegertracing.io/tools/call. Tool definitions travel one-way on NewSessionRequest.Meta (the spec-defined free-form metadata slot). Tool invocations travel as JSON-RPC requests over the existing ACP WebSocket from the sidecar back to the gateway via this extension method.
Selected because:
- jaegermcp; that server stays purely backend-data.
- acp.NewConnection(dispatcher, ...) accepts a custom method dispatcher, so we don't fork the SDK.

Trade-off accepted:
- _meta/jaegertracing.io/tools/call to support contextual tools. We consider this acceptable: contextual tools are a Jaeger-UI feature, and an agent that doesn't implement it simply ignores the meta and never calls the extension method, leaving the system functional for built-in MCP tools only.

flowchart TB
UI["Jaeger UI (Browser)"]
subgraph jaeger["Jaeger Process"]
direction LR
MCP["jaegermcp<br/>(backend data tools)"]
subgraph handler["AI Gateway / ChatHandler"]
direction TB
ACPCONN["acp.Connection<br/>(custom dispatcher)"]
DISPATCH["Dispatcher"]
SC["streamingClient"]
STORE["ContextualToolsStore<br/>(per-turn snapshots)"]
ACPCONN -- inbound JSON-RPC --> DISPATCH
DISPATCH -- session/update --> SC
DISPATCH -- _meta/jaegertracing.io/tools/call --> STORE
end
end
subgraph sidecar["Agent Sidecar"]
AGENT["ACP Agent"]
LOOP["LLM Agentic Loop"]
MCPC["MCP Client"]
AGENT --> LOOP --> MCPC
end
UI -- "POST /api/ai/chat<br/>(AG-UI RunAgentInput)" --> handler
SC -- "SSE (AG-UI events)" --> UI
handler -- "WebSocket (ACP, incl. ext method)" --> AGENT
MCPC -- "HTTP (MCP)" --> MCP
ContextualToolsStore (gateway-side)

Thread-safe per-turn map of frontend-supplied tools, keyed by ACP session id. The chat handler writes the snapshot once NewSessionResponse returns and before Prompt is sent; the dispatcher reads the snapshot using the same sessionId the sidecar puts on the ext_method payload, so the lookup is unambiguous without any extra correlation field.
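The store's contract can be illustrated with a short sketch. The gateway implementation is Go; the Python below is only a model of the semantics described here and in the API list that follows (copy on set, fresh decode on read, empty or all-invalid input deletes). The snake_case method names are stand-ins, and treating "valid" as "parses as a JSON object" is an assumption rather than something this RFC specifies.

```python
import json
import threading


class ContextualToolsStore:
    """Illustrative model of the per-turn store; not the Go implementation."""

    def __init__(self):
        self._lock = threading.Lock()
        self._snapshots = {}  # session_id -> list of raw JSON bytes, one per tool

    def set_for_session(self, session_id, raw_tools):
        if not session_id:
            return  # an empty id is a no-op
        valid = []
        for raw in raw_tools:
            if isinstance(raw, str):
                raw = raw.encode("utf-8")
            try:
                decoded = json.loads(raw)
            except (TypeError, ValueError):
                continue  # skip entries that do not parse
            if isinstance(decoded, dict):  # assumption: a tool is a JSON object
                valid.append(bytes(raw))  # copy so later caller mutations stay out
        with self._lock:
            if valid:
                self._snapshots[session_id] = valid
            else:
                # An empty or all-invalid set removes any prior entry.
                self._snapshots.pop(session_id, None)

    def delete_for_session(self, session_id):
        with self._lock:
            self._snapshots.pop(session_id, None)

    def get_for_session(self, session_id):
        with self._lock:
            raws = list(self._snapshots.get(session_id, ()))
        # Decode outside the lock; each caller gets its own fresh objects.
        return [json.loads(raw) for raw in raws]
```

Storing raw bytes and decoding on every read is what keeps one reader from corrupting the snapshot seen by another, at the cost of a little repeated parsing per turn.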
API:
- SetForSession(sessionID, rawTools): stores a snapshot, copying raw bytes; an empty id is a no-op; an empty or all-invalid set deletes the entry.
- DeleteForSession(sessionID): turn-end cleanup.
- GetContextualToolsForSession(sessionID): returns a fresh decoded copy per call so readers cannot corrupt the snapshot.

Routes inbound JSON-RPC from the sidecar:
- session/update → streamingClient.SessionUpdate
- session/request_permission → always denied (no fs/terminal capability)
- _meta/jaegertracing.io/tools/call → handleJaegerToolCall
- anything else → MethodNotFound

handleJaegerToolCall:
- Strips the ui_ prefix from the inbound tool name (see §6.5).
- Returns the acknowledgement ({result: {acknowledged: true}, isError: false}) immediately.

The browser does not round-trip a tool result back. UI tools are side effects (navigate, render, set filters); the browser executes them locally based on the AG-UI TOOL_CALL_* SSE events that the streaming client emits in parallel with the ext_method dispatch. See §6.6 for the rationale.
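The routing rules above amount to a small method table. The following is a minimal sketch of that idea in Python, not the Go dispatcher itself: make_dispatcher, RpcError and the handler callables are hypothetical names, and the shape of the permission-denied response is a placeholder, since the actual ACP response type is not reproduced here. Only the method strings and the JSON-RPC 2.0 "method not found" code (-32601) are taken as given.

```python
METHOD_NOT_FOUND = -32601  # standard JSON-RPC 2.0 error code

EXT_METHOD_TOOL_CALL = "_meta/jaegertracing.io/tools/call"


class RpcError(Exception):
    """Hypothetical error type carrying a JSON-RPC error code."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


def make_dispatcher(on_session_update, on_tool_call):
    """Return a function that routes inbound (method, params) pairs."""

    def deny_permission(params):
        # No fs/terminal capability is advertised, so every request is denied.
        return {"outcome": "denied"}  # placeholder shape, not the ACP schema

    routes = {
        "session/update": on_session_update,
        "session/request_permission": deny_permission,
        EXT_METHOD_TOOL_CALL: on_tool_call,
    }

    def dispatch(method, params):
        handler = routes.get(method)
        if handler is None:
            raise RpcError(METHOD_NOT_FOUND, f"unknown method: {method}")
        return handler(params)

    return dispatch
```

Keeping the table explicit means an agent that calls an unadvertised method such as fs/read_text_file gets a protocol-level error rather than a silent no-op.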
- NewSession: parses field_meta for the namespaced key jaegertracing.io/contextual-tools, stashes the snapshot per session_id.
- Prompt: builds a Gemini Tool from the snapshot and merges it with the discovered MCP tools before passing them to chats.create().
- function_call: dispatches via MCP for built-in tools; dispatches via the ACP extension method for contextual tools. Streams start_tool_call + update_tool_call session_updates as in the MCP path.
- Prompt end (success or error): pops the snapshot for the session.

sequenceDiagram
participant UI
participant GW as Gateway
participant DSP as Dispatcher
participant SC as Sidecar
participant LLM as LLM (Gemini)
UI->>GW: POST /api/ai/chat<br/>{prompt, tools, thread_id}
GW->>SC: ACP NewSession<br/>Meta: {jaegertracing.io/contextual-tools: {prefixed_tools}}
SC->>SC: stash tools per session_id
SC-->>GW: NewSessionResponse {sessionId}
GW->>GW: ContextualToolsStore.SetForSession(sessionId, prefixed_tools)
GW->>SC: ACP Prompt
SC->>LLM: send_message (mcp + contextual tools merged)
LLM-->>SC: function_call
alt Built-in MCP tool
SC->>SC: call MCP server directly
else Contextual tool (fire-and-forget)
SC->>UI: SessionUpdate (start_tool_call) → AG-UI TOOL_CALL_START/ARGS
SC->>DSP: ACP _meta/jaegertracing.io/tools/call<br/>{sessionId, name, args}
DSP->>DSP: strip ui_ prefix, log dispatch
DSP-->>SC: {acknowledged: true}
SC->>UI: SessionUpdate (update_tool_call status=completed) → AG-UI TOOL_CALL_END
Note over UI: browser executes side effect locally
end
SC-->>LLM: function_response
LLM-->>SC: final text
SC-->>GW: ACP session/update (text)
SC-->>GW: ACP PromptResponse (StopReason)
SC->>SC: drop contextual snapshot
GW->>GW: ContextualToolsStore.DeleteForSession(sessionId)
GW-->>UI: close HTTP response
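The sidecar-side steps in this flow (stash on NewSession, merge on Prompt, route each function_call, pop at Prompt end) can be sketched as below. This is an illustrative model only: ContextualSnapshots and its method names are hypothetical, tools are plain dicts rather than Gemini types, and the actual MCP and ACP calls are reduced to a string label naming the transport that would be used.

```python
CONTEXTUAL_TOOLS_META_KEY = "jaegertracing.io/contextual-tools"


class ContextualSnapshots:
    """Per-session contextual tools as a sidecar might track them (illustrative)."""

    def __init__(self):
        self._by_session = {}

    def stash(self, session_id, field_meta):
        # NewSession: pull the namespaced key out of the free-form meta dict.
        entry = (field_meta or {}).get(CONTEXTUAL_TOOLS_META_KEY) or {}
        tools = [t for t in entry.get("tools", []) if t.get("name")]
        if tools:
            self._by_session[session_id] = tools

    def merged_tools(self, session_id, mcp_tools):
        # Prompt: built-in MCP tools first, then this turn's contextual tools.
        return list(mcp_tools) + list(self._by_session.get(session_id, []))

    def route(self, session_id, tool_name):
        # function_call: contextual names go over the ACP extension method,
        # everything else goes to the MCP server.
        names = {t["name"] for t in self._by_session.get(session_id, [])}
        return "acp_ext_method" if tool_name in names else "mcp"

    def pop(self, session_id):
        # Prompt end (success or error): drop the snapshot.
        return self._by_session.pop(session_id, None)
```

Routing by membership in the session's snapshot, rather than by name prefix alone, is a design choice of this sketch; it keeps a stray ui_-prefixed name from a previous turn from being sent over the extension method.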
The AG-UI protocol is per-run, not per-session: the frontend re-sends {thread_id, messages, tools} on every chat call. thread_id makes it feel like one chat; tools and messages are evaluated fresh every turn.
So the natural shape of "tools change when the user navigates" is already the protocol's shape. Turn 1 carries the Search-page tools; if a UI tool call navigates to Flamegraph, the next user message turn carries the Flamegraph-page tools. No mid-turn tool swap is needed.
- ui_: the gateway prepends UIToolPrefix to every contextual tool name before exposing it on the meta payload. This guarantees that a frontend-supplied tool can never collide with a built-in Jaeger MCP tool of the same name (e.g. search_traces). The dispatcher strips the prefix on the inbound call so the AG-UI client receives the original frontend name.
- jaegertracing.io/contextual-tools: namespaced under our domain to avoid collision with other meta consumers.
- _meta/jaegertracing.io/tools/call: same namespacing.

UI tools are commands, not queries. None of show_flamegraph(trace_id), highlight_span(span_id), or set_filter(...) has a meaningful return value to feed back into the LLM. The model's job is to decide when to invoke them; the browser's job is to perform the side effect once it sees the call.
This rules out the more obvious synchronous-round-trip design (where the gateway blocks on the ext_method waiting for the browser to POST a tool result) for several reasons:
- {ok: true}, which is what the gateway can emit on its own.
- POST /api/ai/tool-result endpoint plus per-call rendezvous state inside the gateway, with timeout handling and orphan cleanup. Fire-and-forget needs none of that.
- Prompt call and produces a final answer in one SSE stream: the user sees one coherent response rather than a turn ending mid-stream while waiting for browser execution.
- TOOL_CALL_START/ARGS/END SSE events and acts on them; whether or not a matching result message goes back is up to the agent. We choose not to.

- ContextualToolsStore.SetForSession skips entries that do not parse; an all-invalid set deletes any prior entry rather than persisting an empty slice.
- Missing sessionId/name: dispatcher rejects with InvalidParams; the sidecar surfaces this as a tool error to the LLM.
- Name is exactly ui_: same InvalidParams rejection; this guards against a malformed prefix that would otherwise produce an empty tool name.
- Name without the ui_ prefix: dispatcher logs a warning and passes the name through unchanged so older sidecars keep working during a phased prefix rollout.
- defer DeleteForSession cleans up.

The work landed in two PRs against jaegertracing/jaeger:
- ContextualToolsStore keyed by ACP session id, with defensive JSON validation and clone-on-set semantics.
- acp.NewConnection(newDispatcher(...))) that routes the standard session/update and session/request_permission methods plus the new _meta/jaegertracing.io/tools/call extension method.
- ui_ prefix, and returns a placeholder result. Real wiring (Meta population, store writes, SSE emission) follows in PR2.
- field_meta, registers contextual tools with Gemini, and dispatches _meta/jaegertracing.io/tools/call back via conn.ext_method when the LLM picks one.
- RunAgentInput (messages, tools, context, threadId, runId) and emits AG-UI SSE events (RUN_STARTED, TEXT_MESSAGE_START/CONTENT/END, TOOL_CALL_START/ARGS/RESULT/END, RUN_FINISHED, RUN_ERROR).
- translation.go with helpers for extracting prompt text, context entries, and encoding tools.
- streamingClient rewritten to translate ACP session/update events to AG-UI SSE frames; lifecycle (startRun / finishRun / failRun) is driven by the chat handler.
- NewSessionRequest.Meta with the prefixed contextual-tools snapshot, calls SetForSession after NewSessionResponse, and defers DeleteForSession.
- {result: {acknowledged: true}}).

- jaeger-query HTTP server applies. A future PR may want to gate AI features behind a separate flag.
- Beyond the ui_ prefix and non-empty name, the gateway does not validate user-supplied tool definitions. A future PR may want to bound tool count, parameter-schema size, or reject reserved characters.
- jaeger-ui repo and is tracked separately.

NewSessionRequest.Meta

{
"_meta": {
"jaegertracing.io/contextual-tools": {
"tools": [
{
"name": "ui_show_flamegraph",
"description": "Open the flamegraph view for a given trace_id.",
"parameters": {
"type": "object",
"properties": {
"trace_id": { "type": "string" }
},
"required": ["trace_id"]
}
}
]
}
}
}
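A payload of this shape could be assembled on the gateway side roughly as follows. This is a sketch in Python for illustration (the gateway is Go), and build_session_meta is a hypothetical helper name. It returns the value that would sit under "_meta", applies the ui_ prefix to each frontend-supplied name, and skips tools with no name; returning an empty dict when nothing survives is an assumption of the sketch.

```python
import copy

CONTEXTUAL_TOOLS_META_KEY = "jaegertracing.io/contextual-tools"
UI_TOOL_PREFIX = "ui_"


def build_session_meta(frontend_tools):
    """Return the meta dict carrying prefixed copies of the frontend tools."""
    prefixed = []
    for tool in frontend_tools:
        name = tool.get("name")
        if not name:
            continue  # a tool without a name cannot be exposed to the LLM
        clone = copy.deepcopy(tool)  # leave the caller's definitions untouched
        clone["name"] = UI_TOOL_PREFIX + name
        prefixed.append(clone)
    if not prefixed:
        return {}  # assumption: send no contextual-tools key at all
    return {CONTEXTUAL_TOOLS_META_KEY: {"tools": prefixed}}
```

For example, a frontend tool named show_flamegraph would appear to the agent as ui_show_flamegraph, matching the sample payload above.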
Method: _meta/jaegertracing.io/tools/call
Request payload:
{
"sessionId": "sess-42",
"name": "ui_show_flamegraph",
"args": { "trace_id": "abc123" }
}
Response payload:
{
"result": { "acknowledged": true },
"isError": false
}
The gateway always returns this fire-and-forget acknowledgement once the
payload validates and the prefix strip succeeds. Errors during validation
yield JSON-RPC InvalidParams; everything else short-circuits to the ack.
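The validate, strip, acknowledge sequence can be sketched as a single function. As with the other examples this is Python for illustration rather than the Go handler; handle_tool_call, the InvalidParams exception class and the log callable are hypothetical names. The checks mirror the rules stated in this RFC: a missing sessionId or name is rejected, a bare ui_ is rejected, and a name without the prefix is logged and passed through. The numeric code -32602 is the standard JSON-RPC 2.0 "invalid params" value.

```python
INVALID_PARAMS = -32602  # standard JSON-RPC 2.0 error code
UI_TOOL_PREFIX = "ui_"


class InvalidParams(Exception):
    """Hypothetical exception standing in for a JSON-RPC InvalidParams error."""

    code = INVALID_PARAMS


def handle_tool_call(payload, log=print):
    """Validate an extension-method payload; return (frontend_name, ack)."""
    session_id = payload.get("sessionId")
    name = payload.get("name")
    if not session_id or not name:
        raise InvalidParams("sessionId and name are required")
    if name == UI_TOOL_PREFIX:
        # Stripping the prefix would leave an empty tool name.
        raise InvalidParams("tool name is empty after stripping the prefix")
    if name.startswith(UI_TOOL_PREFIX):
        frontend_name = name[len(UI_TOOL_PREFIX):]
    else:
        # Tolerated so that sidecars which do not send the prefix keep working.
        log(f"warning: tool {name!r} has no {UI_TOOL_PREFIX!r} prefix")
        frontend_name = name
    ack = {"result": {"acknowledged": True}, "isError": False}
    return frontend_name, ack
```

With the request payload shown above, this returns the frontend name show_flamegraph together with the fire-and-forget acknowledgement.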
| Constant | Value | Owner |
|---|---|---|
| CONTEXTUAL_TOOLS_META_KEY | jaegertracing.io/contextual-tools | Gateway + Sidecar |
| ExtMethodJaegerToolCall | _meta/jaegertracing.io/tools/call | Gateway (Go) |
| EXT_METHOD_JAEGER_TOOL_CALL | meta/jaegertracing.io/tools/call | Sidecar (Python; the runtime adds the leading _) |
| UIToolPrefix | ui_ | Gateway |