showcase/integrations/built-in-agent/PARITY_NOTES.md
This file documents the deliberate adaptations, divergences, and outstanding
gaps between the built-in-agent (BIA) showcase integration and the
LangGraph-Python (LGP) reference integration. Auditors, harness authors, and
D6 probes should consult this before flagging "missing" parity items.
Every demo in this integration targets the agent-id literal "default".
Per-demo specialization (system prompt, tool surface, factory hooks) happens
at the API-route + factory layer; see src/lib/factory/ for the per-route
factory wiring.
Harness selectors, e2e specs, and D6 probes that key off agent-id MUST accept
"default" for this integration. Do not assume the agent-id matches the demo
slug; that is the LGP convention, not BIA's.
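A harness that keys off agent-id could encode the rule above as a small lookup. The sketch below is illustrative only: the `Integration` keys and the `expectedAgentId` helper are hypothetical names, not part of the existing harness.

```typescript
// Hypothetical harness helper; the integration keys and function name are
// assumptions made for illustration.
type Integration = "built-in-agent" | "langgraph-python";

function expectedAgentId(integration: Integration, demoSlug: string): string {
  // BIA routes every demo through the single "default" agent, while the
  // LGP convention names the agent after the demo slug.
  return integration === "built-in-agent" ? "default" : demoSlug;
}
```

With this shape, a selector for the BIA gen-ui-agent demo would look for "default", while the same demo on LGP would look for "gen-ui-agent".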
The following demos render the same UX as the LGP equivalents but use a
different primitive under the hood. They are deliberate adaptations because
BIA has no interrupt() primitive:
- gen-ui-interrupt — uses useFrontendTool with an async handler instead
  of useInterrupt + CUSTOM_EVENT. The Promise returned by the async
  handler resolves when the user picks a slot (or cancels).
- interrupt-headless — same Strategy-B handler model, no chat UI.

These are full-capability demos and SHOULD NOT be added to
not_supported_features. They are currently quarantined for a separate
upstream reason (see Reasoning-trio below).
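The async-handler model can be pictured as a deferred promise that the UI settles. The sketch below shows only that shape and deliberately does not call the real useFrontendTool API; `createSlotPrompt`, `SlotChoice`, and the `pick` / `cancel` callbacks are hypothetical names.

```typescript
// Illustrative only: models the async-handler shape, not the real
// useFrontendTool API. All names here are assumptions.
type SlotChoice = { status: "picked"; slot: string } | { status: "cancelled" };

function createSlotPrompt() {
  let settle: (choice: SlotChoice) => void = () => {};
  // The executor runs synchronously, so `settle` is assigned before return.
  const pending = new Promise<SlotChoice>((resolve) => {
    settle = resolve;
  });
  return {
    // The tool handler returns this promise; the tool call stays open
    // until the user acts.
    handler: (): Promise<SlotChoice> => pending,
    // Called from the UI when the user clicks a slot or cancels.
    pick: (slot: string) => settle({ status: "picked", slot }),
    cancel: () => settle({ status: "cancelled" }),
  };
}
```

Because a promise settles only once, a late `cancel()` after `pick()` has no effect, which matches the "picks a slot (or cancels)" behavior described above.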
shared-state-read-write — UI divergence
BIA's shared-state-read-write demo uses a notes-card UI; LGP uses a
recipe-card UI. This is a UX choice, not a capability gap. The
underlying useCoAgent read/write contract is identical.
Drop the notes-card vs. recipe-card divergence from any "missing testids" expectation set when comparing BIA to LGP — harnesses should match on capability, not on the specific component rendered.
tool-rendering companion components — deferred
The per-tool renderers (weather-card, flight-card, stock-card,
d20-card, custom-catchall) used by tool-rendering in LGP have not yet
been ported to BIA. This is tracked as a follow-up PR — the demo wiring is
present but currently renders against the default catch-all only.
tool-rendering-default-catchall — built-in kit responsibility
The shadcn-catchall-* testid expectation lives in
@copilotkit/react-ui (the built-in default renderer ships from the kit,
not from the integration). PM escalation is pending to confirm whether the
testid should ship from the kit; until then, BIA cannot satisfy the
expectation by patching its own source.
The following three demos are listed in manifest.yaml under
not_supported_features pending a @copilotkit/react-core package release
that fixes a useInterrupt/useHeadlessInterrupt RESUME-PATH bug (the
backend resumes fine but the frontend never appends the confirmation
bubble):
- reasoning-default-render
- agentic-chat-reasoning
- tool-rendering-reasoning-chain

Backend reasoning-event emission is also TBD on the built-in agent factory.
Once the upstream react-core fix lands AND the factory emits
REASONING_MESSAGE_* events, the quarantine should be lifted in the same
PR that bumps @copilotkit/react-core.
Two demos render a graceful "not supported" banner with
data-testid="not-supported-banner" so the harness can detect them
deterministically instead of timing out on missing UI (mount wired by
PR #5413, commit 3585c33b8):
- gen-ui-interrupt (NSF-quarantined: BIA has no interrupt() primitive;
  see Strategy-B adaptations above for the async-handler model used by the
  non-quarantined HITL demos)
- shared-state-streaming (BIA has no per-token state-delta streaming)

The dashboard-labeled "In-chat" and "In-app" HITL demos (hitl-in-chat,
hitl-in-app) are GREEN on staging — they are NOT NSF. They use
useFrontendTool with async handlers per the Strategy-B adaptation
documented above.
D6 probes should treat a not-supported-banner hit as PASS-SKIPPED, not
FAIL.
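The banner rule can be expressed as a small classification step in a probe. This is a minimal sketch under stated assumptions: the `ProbeObservation` shape and `classifyProbe` function are hypothetical, and only the not-supported-banner testid and the PASS-SKIPPED / FAIL outcomes come from this document.

```typescript
// Hypothetical probe-side classifier; the observation shape is an assumption.
type ProbeOutcome = "PASS" | "PASS-SKIPPED" | "FAIL";

interface ProbeObservation {
  testidsSeen: string[]; // data-testid values found in the DOM
  expectedTestids: string[]; // testids the demo renders when supported
}

const NOT_SUPPORTED_BANNER = "not-supported-banner";

function classifyProbe(obs: ProbeObservation): ProbeOutcome {
  // The banner check runs first so a quarantined demo is never scored as
  // FAIL for the testids it intentionally does not render.
  if (obs.testidsSeen.includes(NOT_SUPPORTED_BANNER)) return "PASS-SKIPPED";
  const seen = new Set(obs.testidsSeen);
  return obs.expectedTestids.every((id) => seen.has(id)) ? "PASS" : "FAIL";
}
```

Checking for the banner before the expected testids lets the probe return immediately instead of waiting out a timeout on UI that will never mount.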
multimodal — copilot-add-menu-button
The copilot-add-menu-button testid is rendered by
@copilotkit/react-core/v2's CopilotChatInput (see
packages/react-core/src/v2/components/chat/CopilotChatInput.tsx). It is
present in the published kit; no BIA-side cell change is required. The
multimodal demo styles the menu button via a wrapper CSS selector — see
LGP's multimodal-chat.tsx for the pattern.
not_supported_features in manifest.yaml in the same commit.
PR #5425 added the necessary integration-layer plumbing for these demos
(source-level testids, aimock fixtures, factory backend wiring), but D6
runs revealed that the remaining failures live DOWNSTREAM of the
integration layer — in the A2UI renderer host and the AG-UI →
useAgent/useCoAgent state-subscription path. Those fixes belong to
upstream packages (@copilotkit/react-core, A2UI renderer host) and are
tracked as a follow-up PR. This PR's diff is correct at the integration
layer.
a2ui-fixed-schema — RED (testid never mounts)
- a2ui-fixed-card testid never appears in DOM.
- display_flight tool fires (✓).

declarative-gen-ui — RED (testids never mount)
- declarative-card and declarative-metric testids never appear in DOM.
- generate_a2ui correctly (✓).
- a2ui-fixed-schema — the host does not mount the projected components
  despite a valid generation stream.
- a2ui-fixed-schema renderer-host fix.

gen-ui-agent — RED (state never reaches frontend)
- StepsPanel stays in its placeholder "No plan yet" state for the full run.
- set_steps tool emits STATE_DELTA correctly (verified in
  tanstack-factory.ts), factory wiring is sound (✓).
- useAgent / useCoAgent subscriber receives no state update — there is a
  wire-up gap between AG-UI STATE_DELTA emission and the React hook's
  consumer. The placeholder never flips to the rendered plan.
- packages/react-core (AG-UI middleware / useAgent / useCoAgent
  state-subscription path), not the BIA integration.
- packages/react-core.

BIA registers get_weather / get_stock_price / get_revenue_chart / highlight_note as server-executed tools via TanStack's chat() engine. After the LLM returns a tool call, TanStack runs the server tool and reprompts the LLM with the result; the original user pill text remains in conversation history, so userMessage-keyed toolcall fixtures would naively fire on every reprompt and the loop never converges. BIA's /v1/responses endpoint also rewrites assistant tool_call_ids to runtime-generated fc-… values, breaking the toolCallId-keyed narration fallback that works on non-rewriting backends.
Resolution (#5427 follow-up): d6/built-in-agent/gen-ui-headless-complete.json now structures each pill as a (sequenceIndex:0 emitter, narration fallback) pair. The emitter matches the FIRST request for the pill prompt (counter starts at 0) and emits the tool call; subsequent BIA reprompt iterations fall through the now-exhausted emitter to the narration fallback (no tool call), so the loop converges. sequenceIndex is chosen over hasToolResult:false because hasToolResult is computed across the entire thread — any earlier pill's tool result would permanently disable a hasToolResult:false emitter, breaking multi-turn sessions.
This pattern is BIA-specific because LGP runs these tools INSIDE the Python agent and emits them as AG-UI events directly — no TanStack reprompt cycle — so LGP's gen-ui-headless-complete.json retains the simpler userMessage-only emitter pattern.
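The convergence argument for the (sequenceIndex:0 emitter, narration fallback) pair can be checked with a toy matcher. This is a sketch of the assumed matching semantics, not the fixture engine's actual implementation: the `Fixture` shape, the per-prompt counter, and the pill text are all hypothetical; only `sequenceIndex`, `userMessage`, and `get_weather` come from the notes above.

```typescript
// Toy model of the fixture pairing; shapes and counter semantics are
// assumptions, not the real fixture engine.
interface Fixture {
  userMessage: string;
  sequenceIndex?: number; // when set, fires only on that occurrence
  response: { toolCall: string } | { text: string };
}

function createMatcher(fixtures: Fixture[]) {
  const seen = new Map<string, number>(); // requests seen per prompt
  return (userMessage: string): Fixture["response"] | undefined => {
    const n = seen.get(userMessage) ?? 0;
    seen.set(userMessage, n + 1);
    // First fixture whose prompt matches and whose sequenceIndex, if any,
    // equals the current occurrence count.
    const hit = fixtures.find(
      (f) =>
        f.userMessage === userMessage &&
        (f.sequenceIndex === undefined || f.sequenceIndex === n),
    );
    return hit?.response;
  };
}

const match = createMatcher([
  { userMessage: "What's the weather?", sequenceIndex: 0, response: { toolCall: "get_weather" } },
  { userMessage: "What's the weather?", response: { text: "Here is the weather." } },
]);

const first = match("What's the weather?"); // initial request: tool call
const second = match("What's the weather?"); // reprompt: narration only
```

The first request hits the emitter and every later reprompt for the same pill falls through to the narration entry, so no further tool call is produced and the loop terminates.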
PARITY_NOTES inaccuracies surfaced by staging verify after PR #5413 merge — fixed 2026-06-12.