showcase/GOTCHAS.md
What we learned from getting all 18 integrations to D5 green. Many of these are things that were "green" but still wrong — passing probes while the underlying wiring was fragile, framework-specific, or relying on coincidence. This document exists so we don't re-learn these when rebuilding.
V1 vs V2 CopilotKit imports cause silent failures. V1 is @copilotkit/react-core, V2 is @copilotkit/react-core/v2. Mixing V1 provider with V2 hooks (e.g., useRenderTool) silently fails — the tool rendering pipeline never wires up. Agent discovery also breaks: V2 runtime needs V2 provider. Found on: ms-agent-dotnet (auth), built-in-agent (interrupts), spring-ai (tool-rendering).
Custom assistantMessage slot renderers must carry data-testid="copilot-assistant-message". The byoc-hashbrown demo overrides the slot with a HashBrown renderer. Without the testid, the probe (and any external consumer) sees 0 assistant messages. Every integration's hashbrown-renderer.tsx needed this fix independently. This is the #1 argument for a shared frontend.
copilotRuntimeNextJSAppRouterEndpoint() must be hoisted to module scope. Calling it inside the POST handler (per-request) causes handleServiceAdapter to repeatedly re-wrap runtime.agents in Promise layers. Under concurrent requests (/info + agent/run), this creates a race condition where the agents list is stale. Found on: google-adk (all 6 dedicated routes).
Agent names must match exactly between frontend and backend. useAgent("agentic-chat-reasoning") must match the backend registration. Dashes vs underscores, trailing hyphens, typos — all cause silent "Agent not found" errors that crash the page via React error boundary.
onRunInitialized multimodal shim is framework-dependent. langgraph-python NEEDS it (the @ag-ui/langgraph converter only understands legacy binary parts). langroid does NOT need it (speaks AG-UI directly — adding the shim causes double-encoding). Per-framework boolean, not a universal.
Content parts from AG-UI arrive as Pydantic model instances, not dicts. isinstance(part, dict) silently drops them. Must check hasattr(part, "model_dump") and call model_dump(by_alias=True). Affected: langroid, ms-agent-python, pydantic-ai.
from __future__ import annotations breaks Pydantic tool schemas. PEP 563 makes all annotations strings. When LlamaIndex AGUIChatWorkflow passes backend_tools to Pydantic for schema generation, Annotated[str, "..."] is a raw string instead of a resolved type. Affected: llamaindex, crewai-crews, pydantic-ai, ag2. Fix: remove the import from files defining tools.
a2ui_dynamic graph owns generate_a2ui tool — runtime MUST NOT auto-inject (injectA2UITool: false). Double-injection confuses the LLM.server.mjs must register ALL graphs from langgraph.json. We found it registering 5 of 25 — every unregistered graph returned 404./ok (langgraph-cli convention), not /health.langchain>=1.2.0 imports from langgraph.runtime.ExecutionInfo which doesn't exist in langgraph==1.0.5.deploymentUrl matters for dedicated API routes. Missing it causes 404.reasoning=True does multi-call chain-of-thought which breaks aimock (only first call matches). Disable for aimock-backed tests.STEP_STARTED/STEP_FINISHED for reasoning — CopilotKit ignores these. The reasoningMessage slot requires REASONING_MESSAGE_* events. We built a custom handler, then reverted to stock AGUI.StreamingToolAgent.streamFirstTurn() must include toolCallbacks with internalToolExecutionEnabled=false. Without this, aimock can't match toolName: "get_weather" — falls through to text-only fixture, weather card never renders.mvn install in Dockerfile.{ weatherTool } expands to function name "weatherTool", not "get_weather". Must use explicit keys: { get_weather: weatherTool }.byocHashbrownAgent needs its own dedicated agent with the hashbrown system prompt. The weatherAgent produces plain text → useJsonParser returns null → empty dashboard → timeout.AgentFrameworkAgent.run() expects input_data: dict. The _MultimodalAgent override used *args/**kwargs → TypeError at runtime.yield events (async generator), not return (coroutine).store=True breaks aimock fixture matching. Set store=False.CopilotKit import → agent discovery failed → "Agent not found".BuiltInAgent runs in-process in Next.js.type: "tanstack" with convertTanStackStream has a runFinished flag that blocks ALL events after first RUN_FINISHED. For byoc, must use type: "custom".response_format: { type: "json_object" } through TanStack adapter. The call silently fails — aimock never receives a request.from __future__ import annotations breaks InterruptScheduling import stubs in tests._classify_binary_part() has the isinstance(part, dict) bug. Pydantic models from AG-UI need model_dump().starlette>=1.0.0 removes on_startup. Pin starlette<1.0.0.from __future__ import annotations breaks Pydantic tool schema validation specifically when backend_tools are present but the response is text-only._normalize_part() must handle Pydantic model instances via model_dump(by_alias=True).onRunInitialized multimodal shim.AGUIStream requires plain string content. Multipart arrays cause 400 errors. ContentFlattenerShim handles conversion.max_consecutive_auto_reply or hasToolResult in fixtures.copilotRuntimeNextJSAppRouterEndpoint() per-request → race condition. Fixed by hoisting to module scope.Check fixtures FIRST. When an agent misbehaves through aimock, the fixture determines behavior — the real LLM is never consulted.
sequenceIndex counters are global. They persist across all test runs within the same aimock process. Use hasToolResult (stateless) instead for fixtures shared across integrations.
Tool-rendering fixtures need toolName in match criteria. If the request doesn't include tool definitions, the fixture falls through to text-only. Spring-ai omitted tools; mastra's shorthand keys produced wrong function names.
PDF turn is fragile. Two-turn multimodal probe: if turn 2's message doesn't match the PDF fixture, the image fixture matches instead. The PDF fixture must be the most specific match.
copilotRuntimeNextJSAppRouterEndpoint + ExperimentalEmptyAdapter instead of V2's createCopilotRuntimeHandler + InMemoryAgentRunner. Only built-in-agent fully uses V2. The V1 API has the per-request race condition (hoisting to module scope was a band-aid, not a migration)."default", not because the correct agent was wired.onRunInitialized shim applied where unnecessary — worked by coincidence because legacy format round-tripped correctly.