# Generative UI Overview

Generative UI is any interface where an AI agent helps decide what appears on the screen — from picking which pre-built component to render, all the way up to streaming raw HTML. As agents become more capable, the UI of an agentic application increasingly becomes a dynamic output of the system itself: able to adapt, reorganize, and respond to user intent and application context. Generative UI can be built in very different ways, each with its own tradeoffs.

UI isn't just pixels — it's contracts: accessibility, performance, usability, analytics, and safety. The right type of generative UI is the one that lets the agent compose the experience your users need without compromising those contracts.

This page covers:

- Application Surfaces — where Generative UI shows up within an agentic application.
- The Three Types — Controlled, Declarative, and Open-Ended, with the freedom-vs-control tradeoff each one makes.
- Ecosystem Mapping — how the types map to the protocols and specs out there today.
- AG-UI and CopilotKit — how the protocol and the framework support all three types.

## Application Surfaces

Generative UI surfaces in different parts of an application depending on how users interact with the agent and how much the application mediates that interaction. Three common shapes:

### 1. Chat (threaded interaction)

<TwoColumnSection imagePosition="right" imageSrc="/images/generative-ui/chat-surface.png" imageAlt="Chat Surface - Threaded Interaction" imageWidth={300} imageHeight={200}

A Slack-like conversational interface where the app brokers each turn. Generative UI appears inline as cards, blocks, or tool responses.

Key traits. Turn-based, message-driven flow. The app mediates all agent communication. Great for support, Q&A, debugging, and guided workflows.

Examples. Slack bots, Discord bots, Intercom AI Agent, Zendesk AI, GitHub Copilot Chat, Notion AI Chat.

</TwoColumnSection>

### 2. Chat+ (co-creator workspace)

<TwoColumnSection imagePosition="left" imageSrc="/images/generative-ui/chat-plus-surface.png" imageAlt="Chat+ Surface - Co-Creator Workspace" imageWidth={600} imageHeight={400}

A side-by-side or multi-pane layout: chat in one pane, a dynamic canvas in another. The canvas becomes a shared working space where agent-generated UI appears and evolves.

Key traits. Chat remains present but secondary. The canvas displays structured outputs and previews. Generative UI can appear in either pane. Ideal for creation, planning, editing, and multi-step tasks.

Examples. Figma AI, Notion AI workspace, Google Workspace Duet side-panel, Replit Ghostwriter paired editor.

</TwoColumnSection>

### 3. Chatless (generative UI integrated into the application UI)

<TwoColumnSection imagePosition="right" imageSrc="/images/generative-ui/chatless-surface.png" imageAlt="Chatless Surface - Integrated Application UI" imageWidth={300} imageHeight={200}

The agent doesn't talk directly to the user. Instead, it communicates with the application through APIs, and the app renders generative UI from the agent as part of its native interface.

Key traits. No chat surface at all. The app decides when and where generative UI appears. Feels like a built-in product feature, not a conversation. Ideal for dashboards, suggestions, and autonomous task helpers.

Examples. Microsoft 365 Copilot (inline editing), Linear Insights, Superhuman AI triage, HubSpot AI Assist, Datadog Notebooks AI panels.

</TwoColumnSection>

## The Three Types

Three distinct patterns sit on a spectrum from maximum control to maximum freedom, and each makes a different tradeoff between safety, predictability, and expressive range.

| Type | Agent's role | Visual freedom | Safety / predictability |
| --- | --- | --- | --- |
| Controlled (also called Static) | Picks which pre-built component to render and supplies its parameters | Lowest | Highest |
| Declarative | Assembles a UI from a curated registry by emitting a structured schema | Middle | High |
| Open-Ended (also called Fully Generated) | Streams arbitrary HTML/CSS or full app markup | Highest | Lowest |

### 1. Controlled

The agent does not generate UI. It calls a tool with structured arguments; your frontend renders a hand-crafted component bound to that tool. The component, its styling, and its accessibility story are all author-controlled — the agent's only job is to decide which tool to call and what data to pass.

Concrete example. Your weather assistant has a `displayWeather` tool whose arguments are `{ city, temperature, conditions }`. The frontend renders a polished `<WeatherCard>` from those args every time, with consistent layout, brand styling, and ARIA labels.

When to pick it. Mission-critical workflows where reliability and accessibility validation matter most — anything user-facing in a regulated environment, any view that must be analytics-instrumented or screen-reader-tested. Trade off some flexibility for a UI you can guarantee.

Tradeoffs. The more use cases, the more components you must build and maintain — the frontend codebase grows roughly in proportion to the number of agent capabilities.

In CopilotKit, this is the tool-rendering pattern (and its state-rendering and reasoning variants for streamed agent state and reasoning blocks).
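
As a rough sketch, the weather example can look like this with CopilotKit's `useCopilotAction` hook; the `WeatherCard` component and its markup are illustrative, and exact APIs vary across versions:

```tsx
import React from "react";
import { useCopilotAction } from "@copilotkit/react-core";

// Author-controlled component: layout, styling, and ARIA are fixed at
// build time, regardless of what the agent does. Illustrative only.
function WeatherCard({ city, temperature, conditions }: {
  city: string; temperature: number; conditions: string;
}) {
  return (
    <section aria-label={`Weather for ${city}`}>
      <h3>{city}</h3>
      <p>{temperature}°C, {conditions}</p>
    </section>
  );
}

function WeatherAssistant() {
  // The agent's only decisions: whether to call displayWeather, and
  // with what arguments. Everything visual is already written.
  useCopilotAction({
    name: "displayWeather",
    description: "Show the current weather for a city",
    parameters: [
      { name: "city", type: "string" },
      { name: "temperature", type: "number" },
      { name: "conditions", type: "string" },
    ],
    render: ({ args }) => (
      <WeatherCard
        city={args.city ?? ""}
        temperature={args.temperature ?? 0}
        conditions={args.conditions ?? ""}
      />
    ),
  });
  return null;
}
```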

### 2. Declarative

The agent emits a structured specification — not arbitrary HTML, not just tool args, but a schema that names which components from a registry to compose, with what props, in what order. The frontend renders the specification through a registered component catalog you control.

Concrete example. Your dashboard agent emits:

```json
{
  "components": [
    { "type": "WeatherCard", "props": { "city": "Tokyo" } },
    { "type": "LineChart",   "props": { "metric": "rainfall_mm", "period": "7d" } },
    { "type": "TableCard",   "props": { "rows": [...] } }
  ]
}
```

Your frontend has registered `WeatherCard`, `LineChart`, and `TableCard`; it composes them exactly as the agent specified. Component types outside that registered vocabulary aren't renderable, by design.
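
A minimal sketch of the renderer side, assuming a hand-rolled registry; production specs such as A2UI define richer schemas, and `RenderSpec` plus the stub components here are illustrative:

```tsx
import React from "react";

// Illustrative stand-ins for your real, pre-approved components.
const WeatherCard = ({ city }: { city: string }) => <div>Weather: {city}</div>;
const LineChart = ({ metric, period }: { metric: string; period: string }) => (
  <div>{metric} over {period}</div>
);
const TableCard = ({ rows }: { rows: unknown[] }) => <div>{rows.length} rows</div>;

// The registry is the guardrail: only names listed here can render.
const registry: Record<string, React.ComponentType<any>> = {
  WeatherCard,
  LineChart,
  TableCard,
};

type Spec = { components: Array<{ type: string; props: Record<string, unknown> }> };

function RenderSpec({ spec }: { spec: Spec }) {
  return (
    <>
      {spec.components.map((entry, i) => {
        const Component = registry[entry.type];
        // Unknown component types are dropped, not rendered: the agent
        // can only compose what the catalog pre-approves.
        return Component ? <Component key={i} {...entry.props} /> : null;
      })}
    </>
  );
}
```

The JSON spec above feeds straight into `RenderSpec`; swapping the registry restyles the whole experience without touching agent logic.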

When to pick it. The sweet spot for agentic apps — dashboards, chat-driven assistants, anything where the agent should be free to choose the layout but every component has been pre-approved by your team. You get creativity within a guardrail. The same spec can also render across multiple frameworks (React, mobile, desktop), giving you a clean separation between application logic and presentation.

Tradeoffs. Custom UI patterns may not be expressible in your schema. Visual differences can creep in if specs are interpreted differently across renderers.

CopilotKit supports declarative gen-UI through A2UI (dynamic schema, fixed schema) and the broader open-spec ecosystem (Open JSON UI, and others).

### 3. Open-Ended

The agent streams arbitrary markup — HTML, CSS, sometimes full mini-applications. The frontend treats it as untrusted output, sandboxes it (typically in an iframe), and renders it. There's no registry and no fixed schema; the agent's expressive range is bounded only by what HTML can describe.
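
A minimal sketch of that sandboxing step, assuming the streamed markup arrives as a single string; the `UntrustedHtmlFrame` component name is hypothetical:

```tsx
import React from "react";

// Embed agent-streamed HTML in a sandboxed iframe. Without the
// allow-scripts and allow-same-origin tokens, embedded scripts are
// blocked and the frame cannot reach the parent page or its cookies.
function UntrustedHtmlFrame({ html }: { html: string }) {
  return (
    <iframe
      title="Agent-generated UI"
      srcDoc={html}
      sandbox="allow-forms"
      style={{ width: "100%", minHeight: 400, border: "none" }}
    />
  );
}
```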

Concrete example. A user asks an agent to draft a contact form for a vintage record shop. The agent streams a complete HTML page (form, styles, even a dash of CSS animation), and the frontend embeds it in a sandboxed iframe so the user can interact with it.

When to pick it. Prototyping, build-time scaffolding, design exploration, internal tools where time-to-something-clickable matters more than guarantees. Generally not appropriate for runtime production traffic — accessibility, analytics, brand consistency, and security all become harder to enforce on output the agent invents from scratch.

Tradeoffs. Security and performance considerations when rendering arbitrary content. Typically web-first and difficult to port to native environments. Styling consistency and brand alignment become challenging.

CopilotKit supports open-ended gen-UI through MCP Apps — MCP servers that ship interactive UI alongside their tools, embedded in your app via the AG-UI protocol.

## Ecosystem Mapping

The three types map cleanly onto the protocols and specifications out there today. No single approach is superior — the best choice depends on your application's priorities, surfaces, and UX philosophy.

<EcosystemTable
  data={[
    {
      approach: "Controlled",
      examples: "AG-UI tool-rendering, CopilotChat, useAgent",
      strengths: "Fidelity, reliability, brand control",
      weaknesses: "Engineering intensive, linear growth",
    },
    {
      approach: "Declarative",
      examples: "A2UI, Open-JSON-UI",
      strengths: "Balanced, scalable, multi-renderer",
      weaknesses: "Limited full customization",
    },
    {
      approach: "Open-Ended",
      examples: "MCP Apps, ChatGPT Apps",
      strengths: "Unlimited creativity",
      weaknesses: "Hard to secure, web-first",
    },
  ]}
/>

## AG-UI and CopilotKit are gen-UI agnostic

CopilotKit doesn't pick the type for you. The same React frontend, the same runtime, and the same AG-UI protocol support all three patterns side by side — your weather assistant can render a Controlled `<WeatherCard>` for the temperature lookup and embed an Open-Ended MCP app for the visualization, in the same chat thread, on the same page. The choice is per-feature, not per-product.

<Image src="/images/gen-ui-specs-light.png" alt="AG-UI supporting multiple generative UI specifications" width={600} height={400} className="block dark:hidden mb-4 w-full mx-auto" />
<Image src="/images/gen-ui-specs-dark.png" alt="AG-UI supporting multiple generative UI specifications" width={600} height={400} className="hidden dark:block mb-4 w-full mx-auto" />

AG-UI adds shared primitives — interaction models, context synchronization, event handling, a common state framework — that standardize how agents and UIs communicate across all three types. That gives developers a consistent mental model while still letting each feature pick the gen-UI pattern that fits its risk profile.
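
To make that concrete, here is a simplified sketch of an event stream for the Controlled weather example; the event names follow AG-UI's published event types, but the payload shapes are abbreviated:

```ts
// One agent run as an AG-UI-style event sequence. The same transport
// carries Declarative specs and Open-Ended markup; only payloads differ.
const events = [
  { type: "RUN_STARTED", threadId: "t_1", runId: "r_1" },
  { type: "TOOL_CALL_START", toolCallId: "tc_1", toolCallName: "displayWeather" },
  { type: "TOOL_CALL_ARGS", toolCallId: "tc_1", delta: '{"city":"Tokyo",' },
  { type: "TOOL_CALL_ARGS", toolCallId: "tc_1", delta: '"temperature":18,"conditions":"rain"}' },
  { type: "TOOL_CALL_END", toolCallId: "tc_1" },
  { type: "RUN_FINISHED", threadId: "t_1", runId: "r_1" },
];
```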

## Where to go next