
How Threads & Persistence Work


<ThreadsEarlyAccess>

What are threads?

A thread is a persistent, server-side container for a multi-turn conversation between a user and an agent. Unlike ephemeral chat sessions that disappear when the page reloads, threads store the full event history — every message, tool call, and state change — so conversations can be paused, resumed, and replayed across sessions and devices.

Threads are a platform-level concept, not tied to any specific agent framework. Whether your backend uses LangGraph, Mastra, CrewAI, or any other framework, threads work the same way.

Key concepts

Thread vs. Run

A thread is the durable container. A run is a single agent execution within that thread. One thread can have many runs — each time the user sends a message and the agent responds, that's a new run. The thread accumulates events across all its runs.
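
The relationship between a thread and its runs can be sketched as a small data model. This is only an illustration: the `Thread` and `ThreadEvent` shapes and the `appendRun` and `runIds` helpers are hypothetical, not the platform's actual schema.

```typescript
// Hypothetical shapes for illustration; not the platform's actual schema.
interface ThreadEvent {
  runId: string;
  kind: "message" | "tool_call" | "state_change";
  payload: string;
}

interface Thread {
  id: string;
  events: ThreadEvent[]; // accumulated across every run
}

// A run contributes events; the thread keeps them after the run ends.
function appendRun(thread: Thread, runEvents: ThreadEvent[]): Thread {
  return { id: thread.id, events: thread.events.concat(runEvents) };
}

// Distinct run IDs in the order they first appear.
function runIds(thread: Thread): string[] {
  const seen: string[] = [];
  for (const e of thread.events) {
    if (seen.indexOf(e.runId) === -1) seen.push(e.runId);
  }
  return seen;
}

let conversation: Thread = { id: "thread-1", events: [] };
conversation = appendRun(conversation, [
  { runId: "run-1", kind: "message", payload: "user: hello" },
  { runId: "run-1", kind: "message", payload: "assistant: hi" },
]);
conversation = appendRun(conversation, [
  { runId: "run-2", kind: "message", payload: "user: weather?" },
  { runId: "run-2", kind: "tool_call", payload: "getWeather" },
]);
```

After two user turns, the single thread holds four events that belong to two separate runs.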

How the pieces fit together

From a developer's perspective, threads involve three things:

| What you use | What it does |
| --- | --- |
| useThreads hook | Lists, renames, archives, and deletes threads. Supports pagination via hasMoreThreads / fetchMoreThreads. Stays in sync across tabs and devices via WebSocket. |
| CopilotChat with threadId | Connects to a specific thread, loads its history, and streams new events in realtime. |
| CopilotRuntime | Server-side layer that executes agents, stores thread data on the Intelligence Platform, and relays events to connected clients. |

You interact with the first two. The runtime and platform handle persistence and sync behind the scenes.

Auto-naming

When a new thread is created and the first run completes, the runtime automatically generates a short name (2–5 words) using the LLM. This runs asynchronously — it doesn't block thread creation or the agent's response. The generated name appears in useThreads via the realtime sync.

Auto-naming is enabled by default. Disable it with generateThreadNames: false on the runtime. Users can always override the generated name via renameThread().

Archive vs. delete

Threads support two removal operations with different semantics:

  • Archive — a soft delete. The thread remains stored but disappears from the default list. Show archived threads by passing includeArchived: true to useThreads. Threads can also be unarchived, which restores them to the active list.
  • Delete — permanent and irreversible. The thread and its history are removed entirely.

Neither operation has a built-in confirmation dialog — your application should implement its own if needed.
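
To make the difference concrete, here is a minimal in-memory sketch of the two semantics, plus an application-supplied confirmation step. `ThreadStore`, its methods, and `deleteWithConfirm` are hypothetical stand-ins rather than the real useThreads API; only the includeArchived option name comes from the text above.

```typescript
// In-memory stand-in for the platform; all names here are hypothetical.
interface ThreadRecord {
  id: string;
  name: string;
  archived: boolean;
}

class ThreadStore {
  private records: ThreadRecord[] = [];

  add(id: string, name: string): void {
    this.records.push({ id, name, archived: false });
  }

  // Soft delete: the record stays stored but leaves the default list.
  archiveThread(id: string): void {
    this.find(id).archived = true;
  }

  unarchiveThread(id: string): void {
    this.find(id).archived = false;
  }

  // Permanent: the record is gone and cannot be restored.
  deleteThread(id: string): void {
    this.find(id); // throws if the thread does not exist
    this.records = this.records.filter((r) => r.id !== id);
  }

  list(options: { includeArchived?: boolean } = {}): ThreadRecord[] {
    return this.records.filter((r) => options.includeArchived || !r.archived);
  }

  private find(id: string): ThreadRecord {
    const match = this.records.filter((r) => r.id === id)[0];
    if (!match) throw new Error("Thread not found: " + id);
    return match;
  }
}

// Destructive actions go through a confirmation callback owned by the app.
function deleteWithConfirm(store: ThreadStore, id: string, confirm: () => boolean): boolean {
  if (!confirm()) return false;
  store.deleteThread(id);
  return true;
}
```

In a browser, the `confirm` callback could wrap a modal or `window.confirm`; the point is that the decision lives in your application, not in the thread API.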

How it works

Starting a new conversation

When a user sends their first message on a new thread:

  1. Your app renders CopilotChat (with or without a threadId — if omitted, a new thread is created automatically).
  2. The runtime creates the thread on the Intelligence Platform and begins executing the agent.
  3. Events stream back to the client via WebSocket in realtime — messages, tool calls, and state updates appear as they happen.
  4. Once the first run completes, the runtime auto-generates a thread name (if enabled).

Resuming a conversation

When a user returns to an existing thread (e.g., by clicking a thread in the sidebar), the client needs to catch up on any events it missed:

  1. CopilotChat receives the new threadId and requests the thread's history from the platform.
  2. The platform checks whether the thread has a run in progress:
    • No active run: the platform returns historical events only. The client replays them to reconstruct the conversation.
    • Active run: the platform returns historical events and opens a WebSocket connection. The client replays the history, then receives live events as they stream in.
  3. In either case, the transition from replayed history to live updates is seamless.
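
The two branches above can be sketched as a small client model. The `ResumeResponse` shape and the `ResumingClient` class are assumptions made for illustration; the actual wire protocol is not described in this document.

```typescript
// Hypothetical response shape; the real protocol is not documented here.
interface ResumeResponse {
  history: string[];  // events recorded before the client connected
  activeRun: boolean; // whether a run is still producing events
}

class ResumingClient {
  transcript: string[] = [];
  subscribed = false;

  resume(response: ResumeResponse): void {
    // Replay history first so live events land after it.
    this.transcript = response.history.slice();
    // Only an in-progress run needs a live subscription.
    this.subscribed = response.activeRun;
  }

  onLiveEvent(event: string): void {
    if (!this.subscribed) return; // nothing streams for an idle thread
    this.transcript.push(event);
  }
}
```

Because history is always applied before any live event, the rendering code sees one ordered transcript in both cases.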

Switching threads

When the threadId prop on CopilotChat changes:

  1. Any active run on the current thread is detached.
  2. All messages and agent state are cleared.
  3. The new thread's history is fetched and replayed.
  4. A WebSocket connection is established for live updates on the new thread.

The UI briefly shows an empty chat before the history loads. This is by design — it prevents stale messages from the previous thread appearing in the new one.

Safe during tool execution: If a tool call from the old thread completes during a switch, its result is discarded rather than inserted into the new thread's messages.
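
One way to get this behavior is to tag each tool call with the thread it started on and compare that tag when the result arrives. The sketch below assumes that approach; `ChatSession` and its methods are hypothetical and do not describe how CopilotChat is actually implemented.

```typescript
// Hypothetical client-side model; names are illustrative only.
class ChatSession {
  threadId: string | null = null;
  messages: string[] = [];

  switchThread(nextThreadId: string, history: string[]): void {
    this.threadId = nextThreadId;
    this.messages = [];                 // clear stale messages first
    this.messages = history.slice();    // then replay the new thread's history
  }

  // Each tool result remembers which thread started the call.
  acceptToolResult(originThreadId: string, result: string): boolean {
    if (originThreadId !== this.threadId) return false; // stale: discard
    this.messages.push(result);
    return true;
  }
}
```

A result that resolves after the switch is dropped because its origin no longer matches the active thread.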

Realtime sync

The useThreads hook maintains a WebSocket subscription for thread metadata changes. When any client creates, renames, archives, or deletes a thread, the update is pushed to all connected clients automatically. This is how a thread created on one tab appears in the sidebar on another tab without polling.
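
Conceptually this is a publish/subscribe fan-out. The sketch below models it in-process, with a hub standing in for the server and two plain objects standing in for browser tabs; `MetadataHub`, `MetadataEvent`, and `makeTab` are hypothetical names, not part of the useThreads API.

```typescript
// Hypothetical in-process stand-in for the server's WebSocket fan-out.
interface MetadataEvent {
  type: "created" | "renamed" | "archived" | "deleted";
  threadId: string;
  name?: string;
}

type MetadataListener = (event: MetadataEvent) => void;

class MetadataHub {
  private listeners: MetadataListener[] = [];

  subscribe(listener: MetadataListener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  publish(event: MetadataEvent): void {
    for (const listener of this.listeners) listener(event);
  }
}

// Each tab keeps its own name map and updates it from pushed events.
function makeTab(hub: MetadataHub): { names: { [id: string]: string } } {
  const tab = { names: {} as { [id: string]: string } };
  hub.subscribe((e) => {
    if (e.type === "deleted" || e.type === "archived") delete tab.names[e.threadId];
    else if (e.name !== undefined) tab.names[e.threadId] = e.name;
  });
  return tab;
}
```

Every subscriber receives the same event, so no tab needs to poll to learn about changes made elsewhere.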

Pessimistic updates

Thread mutations (rename, archive, delete) use a pessimistic update model — the client waits for the server to confirm via WebSocket before updating the thread list. This means:

  • The thread list doesn't change until the server confirms the operation
  • If the server rejects the mutation, the UI never shows an incorrect state
  • The returned promise resolves only after server confirmation, or rejects on failure
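
The following sketch models the pessimistic flow with a simulated server. It is an illustration under stated assumptions: `PessimisticThreadList` and the `onServerConfirm` / `onServerReject` callbacks are hypothetical, and its `renameThread` returns an operation ID alongside the promise purely so the example can drive the simulation.

```typescript
// Hypothetical client model: mutations are applied only on server confirmation.
interface ThreadItem {
  id: string;
  name: string;
}

interface PendingOp {
  apply: () => void;
  resolve: () => void;
  reject: (error: Error) => void;
}

class PessimisticThreadList {
  threads: ThreadItem[];
  private pending: { [opId: string]: PendingOp } = {};
  private counter = 0;

  constructor(initial: ThreadItem[]) {
    this.threads = initial.slice();
  }

  // Registers the mutation but leaves the list untouched until the server replies.
  renameThread(id: string, name: string): { opId: string; done: Promise<void> } {
    const opId = "op-" + this.counter++;
    const done = new Promise<void>((resolve, reject) => {
      this.pending[opId] = {
        apply: () => {
          this.threads = this.threads.map((t) => (t.id === id ? { id: t.id, name } : t));
        },
        resolve,
        reject,
      };
    });
    return { opId, done };
  }

  onServerConfirm(opId: string): void {
    const op = this.pending[opId];
    if (!op) return;
    delete this.pending[opId];
    op.apply(); // the list changes only now
    op.resolve();
  }

  onServerReject(opId: string, reason: string): void {
    const op = this.pending[opId];
    if (!op) return;
    delete this.pending[opId];
    op.reject(new Error(reason)); // the list is left untouched
  }
}
```

The cost of this model is a short delay before the UI reflects a change; the benefit is that there is never an optimistic state to roll back.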

Error handling

Mutation failures

All mutation methods (renameThread, archiveThread, deleteThread) return promises that reject with an Error if the server cannot complete the operation. Common causes:

  • Network failure — the client can't reach the runtime
  • Thread not found — another client deleted the thread before your mutation arrived
  • Authorization failure — the user doesn't have permission to modify the thread
  • Timeout — the server didn't respond within 15 seconds

The error field on useThreads always reflects the most recent error. It resets to null on the next successful operation.

WebSocket disconnection

If the WebSocket connection drops (network change, server restart, laptop sleep):

  • Thread list: useThreads stops receiving realtime updates. The list becomes stale until the connection is re-established. Reconnection is automatic with exponential backoff.
  • Active conversation: if CopilotChat loses its WebSocket mid-run, the agent's output may be interrupted. Reloading the page or switching away and back to the thread triggers the reconnection flow, which replays any missed events.
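
For readers unfamiliar with exponential backoff, the delay schedule can be sketched as below. The 500 ms base delay and 30 s cap are assumptions chosen for the example; the client's actual reconnection parameters are not stated in this document.

```typescript
// Assumed parameters for illustration: 500 ms base delay, 30 s cap.
function backoffDelayMs(attempt: number, baseMs: number = 500, maxMs: number = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Delays for the first five reconnection attempts.
const reconnectDelays = [0, 1, 2, 3, 4].map((attempt) => backoffDelayMs(attempt));
// → [500, 1000, 2000, 4000, 8000]
```

Each failed attempt doubles the wait until the cap is reached, which keeps a recovering server from being flooded with simultaneous reconnects.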

Thread locked

If a thread already has an active run and another client tries to start a new run on the same thread, the request is rejected with a 409 Conflict. This prevents two agent runs from interleaving events on the same thread. The existing run must complete or be stopped before a new one can begin.

The runtime acquires a Redis-backed lock on the thread for the duration of each run. You can tune this behavior on the runtime:

| Option | Default | Max | Description |
| --- | --- | --- | --- |
| lockTtlSeconds | 20 | 3600 (1 hour) | How long the lock is held before it expires automatically. |
| lockHeartbeatIntervalSeconds | 15 | 3000 (50 min) | How often the runtime renews the lock during a run. The heartbeat always runs; you only need to adjust the interval. |
| lockKeyPrefix | | | Custom Redis key prefix for the thread lock. Useful when multiple apps share a Redis instance. |

If a run completes normally, the lock is released immediately. The TTL is a safety net for cases where the runtime crashes without releasing the lock.
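
The interaction between the TTL, the heartbeat, and release can be sketched with an in-memory table in place of Redis. `ThreadLockTable` and its methods are hypothetical; the sketch only mirrors the behavior described above, using the default lockTtlSeconds of 20 and an injected clock so the timing is easy to follow.

```typescript
// In-memory stand-in for the Redis-backed lock; class and method names are hypothetical.
interface LockEntry {
  runId: string;
  expiresAtMs: number;
}

class ThreadLockTable {
  private locks: { [threadId: string]: LockEntry } = {};
  private ttlMs: number;

  constructor(lockTtlSeconds: number = 20) {
    this.ttlMs = lockTtlSeconds * 1000;
  }

  // Returns false when another run holds a live lock (the caller would answer 409 Conflict).
  acquire(threadId: string, runId: string, nowMs: number): boolean {
    const held = this.locks[threadId];
    if (held && held.expiresAtMs > nowMs && held.runId !== runId) return false;
    this.locks[threadId] = { runId, expiresAtMs: nowMs + this.ttlMs };
    return true;
  }

  // Heartbeat: the owning run pushes the expiry forward.
  renew(threadId: string, runId: string, nowMs: number): boolean {
    const held = this.locks[threadId];
    if (!held || held.runId !== runId || held.expiresAtMs <= nowMs) return false;
    held.expiresAtMs = nowMs + this.ttlMs;
    return true;
  }

  // Normal completion releases immediately instead of waiting for the TTL.
  release(threadId: string, runId: string): void {
    const held = this.locks[threadId];
    if (held && held.runId === runId) delete this.locks[threadId];
  }
}
```

With a 20-second TTL and a 15-second heartbeat, a healthy run renews before expiry, while a crashed runtime stops renewing and the thread becomes available again once the TTL lapses.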

Design decisions

Why event replay instead of message snapshots?

Threads store the raw event stream rather than a snapshot of the final message list. This enables:

  • Partial replay — when reconnecting, the client only fetches events it missed rather than reloading the entire history
  • Faithful reproduction — streaming tokens, tool calls, and state changes replay exactly as they originally occurred

The trade-off is that replay is more complex than loading a message array. The platform handles this complexity so your application doesn't have to.
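
To illustrate what replay involves, the sketch below folds a small event log into a message list and shows that catching up from a cursor gives the same result as a full replay. The `StreamEvent` shape, the `seq` cursor, and the `replay` and `eventsAfter` helpers are assumptions for this example; the platform's real event schema may differ.

```typescript
// Hypothetical event shapes; the platform's actual event schema may differ.
type StreamEvent =
  | { seq: number; type: "message_start"; messageId: string; role: "user" | "assistant" }
  | { seq: number; type: "token"; messageId: string; text: string };

interface ChatMessage {
  id: string;
  role: "user" | "assistant";
  text: string;
}

// Fold events into messages, optionally continuing from an earlier result.
function replay(events: StreamEvent[], initial: ChatMessage[] = []): ChatMessage[] {
  const messages: ChatMessage[] = initial.map((m) => ({ id: m.id, role: m.role, text: m.text }));
  for (const e of events) {
    if (e.type === "message_start") {
      messages.push({ id: e.messageId, role: e.role, text: "" });
    } else {
      const target = messages.filter((m) => m.id === e.messageId)[0];
      if (target) target.text += e.text;
    }
  }
  return messages;
}

// Partial replay: only the events the client has not seen yet.
function eventsAfter(log: StreamEvent[], lastSeq: number): StreamEvent[] {
  return log.filter((e) => e.seq > lastSeq);
}

const eventLog: StreamEvent[] = [
  { seq: 1, type: "message_start", messageId: "m1", role: "user" },
  { seq: 2, type: "token", messageId: "m1", text: "Hi" },
  { seq: 3, type: "message_start", messageId: "m2", role: "assistant" },
  { seq: 4, type: "token", messageId: "m2", text: "Hel" },
  { seq: 5, type: "token", messageId: "m2", text: "lo" },
];
```

A client that disconnected after event 3 can call `replay(eventsAfter(eventLog, 3), alreadySeen)` and end up with the same two messages as a client that replayed everything from the start.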

When threads are the wrong tool

  • Ephemeral interactions — if your users don't need conversation history (e.g., a one-shot Q&A widget), threads add unnecessary complexity. Use CopilotChat without a threadId.
  • Client-only state — if you need local-only chat history without server persistence, manage messages in React state or localStorage instead.


</ThreadsEarlyAccess>