docs/advanced/multi-round-trip.md
Sometimes a tool can't finish in one round trip. It needs something only the user has: a choice, a confirmation, a credential.
Before 2026-07-28 the server got it by calling back: opening its own request to the client (an elicitation, a sampling call) in the middle of handling the original one. The 2026-07-28 spec retires that back-channel.
Instead, the server returns.
The server answers tools/call with an InputRequiredResult instead of a CallToolResult. Two of its fields do the work:

- input_requests: what the server still needs, as a dict keyed by names the server chose. Each value is an ElicitRequest, a CreateMessageRequest, or a ListRootsRequest.
- request_state: an opaque token. The client echoes it back verbatim on the retry. Your server is the only thing that reads it.

The client fulfils each request, then calls the same tool again, carrying its answers in input_responses and the token in request_state. The server now has what it was missing and returns a normal CallToolResult.
That's the whole protocol. Every leg is an ordinary request from the client to the server. Nothing ever flows the other way.
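The exchange can be sketched without the SDK. What follows is a minimal simulation in plain Python, not SDK code: the dicts stand in for the wire results, and handle_call, fulfil, and run are hypothetical names chosen for illustration.

```python
import json


def handle_call(arguments, input_responses=None, request_state=None):
    """Toy server handler: asks for a region, then answers on the retry."""
    if not input_responses:
        # First leg: nothing to answer with yet, so ask instead.
        return {
            "type": "input_required",
            "input_requests": {"region": {"kind": "elicit", "message": "Which region?"}},
            "request_state": json.dumps({"step": "await-region"}),
        }
    # Retry leg: only the server reads the token it minted earlier.
    state = json.loads(request_state)
    if state.get("step") != "await-region":
        raise ValueError("unexpected request_state")
    # The answer sits under the key the server chose in input_requests.
    region = input_responses["region"]
    return {"type": "result", "text": f"Deploying {arguments['service']} to {region}"}


def fulfil(request):
    """Stand-in for the user-facing step; this one hard-codes the answer."""
    return "eu-west-1"


def run(arguments):
    """Toy client loop: every leg is an ordinary client-to-server call."""
    result = handle_call(arguments)
    while result["type"] == "input_required":
        answers = {key: fulfil(req) for key, req in result["input_requests"].items()}
        # Same call again, with the answers and the token echoed verbatim.
        result = handle_call(arguments, answers, result["request_state"])
    return result


final = run({"service": "api"})
print(final["text"])
```

The server function never calls the client; it only returns, and the client decides when to call again.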
On @mcp.tool() you rarely build this by hand: declare a dependency that asks the user, and the SDK returns the InputRequiredResult for you. That form is covered in the Dependencies tutorial. The two forms don't mix: a call has one input_responses/request_state channel, so a tool that uses Resolve(...) parameters cannot also return InputRequiredResult from its body. A declared InputRequiredResult return is rejected at registration (InvalidSignature), and an undeclared one fails the call at runtime. The manual form is the low-level Server, whose on_call_tool handler is allowed to return either result type:
--8<-- "docs_src/mrtr/tutorial001.py"

- on_call_tool is typed -> CallToolResult | InputRequiredResult. Returning the second one is the entire server-side API.
- On the first call, params.input_responses is None, so the guard fires and the handler asks instead of answering.
- On the retry, the ElicitResult the client sent is sitting under the same key ("region") that the server used in input_requests.

Everything else in that file (the explicit input_schema, the hand-built CallToolResult) is the ordinary low-level Server, covered in The low-level Server. This page only adds the second return type.
tools/call is not special: at 2026-07-28 a server may answer prompts/get and resources/read the same way. On MCPServer, an @mcp.prompt() function — or an @mcp.resource() template function — returns the InputRequiredResult itself and reads the retry's answers off the context:
--8<-- "docs_src/mrtr/tutorial004.py"

- With no answers yet, the function returns the InputRequiredResult. On the retry, ctx.input_responses holds the answers under the same keys and the function returns its ordinary result: prompt messages here, resource content for a template resource.
- The request_state you set is sealed before it crosses the wire and verified on the echo, like everything else on the server; Protecting requestState below covers what the seal gives you and when you need to configure keys.
- An @mcp.tool() function can return the result directly the same way, when the dependency form doesn't fit.
- Non-template @mcp.resource() functions don't participate: they take no Context, so they could never read the retry. Only template resources can ask.
- Returning an InputRequiredResult on a pre-2026 session is the same -32603 the warning describes.

Client runs the loop for you.
Register the callbacks the server might ask for (elicitation_callback, sampling_callback, list_roots_callback) and call the tool. When an InputRequiredResult arrives, Client dispatches each entry in input_requests to the matching callback, retries with the answers and the echoed request_state, and keeps going until a CallToolResult comes back:
--8<-- "docs_src/mrtr/tutorial003.py"

- elicitation_callback is the same one a pre-2026 server's back-channel elicitation/create would have hit. The same is true of sampling_callback for sampling/createMessage and list_roots_callback for roots/list: at 2026-07-28 the standalone server->client RPCs are gone, but the identical ElicitRequest / CreateMessageRequest / ListRootsRequest payloads ride inside input_requests and dispatch to the same three callbacks. One set of callbacks serves both eras.
- call_tool returns a plain CallToolResult. The intermediate rounds are invisible to the caller.
- get_prompt and read_resource drive the same loop.

!!! check
    Leave the callback off and the loop fails on the first round: the SDK's stand-in callback
    answers every elicitation with an error, and call_tool raises MCPError with the message
    "Elicitation not supported".

The loop is bounded. Client(..., input_required_max_rounds=10) is the default cap; a server that keeps returning InputRequiredResult past it makes call_tool raise. If a round carries only request_state and no input_requests, Client sleeps briefly (50ms doubling to a 250ms ceiling) before retrying, so a server that is just saying "not done yet" isn't busy-polled.
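The cap and the backoff can be pictured with a small stand-alone sketch. It is not the SDK's implementation: backoff_ms, drive, and TooManyRounds are hypothetical names, the exact way the real client counts rounds is an assumption, and the delays are recorded rather than slept so the example runs instantly.

```python
from itertools import islice


def backoff_ms(initial=50, ceiling=250):
    """Yield delays that double from `initial` up to `ceiling`, then stay there."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, ceiling)


class TooManyRounds(RuntimeError):
    """Hypothetical error for a loop that exceeds its round cap."""


def drive(call, max_rounds=10):
    """Retry `call` until it stops asking, giving up after `max_rounds` rounds."""
    delays = backoff_ms()
    waited = []
    for _ in range(max_rounds):
        result = call()
        if result != "input_required":
            return result, waited
        # A state-only round: record the delay a real loop would sleep for.
        waited.append(next(delays))
    raise TooManyRounds(f"still input-required after {max_rounds} rounds")


first_delays = list(islice(backoff_ms(), 5))

# A server that says "not done yet" three times, then answers.
replies = iter(["input_required"] * 3 + ["done"])
result, waited = drive(lambda: next(replies))
print(first_delays, result, waited)
```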
The auto-loop is enough for a single-process client. Own the loop instead when:

- The answers are gathered outside the process that issued call_tool, so a different worker issues the retry. request_state is the persistable token you carry across that boundary, through your own storage, and input_responses is what the other side sends back with it.
- You want to inspect each input_requests entry, refuse certain request kinds, or apply your own backoff between legs.
- You want a wall-clock deadline with anyio.fail_after(...) instead of relying on input_required_max_rounds.

Drop to the underlying session, where allow_input_required=True hands you the union directly:
--8<-- "docs_src/mrtr/tutorial002.py"

- client.session.call_tool(..., allow_input_required=True) widens the return type to CallToolResult | InputRequiredResult. The isinstance is what narrows it back.
- request_state is now in your hands. Write it down between legs and the conversation can resume from a fresh process.
- For each key in input_requests you put an InputResponse under the same key in input_responses. fulfil is where your UI goes; this one hard-codes the answer.
- Same tool, same arguments, every leg. The retry is the original call carried out again, not a new method.

Protecting requestState

Everything above treats request_state as an echo, and on the wire that is all it is. But the client holds it between legs (writing it down across processes is exactly what the previous section blessed), so what comes back is client-supplied input: it can be modified, expired, or lifted from a different call entirely. The spec requires servers to integrity-protect this state and reject the round when verification fails, whenever the state can influence authorization, resource access, or business logic.
MCPServer protects it by default. Every server seals outgoing requestState and verifies every echo — resolver state and hand-built state alike — under a key generated at process start. You configure nothing, write plaintext, and read plaintext; the wire only ever carries an opaque encrypted token.
The default key lives and dies with the process, which is the one thing you must know before deploying beyond a single process:
from mcp.server.mcpserver import MCPServer, RequestStateSecurity
# Multi-instance or restart-surviving: one or more shared secret keys (>= 32 bytes each).
# `key` is a placeholder for a secret loaded from your own secret store.
mcp = MCPServer("fleet", request_state_security=RequestStateSecurity(keys=[key]))

- keys=[...] is required whenever a retry can reach a different instance (multi-worker uvicorn, load-balanced HTTP) or must survive restarts: every instance verifies what any sibling minted. Same machinery, your secret instead of a generated one.
- To supply your own cryptography, pass RequestStateSecurity(codec=...) instead of keys; Bring your own crypto below covers the contract.

Default or configured, requestState on the wire is an encrypted, authenticated token. Your code never sees it: handlers and resolvers write plaintext and read plaintext (ctx.request_state); the SDK seals on the way out and verifies on the way in. Beyond integrity, each token is bound to:

- A time window: RequestStateSecurity(ttl=...) (default 600 seconds) bounds per-round think time, not the whole flow.
- The originating request.
- The authenticated principal, when the request carries auth the SDK validated; otherwise RequestStateSecurity(bind_principal=...) supplies one from your own identity signal. Whichever components your token verifier supplies, it must supply them consistently: a verifier that includes the subject on some requests and omits it on others changes the principal mid-flow, and in-flight rounds are rejected.

All of that is the SDK's job, not yours, and not the codec's if you bring your own.
keys[0] seals new state; every key in the list verifies. Zero-downtime rotation is three phases, each fully rolled out before the next:
RequestStateSecurity(keys=[OLD, NEW]) # 1: every instance learns to verify NEW; OLD still mints
RequestStateSecurity(keys=[NEW, OLD]) # 2: NEW mints; in-flight OLD state keeps verifying
RequestStateSecurity(keys=[NEW]) # 3: one ttl after phase 2 is fully out, retire OLD
Never promote the minter first: minting under a key some instance can't yet verify drops in-flight rounds mid-rollout.
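Why the order matters can be checked with a toy keyring. This sketch uses a plain HMAC tag from the standard library as a stand-in for the SDK's sealed envelope (integrity only, no encryption); seal, unseal, and the key values are illustrative assumptions, not the SDK's internals.

```python
import hashlib
import hmac


def seal(keys, payload: bytes) -> bytes:
    """Tag the payload under keys[0], the minting key."""
    tag = hmac.new(keys[0], payload, hashlib.sha256).digest()
    return tag + payload


def unseal(keys, token: bytes) -> bytes:
    """Accept the token if any key in the list verifies its tag."""
    tag, payload = token[:32], token[32:]
    for key in keys:
        expected = hmac.new(key, payload, hashlib.sha256).digest()
        if hmac.compare_digest(tag, expected):
            return payload
    raise ValueError("Invalid or expired requestState")


OLD = b"o" * 32
NEW = b"n" * 32
phase1, phase2, phase3 = [OLD, NEW], [NEW, OLD], [NEW]

in_flight = seal(phase1, b"state")      # minted under OLD during phase 1
fresh = seal(phase2, b"state")          # minted under NEW during phase 2

ok_old_in_phase2 = unseal(phase2, in_flight) == b"state"  # in-flight state survives
ok_new_in_phase1 = unseal(phase1, fresh) == b"state"      # a phase-1 laggard still verifies
ok_new_in_phase3 = unseal(phase3, fresh) == b"state"

# Promoting the minter first: an instance that only knows OLD rejects NEW state.
try:
    unseal([OLD], fresh)
    dropped = False
except ValueError:
    dropped = True
```

The last case is the failure the three-phase order avoids: state minted under NEW reaches an instance that has not yet learned to verify it.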
Keys are scoped to one service. The sealed envelope also carries the server's name as an audience claim, so a token minted by a different service that happens to share a secret is rejected anyway. The claim is only as distinctive as the name, so a server given an explicit policy must have a real name or set RequestStateSecurity(audience=...) — an unnamed one raises at construction. audience= also serves deliberate multi-service topologies where one service must accept state another minted. (The no-configuration default is exempt: its key never leaves the process, so the audience claim has nothing to add.)
RequestStateSecurity(codec=...) takes anything with seal(bytes) -> str and unseal(str) -> bytes that raises InvalidRequestState for any token it did not mint. The classic shape is envelope encryption against a KMS, where you unwrap a data key once at startup and keep the per-token crypto local:
--8<-- "docs_src/mrtr/tutorial005.py"
TTL, principal binding, and request binding are not the codec's job: the SDK stamps them into the payload before seal and re-verifies them after unseal, for every codec. A codec's only obligations are integrity (tampered means raise) and, ideally, confidentiality.
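That division of labour can be sketched with stand-ins. Nothing here is SDK code: InvalidState stands in for InvalidRequestState, HmacCodec is a toy codec that provides integrity but no confidentiality, and mint and verify are hypothetical names for the stamping the SDK is described as doing around any codec.

```python
import base64
import hashlib
import hmac
import json
import time


class InvalidState(Exception):
    """Local stand-in for the SDK's InvalidRequestState."""


class HmacCodec:
    """Toy codec: tampered tokens raise; the payload is not encrypted."""

    def __init__(self, key: bytes):
        self._key = key

    def seal(self, data: bytes) -> str:
        tag = hmac.new(self._key, data, hashlib.sha256).digest()
        return base64.urlsafe_b64encode(tag + data).decode()

    def unseal(self, token: str) -> bytes:
        try:
            raw = base64.urlsafe_b64decode(token.encode())
        except ValueError as exc:  # binascii.Error is a ValueError
            raise InvalidState("malformed token") from exc
        tag, data = raw[:32], raw[32:]
        expected = hmac.new(self._key, data, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise InvalidState("bad tag")
        return data


def mint(codec, state: str, principal: str, ttl: float = 600, now=None) -> str:
    """Stamp expiry and principal into the payload, then hand it to the codec."""
    now = time.time() if now is None else now
    envelope = {"state": state, "exp": now + ttl, "sub": principal}
    return codec.seal(json.dumps(envelope).encode())


def verify(codec, token: str, principal: str, now=None) -> str:
    """Unseal with the codec, then re-check the bindings the codec knows nothing about."""
    now = time.time() if now is None else now
    envelope = json.loads(codec.unseal(token))
    if now > envelope["exp"]:
        raise InvalidState("expired")
    if envelope["sub"] != principal:
        raise InvalidState("principal mismatch")
    return envelope["state"]


codec = HmacCodec(b"k" * 32)
token = mint(codec, "step-1", "alice", ttl=600, now=1000)
print(verify(codec, token, "alice", now=1100))
```

Swapping HmacCodec for any other object with the same two methods leaves mint and verify untouched, which is the point of keeping the bindings outside the codec.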
Every inbound failure, whether tampered, expired, replayed against a different request or principal, or sealed under a key this server doesn't know, gets the same answer:
{"code": -32602, "message": "Invalid or expired requestState"}
One frozen message for every cause, so the wire never reveals which check failed; the real reason goes to the server log. Every inbound requestState on tools/call, prompts/get, and resources/read is checked, including one arriving for a handler that never mints state. The most common rejection in practice isn't an attacker — it's the default process-local key meeting a retry from before a restart or from another instance; the client restarts the flow, and keys=[...] is the fix when that matters.
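The single-answer rule is easy to picture in code. This is a sketch under assumptions, not the SDK's error path: StateRejected and to_wire_error are hypothetical names, and only the wire error itself is taken from the text above.

```python
import logging

logger = logging.getLogger("request_state")

# The one frozen answer every rejection maps to.
WIRE_ERROR = {"code": -32602, "message": "Invalid or expired requestState"}


class StateRejected(Exception):
    """Hypothetical internal error that remembers which check failed."""

    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


def to_wire_error(exc: StateRejected) -> dict:
    """Log the real cause server-side and return the same error for every cause."""
    logger.warning("requestState rejected: %s", exc.reason)
    return dict(WIRE_ERROR)


causes = ["tampered", "expired", "wrong request", "wrong principal", "unknown key"]
responses = [to_wire_error(StateRejected(cause)) for cause in causes]
```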
A request_state you set yourself (returning InputRequiredResult from a tool, prompt, or resource-template function) is sealed and verified by the same machinery as resolver state, with zero code changes: write plaintext, read plaintext, and every binding above applies.
The one thing the SDK cannot pin for you, even when configured, is question identity: it doesn't know which of your questions an answer in your state belongs to. If you store answers keyed by question, include your own question identifier in the state and check it on the retry.
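One way to pin question identity yourself is to carry an identifier in the state and compare it on the retry. The sketch below is plain Python with hypothetical names (QUESTION_ID, ask, resume) and plain dicts in place of SDK types; it shows only the check, not the sealing.

```python
import json

# Hypothetical identifier for this particular question.
QUESTION_ID = "confirm-delete-v1"


def ask(target: str) -> dict:
    """First leg: record which question is being asked alongside the pending work."""
    state = json.dumps({"question": QUESTION_ID, "target": target})
    return {
        "input_requests": {"confirm": {"message": f"Delete {target}?"}},
        "request_state": state,
    }


def resume(request_state: str, input_responses: dict) -> str:
    """Retry leg: refuse answers that belong to a different question."""
    state = json.loads(request_state)
    if state.get("question") != QUESTION_ID:
        raise ValueError("answer belongs to a different question")
    return f"deleted {state['target']}" if input_responses["confirm"] else "cancelled"


pending = ask("report.csv")
outcome = resume(pending["request_state"], {"confirm": True})
print(outcome)
```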
The low-level Server is the no-batteries tier: unlike MCPServer, nothing is sealed until you append the boundary yourself, and your request_state crosses the wire exactly as written until you do. The one-line opt-in is shown in The low-level Server.
InputRequiredResult only exists at protocol version 2026-07-28. The in-memory Client(server) negotiates it for you; over the wire, mode="auto" discovers it. After connecting, client.protocol_version tells you what you got.

!!! warning
    A pre-2026 session has nowhere to put an InputRequiredResult. Return one from your handler on a
    mode="legacy" connection and the runner cannot serialize it into the negotiated version; the
    client gets back a -32603 "Handler returned an invalid result" error. A server that serves
    both eras must check ctx.protocol_version before reaching for it.

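A version gate for a server that serves both eras might look like the sketch below. It assumes the negotiated version is available as a YYYY-MM-DD string; supports_input_required and choose_reply are hypothetical helpers, not SDK functions.

```python
# First protocol version in which InputRequiredResult exists.
INPUT_REQUIRED_SINCE = "2026-07-28"


def supports_input_required(protocol_version: str) -> bool:
    """Date-stamped versions in YYYY-MM-DD form sort correctly as strings."""
    return protocol_version >= INPUT_REQUIRED_SINCE


def choose_reply(protocol_version: str) -> str:
    """Pick the reply shape a handler may use on this session."""
    if supports_input_required(protocol_version):
        return "input_required"
    # Older sessions cannot carry the result, so fall back to another path.
    return "legacy_fallback"


print(choose_reply("2026-07-28"), choose_reply("2025-06-18"))
```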

!!! info
    URL-mode elicitation rides this exact mechanism on a 2026 connection. The entry in
    input_requests is an ElicitRequest whose params are ElicitRequestURLParams; the user
    finishes the out-of-band flow and your client retries the call. Same loop, no new API. The
    high-level server half is in Elicitation.


- A server that needs more input returns an InputRequiredResult. It never opens a request to the client.
- input_requests is what it needs. request_state is an opaque resume token only the server reads.
- Client runs the retry loop for you: register elicitation_callback / sampling_callback / list_roots_callback and call_tool returns a plain CallToolResult. input_required_max_rounds (default 10) bounds it.
- To drive the loop by hand, call client.session.call_tool(..., allow_input_required=True) and own the while isinstance(result, InputRequiredResult) loop yourself.
- On @mcp.tool(), a dependency that asks the user produces this result for you (Dependencies); the low-level Server is the manual form.
- An @mcp.prompt() or template @mcp.resource() function returns the InputRequiredResult itself and reads ctx.input_responses on the retry.
- requestState comes back as client-supplied input, so MCPServer seals it by default (resolver state and hand-built state alike) under a process-local key; multi-instance deployments pass RequestStateSecurity(keys=[...]) (or a custom codec) so every instance can verify what a sibling minted. The seal binds every token to a time window, the originating request, and the authenticated principal when the request carries auth the SDK validated or bind_principal= supplies your own identity signal (Protecting requestState).

This is the mechanism that replaces server-initiated sampling and the rest of the push-style back-channel; see Deprecated features.