docs/multi_agent.md
Orchestration refers to the flow of agents in your app. Which agents run, in what order, and how do they decide what happens next? There are two main ways to orchestrate agents:
You can mix and match these patterns. Each has their own tradeoffs, described below.
An agent is an LLM equipped with instructions, tools and handoffs. This means that given an open-ended task, the LLM can autonomously plan how it will tackle the task, using tools to take actions and acquire data, and using handoffs to delegate tasks to sub-agents. For example, a research agent could be equipped with tools like:
In the Python SDK, two orchestration patterns come up most often:
| Pattern | How it works | Best when |
|---|---|---|
| Agents as tools | A manager agent keeps control of the conversation and calls specialist agents through Agent.as_tool(). | You want one agent to own the final answer, combine outputs from multiple specialists, or enforce shared guardrails in one place. |
| Handoffs | A triage agent routes the conversation to a specialist, and that specialist becomes the active agent for the rest of the turn. | You want the specialist to respond directly, keep prompts focused, or swap instructions without the manager narrating the result. |
Use agents as tools when a specialist should help with a bounded subtask but should not take over the user-facing conversation. Use handoffs when routing itself is part of the workflow and you want the chosen specialist to own the next part of the interaction.
You can also combine the two. A triage agent might hand off to a specialist, and that specialist can still call other agents as tools for narrow subtasks.
This pattern is great when the task is open-ended and you want to rely on the intelligence of an LLM. The most important tactics here are:
If you want the core SDK primitives behind this style of orchestration, start with tools, handoffs, and running agents.
While orchestrating via LLM is powerful, orchestrating via code makes tasks more deterministic and predictable, in terms of speed, cost and performance. Common patterns here are:
while loop with an agent that evaluates and provides feedback, until the evaluator says the output passes certain criteria.asyncio.gather. This is useful for speed when you have multiple tasks that don't depend on each other.We have a number of examples in examples/agent_patterns.
Agent.as_tool() and manager-style orchestration.