cookbook/90_models/inception/README.md
Inception builds Mercury, a family of
diffusion large language models (dLLMs) that refine all tokens in parallel
instead of generating them left-to-right, making them very fast. Inception
exposes the models through an
OpenAI-compatible API, so you can drive
them through Agno the same way you'd drive any OpenAI-compatible provider.
The Agno Inception class defaults to mercury-2 and points at
https://api.inceptionlabs.ai/v1.
See the repository Development setup.
https://platform.inceptionlabs.ai/dashboard/api-keys).export INCEPTION_API_KEY=***
uv pip install -U openai ddgs agno
python cookbook/90_models/inception/basic.py
| Model id | Notes |
|---|---|
mercury-2 | Flagship reasoning dLLM. Tunable reasoning depth, 128K context, native tool use, JSON output. Default in the Agno class. |
mercury-coder-small | Coding-focused variant for latency-sensitive code workflows. |
The original
mercurymodel is only available to accounts created before February 24, 2026. New accounts should usemercury-2(or the Edit/coder variants) instead.
Pass any of these as Inception(id="..."):
from agno.agent import Agent
from agno.models.inception import Inception
agent = Agent(model=Inception(id="mercury-2"))
| Example | What it shows |
|---|---|
basic.py | Sync, sync+streaming, async, and async+streaming runs. |
tool_use.py | Agent calling a tool (web search), with streaming. |
structured_output.py | Pydantic-typed output via JSON mode. |
Inception's OpenAI-compatible endpoint does not implement native
json_schema structured outputs, so the Agno class sets
supports_native_structured_outputs = False. Use use_json_mode=True on the
agent for Pydantic-shaped output:
agent = Agent(
model=Inception(id="mercury-2"),
output_schema=MovieScript,
use_json_mode=True,
)
A full example lives in structured_output.py.
If you need a different host (private deployment, regional endpoint, etc.),
pass base_url:
Inception(id="mercury-2", base_url="https://your-host.example.com/v1")