docs/features/routing-modes.md
MCPProxy supports three routing modes that control how upstream MCP tools are exposed to AI agents. Each mode offers different tradeoffs between token efficiency, flexibility, and simplicity.
| Mode | Tool Exposure | Best For |
|---|---|---|
retrieve_tools | BM25 search via retrieve_tools + call_tool_read/write/destructive | Large tool sets (50+ tools), token-sensitive workloads |
direct | All tools exposed as serverName__toolName | Small tool sets, simple setups, maximum compatibility |
code_execution | code_execution + retrieve_tools for discovery | Multi-step orchestration, reduced round-trips |
The configured mode is served on the default
/mcpendpoint. Each mode also has a dedicated endpoint (/mcp/call,/mcp/all,/mcp/code) — see Dedicated Endpoints below.
Set the routing mode in your config file (~/.mcpproxy/mcp_config.json):
{
"routing_mode": "retrieve_tools"
}
| Field | Type | Default | Values |
|---|---|---|---|
routing_mode | string | "retrieve_tools" | retrieve_tools, direct, code_execution |
The configured routing mode determines which tools are served on the default /mcp endpoint. All three modes are always available on dedicated endpoints regardless of the config setting.
Each routing mode has a dedicated MCP endpoint that is always available:
| Endpoint | Mode | Description |
|---|---|---|
/mcp/call | retrieve_tools | BM25 search via retrieve_tools + call_tool variants |
/mcp/all | direct | All upstream tools with serverName__toolName naming |
/mcp/code | code_execution | JavaScript orchestration via code_execution tool |
/mcp | (configured) | Serves whichever mode is set in routing_mode |
This means you can point different AI clients at different endpoints. For example, Claude Desktop at /mcp (retrieve_tools mode for token savings) and a CI/CD agent at /mcp/all (direct mode for simplicity).
The default mode uses BM25 full-text search to help AI agents discover relevant tools without exposing the entire tool catalog. This approach — sometimes called "lazy tool loading" or "tool search" — is used by Anthropic's own MCP implementation and is the recommended pattern for large tool sets.
Endpoint: /mcp/call
Tools exposed:
retrieve_tools — Search for tools by natural language querycall_tool_read — Execute read-only tool callscall_tool_write — Execute write tool callscall_tool_destructive — Execute destructive tool callsread_cache — Access paginated responsesHow it works:
retrieve_tools with a natural language querycall_with recommendationscall_tool_* variant with the exact tool name from resultsPros:
Cons:
When to use:
Direct mode exposes every upstream tool directly to the AI agent. Each tool appears in the standard MCP tools/list with a serverName__toolName name. This is the simplest mode and the closest to how individual MCP servers work natively.
Endpoint: /mcp/all
Tools exposed:
serverName__toolName (e.g., github__create_issue, filesystem__read_file)How it works:
tools/listserverName__toolName nameTool naming:
__ (double underscore) to avoid conflicts with single underscores in tool names__ only, so tool names containing __ are preserved[serverName] for clarityAuth enforcement:
Pros:
Cons:
When to use:
Code execution mode is designed for multi-step orchestration workflows. Instead of making separate tool calls for each step, the AI agent writes JavaScript or TypeScript code that chains multiple tool calls together in a single request. This is inspired by patterns from OpenAI's code interpreter and similar "tool-as-code" approaches.
Endpoint: /mcp/code
Tools exposed:
code_execution — Execute JavaScript/TypeScript that orchestrates upstream tools (includes a catalog of all available tools in the description)retrieve_tools — Search for tools (instructs to use code_execution, not call_tool_*)How it works:
code_execution tool with a listing of all available upstream toolscall_tool(serverName, toolName, args)Pros:
Cons:
"enable_code_execution": true)When to use:
Note: Code execution must be enabled in config ("enable_code_execution": true). If disabled, the code_execution tool appears but returns an error message directing the user to enable it.
| Factor | retrieve_tools | direct | code_execution |
|---|---|---|---|
| Token cost | Low (only matched tools) | High (all tools) | Medium (catalog in description) |
| Round-trips per task | 2 (search + call) | 1 (direct call) | 1 (code handles all) |
| Max practical tools | 500+ | ~50 | 500+ |
| Setup complexity | None (default) | None | Requires enablement |
| Model requirements | Any modern LLM | Any LLM | Code-capable LLM |
| Audit granularity | High (intent metadata) | Medium (annotations) | Medium (code logged) |
| IDE auto-approve | Per-variant rules | Per-tool rules | Single rule |
# Status command shows routing mode
mcpproxy status
# Doctor command shows routing mode in Security Features section
mcpproxy doctor
# Get routing mode info
curl -H "X-API-Key: your-key" http://127.0.0.1:8080/api/v1/routing
# Response:
{
"routing_mode": "retrieve_tools",
"description": "BM25 search via retrieve_tools + call_tool variants (default)",
"endpoints": {
"default": "/mcp",
"direct": "/mcp/all",
"code_execution": "/mcp/code",
"retrieve_tools": "/mcp/call"
},
"available_modes": ["retrieve_tools", "direct", "code_execution"]
}
# Status endpoint also includes routing_mode
curl -H "X-API-Key: your-key" http://127.0.0.1:8080/api/v1/status
Edit ~/.mcpproxy/mcp_config.json:
{
"routing_mode": "direct"
}
Restart MCPProxy:
pkill mcpproxy
mcpproxy serve
The routing mode is applied at startup and determines which MCP server instance handles the default /mcp endpoint. Dedicated endpoints (/mcp/all, /mcp/code, /mcp/call) are always available regardless of the configured mode.
When upstream servers connect, disconnect, or update their tools:
notifications/tools/list_changed notification.code_execution description is refreshed.enable_code_execution: true) and runs code in a sandboxed JavaScript VM with no filesystem or network access.