litellm/integrations/websearch_interception/ARCHITECTURE.md
Server-side WebSearch tool execution for models that don't natively support it (e.g., Bedrock/Claude).
User makes ONE litellm.messages.acreate() call → Gets final answer with search results.
The agentic loop happens transparently on the server.
LiteLLM defines a standard web search tool format (litellm_web_search) that all native provider tools are converted to. This enables consistent interception across providers.
Standard Tool Definition (defined in tools.py):
{
  "name": "litellm_web_search",
  "description": "Search the web for information...",
  "input_schema": {
    "type": "object",
    "properties": {
      "query": {"type": "string", "description": "The search query"}
    },
    "required": ["query"]
  }
}
Tool Name Constant: LITELLM_WEB_SEARCH_TOOL_NAME = "litellm_web_search" (defined in litellm/constants.py)
The interception system automatically detects and handles:
| Tool Format | Example | Provider | Detection Method | Future-Proof |
|---|---|---|---|---|
| LiteLLM Standard | name="litellm_web_search" | Any | Direct name match | N/A |
| Anthropic Native | type="web_search_20250305" | Bedrock, Claude API | Type prefix: startswith("web_search_") | ✅ Yes (web_search_2026, etc.) |
| Claude Code CLI | name="web_search", type="web_search_20250305" | Claude Code | Name + type check | ✅ Yes (version-agnostic) |
| Legacy | name="WebSearch" | Custom | Name match | N/A (backwards compat) |
Future Compatibility: The startswith("web_search_") check in tools.py automatically supports future Anthropic web search versions.
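
The detection rules in the table above can be sketched as a single predicate. This is a minimal illustration, not the code in tools.py: the function name `looks_like_web_search_tool` and the `LEGACY_NAMES` set are hypothetical, and the sketch assumes Claude Code's tool is caught by the same versioned type prefix as the Anthropic native format.

```python
from typing import Any, Dict

# Value quoted from this document (litellm/constants.py).
LITELLM_WEB_SEARCH_TOOL_NAME = "litellm_web_search"
# Hypothetical grouping of the legacy name listed in the table.
LEGACY_NAMES = {"WebSearch"}


def looks_like_web_search_tool(tool: Dict[str, Any]) -> bool:
    """Return True if `tool` matches one of the detection rules in the table."""
    name = tool.get("name") or ""
    tool_type = tool.get("type") or ""

    # LiteLLM standard: direct name match.
    if name == LITELLM_WEB_SEARCH_TOOL_NAME:
        return True
    # Anthropic native and Claude Code: versioned type prefix,
    # e.g. "web_search_20250305" or a later version.
    if isinstance(tool_type, str) and tool_type.startswith("web_search_"):
        return True
    # Legacy name kept for backwards compatibility.
    return name in LEGACY_NAMES
```

Because only the prefix is compared, a later version string would match without any change to the predicate, which is the point of the future-compatibility note.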
Claude Code (Anthropic's official CLI) sends web search requests using Anthropic's native tool format:
{
  "type": "web_search_20250305",
  "name": "web_search",
  "max_uses": 8
}
What Happens:
1. Claude Code sends the web_search_20250305 tool to the LiteLLM proxy.
2. LiteLLM converts it to the litellm_web_search standard format.
3. Bedrock returns a tool_use block for litellm_web_search (not server_tool_use).
4. LiteLLM detects the tool_use block.
5. LiteLLM executes litellm.asearch() using the configured provider (Perplexity, Tavily, etc.).

Without Interception: Bedrock would receive the native tool → try to execute it natively → return web_search_tool_result_error with invalid_tool_input

With Interception: LiteLLM converts → Bedrock returns tool_use → LiteLLM executes search → Returns final answer ✅
Native tools are converted to LiteLLM standard format before sending to the provider:
Conversion Point (litellm/llms/anthropic/experimental_pass_through/messages/handler.py):
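- Runs in the anthropic_messages() function (lines 60-127)
- Detects native web search tools using is_web_search_tool()
- Converts them to the litellm_web_search format using get_litellm_web_search_tool()
- Prevents the provider from attempting native execution (which returns web_search_tool_result_error)

Response Detection (transformation.py):

- Detects tool_use blocks with any web search tool name
- Handles litellm_web_search, WebSearch, web_search

Example Conversion: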
# Input (Claude Code's native tool)
{
  "type": "web_search_20250305",
  "name": "web_search",
  "max_uses": 8
}
# Output (LiteLLM standard)
{
  "name": "litellm_web_search",
  "description": "Search the web for information...",
  "input_schema": {...}
}
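
The conversion step can be sketched as a pass over the request's tool list. This is an illustrative sketch only: `convert_native_tools` is a hypothetical helper, it checks nothing but the versioned `type` prefix, and it assumes native-only fields such as `max_uses` are simply dropped, as the example output above suggests.

```python
import copy
from typing import Any, Dict, List

# Standard definition as shown earlier in this document.
STANDARD_TOOL: Dict[str, Any] = {
    "name": "litellm_web_search",
    "description": "Search the web for information...",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "The search query"}
        },
        "required": ["query"],
    },
}


def convert_native_tools(tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Replace Anthropic-native web search tools with the standard definition."""
    converted: List[Dict[str, Any]] = []
    for tool in tools:
        if str(tool.get("type", "")).startswith("web_search_"):
            # Assumption: native-only fields (e.g. max_uses) are not carried over.
            converted.append(copy.deepcopy(STANDARD_TOOL))
        else:
            # Every other tool passes through unchanged.
            converted.append(tool)
    return converted
```

Tools that are not web search tools keep their position in the list, so the rest of the request is unaffected.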
User manually handles tool execution:
1. User calls litellm.messages.acreate() → Gets tool_use response
2. User executes litellm.asearch()
3. User calls litellm.messages.acreate() again with results

Result: 2 API calls, manual tool execution
Server handles tool execution automatically:
sequenceDiagram
participant User
participant Messages as litellm.messages.acreate()
participant Handler as llm_http_handler.py
participant Logger as WebSearchInterceptionLogger
participant Router as proxy_server.llm_router
participant Search as litellm.asearch()
participant Provider as Bedrock API
User->>Messages: acreate(tools=[WebSearch])
Messages->>Handler: async_anthropic_messages_handler()
Handler->>Provider: Request
Provider-->>Handler: Response (tool_use)
Handler->>Logger: async_should_run_agentic_loop()
Logger->>Logger: Detect WebSearch tool_use
Logger-->>Handler: (True, tools)
Handler->>Logger: async_run_agentic_loop(tools)
Logger->>Router: Get search_provider from search_tools
Router-->>Logger: search_provider
Logger->>Search: asearch(query, provider)
Search-->>Logger: Search results
Logger->>Logger: Build tool_result message
Logger->>Messages: acreate() with results
Messages->>Provider: Request with search results
Provider-->>Messages: Final answer
Messages-->>Logger: Final response
Logger-->>Handler: Final response
Handler-->>User: Final answer (with search results)
Result: 1 API call from user, server handles agentic loop
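
The control flow in the diagram can be sketched with stubbed model and search callables. This is a simplified illustration under stated assumptions, not the logger's implementation: responses are modelled as plain dicts, and `find_search_calls`, `run_once_with_search`, `call_model`, and `search` are hypothetical names. The `tool_use_id` field follows the Anthropic Messages format, where each tool_result refers back to the `id` of its tool_use block.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

Message = Dict[str, Any]
# Tool names listed under "Response Detection" in this document.
WEB_SEARCH_NAMES = {"litellm_web_search", "WebSearch", "web_search"}


def find_search_calls(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect tool_use blocks that name a web search tool."""
    return [
        block
        for block in response.get("content", [])
        if block.get("type") == "tool_use" and block.get("name") in WEB_SEARCH_NAMES
    ]


async def run_once_with_search(
    messages: List[Message],
    call_model: Callable[[List[Message]], Awaitable[Dict[str, Any]]],
    search: Callable[[str], Awaitable[str]],
) -> Dict[str, Any]:
    """One interception round: model call, search, then a second model call."""
    first = await call_model(messages)
    calls = find_search_calls(first)
    if not calls:
        return first  # No web search tool_use: return the response untouched.

    results = []
    for call in calls:
        text = await search(call["input"]["query"])
        results.append(
            {"type": "tool_result", "tool_use_id": call["id"], "content": text}
        )

    follow_up = messages + [
        {"role": "assistant", "content": first["content"]},
        {"role": "user", "content": results},
    ]
    return await call_model(follow_up)


async def demo() -> str:
    """Drive the loop with fake model and search functions."""

    async def fake_model(msgs: List[Message]) -> Dict[str, Any]:
        if len(msgs) == 1:  # First call: ask for a search.
            return {"content": [{"type": "tool_use", "id": "toolu_1",
                                 "name": "litellm_web_search",
                                 "input": {"query": "what is litellm"}}]}
        return {"content": [{"type": "text", "text": "Final answer"}]}

    async def fake_search(query: str) -> str:
        return f"Title: result for {query}"

    final = await run_once_with_search(
        [{"role": "user", "content": "What is LiteLLM?"}], fake_model, fake_search
    )
    return final["content"][0]["text"]
```

Running `asyncio.run(demo())` exercises both model calls; the caller only ever sees the second response, which mirrors the single-call experience described above.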
| Component | File | Purpose |
|---|---|---|
| WebSearchInterceptionLogger | handler.py | CustomLogger that implements agentic loop hooks |
| Tool Standardization | tools.py | Standard tool definition, detection, and utilities |
| Tool Name Constant | constants.py | LITELLM_WEB_SEARCH_TOOL_NAME = "litellm_web_search" |
| Tool Conversion | anthropic/.../ handler.py | Converts native tools to LiteLLM standard before API call |
| Transformation Logic | transformation.py | Detect tool_use, build tool_result messages, format search responses |
| Agentic Loop Hooks | integrations/custom_logger.py | Base hooks: async_should_run_agentic_loop(), async_run_agentic_loop() |
| Hook Orchestration | llms/custom_httpx/llm_http_handler.py | _call_agentic_completion_hooks() - calls hooks after response |
| Router Search Tools | proxy/proxy_server.py | llm_router.search_tools - configured search providers |
| Search Endpoints | proxy/search_endpoints/endpoints.py | Router logic for selecting search provider |
import litellm
from litellm.integrations.websearch_interception import (
    WebSearchInterceptionLogger,
    get_litellm_web_search_tool,
)
from litellm.types.utils import LlmProviders

# Enable for Bedrock with specific search tool
litellm.callbacks = [
    WebSearchInterceptionLogger(
        enabled_providers=[LlmProviders.BEDROCK],
        search_tool_name="my-perplexity-tool",  # Optional: uses router's first tool if None
    )
]

# Make request with LiteLLM standard tool (recommended)
response = await litellm.messages.acreate(
    model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    messages=[{"role": "user", "content": "What is LiteLLM?"}],
    tools=[get_litellm_web_search_tool()],  # LiteLLM standard
    max_tokens=1024,
    stream=True,  # Auto-converted to non-streaming
)

# OR send native tools - they're auto-converted to LiteLLM standard
response = await litellm.messages.acreate(
    model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    messages=[{"role": "user", "content": "What is LiteLLM?"}],
    tools=[{
        "type": "web_search_20250305",  # Native Anthropic format
        "name": "web_search",
        "max_uses": 8,
    }],
    max_tokens=1024,
)
WebSearch interception works transparently with both streaming and non-streaming requests.
How streaming is handled:
1. User sends a request with stream=True and the WebSearch tool
2. anthropic_messages() detects WebSearch + interception enabled
3. LiteLLM converts stream=True → stream=False internally

Why this approach:
Testing:
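- test_websearch_interception_e2e.py
- test_websearch_interception_streaming_e2e.py

Search provider selection:

1. search_tool_name specified → Look up in llm_router.search_tools
2. search_tool_name not specified → Use the router's first search tool
3. Fallback: perplexity

Example router config: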
search_tools:
  - search_tool_name: "my-perplexity-tool"
    litellm_params:
      search_provider: "perplexity"
  - search_tool_name: "my-tavily-tool"
    litellm_params:
      search_provider: "tavily"
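
Provider selection against a config like the one above can be sketched as a small lookup. This is a hedged illustration: `resolve_search_provider` and `DEFAULT_PROVIDER` are hypothetical names, and falling through to the first configured tool when a requested name is not found is an assumption, not documented behaviour.

```python
from typing import Any, Dict, List, Optional

# Assumption: "perplexity" is used when no configured tool resolves.
DEFAULT_PROVIDER = "perplexity"


def resolve_search_provider(
    search_tools: List[Dict[str, Any]],
    search_tool_name: Optional[str] = None,
) -> str:
    """Pick a search provider from router-style `search_tools` entries."""
    if search_tool_name is not None:
        for tool in search_tools:
            if tool.get("search_tool_name") == search_tool_name:
                return tool["litellm_params"]["search_provider"]
    # No name given, or (by assumption) no match: use the first configured tool.
    if search_tools:
        return search_tools[0]["litellm_params"]["search_provider"]
    return DEFAULT_PROVIDER
```

With the two tools from the example config, asking for "my-tavily-tool" resolves to "tavily", while passing no name resolves to the first entry's provider.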
messages = [{"role": "user", "content": "What is LiteLLM?"}]
tools = [{"name": "WebSearch", ...}]
Response: tool_use with name="WebSearch", input={"query": "what is litellm"}
litellm.asearch(query="what is litellm", search_provider="perplexity")
Result: "Title: LiteLLM Docs\nURL: docs.litellm.ai\n..."
messages = [
    {"role": "user", "content": "What is LiteLLM?"},
    {"role": "assistant", "content": [{"type": "tool_use", ...}]},
    {"role": "user", "content": [{"type": "tool_result", "content": "search results..."}]}
]
response.content[0].text
# "Based on the search results, LiteLLM is a unified interface..."
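
The search-result text shown in the walk-through ("Title: ...\nURL: ...") can be produced by a small formatter. This is a sketch under assumptions: `format_search_results` is a hypothetical helper, and the input field names (`title`, `url`, `snippet`) and the "Snippet:" label are illustrative, not taken from the search API.

```python
from typing import Dict, List


def format_search_results(results: List[Dict[str, str]]) -> str:
    """Render search results as plain text for a tool_result block."""
    entries = []
    for item in results:
        lines = [
            f"Title: {item.get('title', '')}",
            f"URL: {item.get('url', '')}",
        ]
        # Assumption: a snippet is optional and appended only when present.
        snippet = item.get("snippet")
        if snippet:
            lines.append(f"Snippet: {snippet}")
        entries.append("\n".join(lines))
    # A blank line separates consecutive results.
    return "\n\n".join(entries)
```

Plain text keeps the tool_result content simple for the model to read; a structured format would work as well, at the cost of more tokens.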
E2E Tests:
- test_websearch_interception_e2e.py - Non-streaming real API calls to Bedrock
- test_websearch_interception_streaming_e2e.py - Streaming real API calls to Bedrock

Unit Tests: test_websearch_interception.py
Mocked tests for tool detection, provider filtering, edge cases.