docs/examples/retrievers/you_retriever.ipynb
<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/retrievers/you_retriever.ipynb" target="_parent"></a>
This notebook demonstrates how to use You.com's Search API as a retriever in LlamaIndex. The API automatically returns relevant web and/or news results based on your query. Visit our docs to learn more about our Search and other APIs: https://docs.you.com/
The retriever converts You.com's search results into LlamaIndex's standard format (NodeWithScore), allowing you to plug live web search into any LlamaIndex pipeline, including the query engines shown later in this notebook.
To get started, install the llama-index-retrievers-you package.
%pip install llama-index-retrievers-you
Get your API key from the You.com platform.
import os
from getpass import getpass
# Set your API key
you_api_key = os.environ.get("YDC_API_KEY") or getpass(
    "Enter your You.com API key: "
)
First, let's set up the retriever and see what data it returns:
from llama_index.retrievers.you import YouRetriever
retriever = YouRetriever(api_key=you_api_key)
retrieved_results = retriever.retrieve("national parks in the US")
print(f"Retrieved {len(retrieved_results)} results")
for i, result in enumerate(retrieved_results):
    print(f"\nResult {i + 1}:")
    print(f"  Text: {result.node.text[:200]}...")
    print("  Metadata:")
    for key, value in result.node.metadata.items():
        print(f"    {key}: {value}")
The retriever also supports async operations.
from llama_index.retrievers.you import YouRetriever
retriever = YouRetriever(api_key=you_api_key)
# Use aretrieve for async operations
retrieved_results = await retriever.aretrieve("national parks in the US")
print(f"Retrieved {len(retrieved_results)} results asynchronously")
for i, result in enumerate(retrieved_results):
    print(f"\nResult {i + 1}:")
    print(f"  Text: {result.node.text[:200]}...")
    print("  Metadata:")
    for key, value in result.node.metadata.items():
        print(f"    {key}: {value}")
The You.com API can also return news results automatically, based on your query.
# News-related queries will include news results in the response
# You should see at most 5 results per type - news and web
# Notice the source_type metadata: "news" or "web"
retriever = YouRetriever(api_key=you_api_key, count=5, country="IN")
retrieved_results = retriever.retrieve(
    "What are the latest geopolitical updates in India"
)
print(f"Retrieved {len(retrieved_results)} results")
for i, result in enumerate(retrieved_results):
    print(f"\nResult {i + 1}:")
    print(f"  Text: {result.node.text[:200]}...")
    print("  Metadata:")
    for key, value in result.node.metadata.items():
        print(f"    {key}: {value}")
You can customize the search with optional parameters:
retriever = YouRetriever(
    api_key=you_api_key,
    count=20,  # Return up to 20 results per section (web/news)
    country="US",  # Focus on US results
    language="en",  # English results
    freshness="week",  # Results from the past week
    safesearch="moderate",  # Moderate safe-search filtering
)
retrieved_results = retriever.retrieve("renewable energy breakthroughs")
print(f"Retrieved {len(retrieved_results)} recent results from the US")
for i, result in enumerate(retrieved_results):
    print(f"\nResult {i + 1}:")
    print(f"  Text: {result.node.text[:200]}...")
    print("  Metadata:")
    for key, value in result.node.metadata.items():
        print(f"    {key}: {value}")
Now that we've seen how to customize the web data we want to retrieve, let's use an LLM to synthesize natural language answers from the search results. In this example, we'll use a model from Anthropic.
%pip install llama-index-llms-anthropic
import os
from getpass import getpass
# Set your Anthropic API key
anthropic_api_key = os.environ.get("ANTHROPIC_API_KEY") or getpass(
    "Enter your Anthropic API key: "
)
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.llms.anthropic import Anthropic
from llama_index.core import Settings
from llama_index.retrievers.you import YouRetriever
# Configure Anthropic as your LLM
llm = Anthropic(model="claude-haiku-4-5-20251001", api_key=anthropic_api_key)
# Create a query engine that uses You.com search results as context
retriever = YouRetriever(api_key=you_api_key)
query_engine = RetrieverQueryEngine.from_args(retriever, llm=llm)
# The query engine:
# 1. Uses the retriever to fetch relevant search results from You.com
# 2. Passes those results as context to the LLM
# 3. Returns a synthesized answer
response = query_engine.query(
"What are the most visited national parks in the US and why? keep it brief."
)
# Try a different query
# response = query_engine.query("What are the latest geopolitical updates from India")
print(str(response))
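The response object also keeps the retrieved evidence in `response.source_nodes`, so you can show citations alongside the synthesized answer. A sketch (the formatting helper is ours):

```python
def format_sources(source_nodes, max_chars=120):
    # One numbered line per supporting NodeWithScore, with the text clipped
    # and newlines flattened so each source fits on a single line.
    lines = []
    for i, node_with_score in enumerate(source_nodes, start=1):
        snippet = node_with_score.node.text[:max_chars].replace("\n", " ")
        lines.append(f"[{i}] {snippet}")
    return "\n".join(lines)


# print(format_sources(response.source_nodes))
```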
The retriever converts You.com's JSON response into LlamaIndex's standard NodeWithScore format: each result's text becomes the node content, and fields such as source_type are preserved in the node's metadata dict. This abstraction lets you focus on building applications rather than handling API-specific response formats.
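For logging or export, the NodeWithScore objects flatten naturally into plain dicts (the helper name `to_records` is ours):

```python
def to_records(retrieved_results, text_chars=200):
    # Flatten each NodeWithScore into a dict: clipped text, relevance
    # score, and all metadata fields merged in.
    return [
        {
            "text": result.node.text[:text_chars],
            "score": result.score,
            **result.node.metadata,
        }
        for result in retrieved_results
    ]


# records = to_records(retrieved_results)  # e.g. feed into pandas.DataFrame
```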