examples/integration-google-adk/README.md
This example shows how to evaluate the Python Google Agent Development Kit (ADK) in Promptfoo with native ADK tracing.
It demonstrates:
- an in-process Python provider in place of an adk api_server wrapper
- a workflow-agent eval with SequentialAgent

npx promptfoo@latest init --example integration-google-adk
cd integration-google-adk
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
export GOOGLE_API_KEY=your_google_api_key_here
npx promptfoo@latest eval -c promptfooconfig.yaml --no-cache
npx promptfoo@latest eval -c promptfooconfig.workflow.yaml --no-cache
npx promptfoo@latest view
The default model is gemini-2.5-flash. To use another ADK-supported model, set ADK_MODEL before running the eval. Provider-style model strings such as openai/gpt-5.4-mini require the optional ADK extensions:
pip install 'google-adk[extensions]>=1.32.0,<2'
export ADK_MODEL=openai/gpt-5.4-mini
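As a rough illustration of the model selection described above, the sketch below shows one way an agent module might read ADK_MODEL and fall back to gemini-2.5-flash. The function name `resolve_model` and the "/" check for provider-style strings are assumptions made for this sketch, not code taken from agent.py.

```python
import os

# Default named in this README; used when ADK_MODEL is unset or blank.
DEFAULT_MODEL = "gemini-2.5-flash"


def resolve_model(env=None):
    """Return (model, needs_extensions) for the current environment.

    `env` is any mapping of environment variables; it defaults to os.environ.
    Treating a "/" in the name as a provider-style string is a heuristic
    assumed for this sketch.
    """
    env = os.environ if env is None else env
    model = (env.get("ADK_MODEL") or "").strip() or DEFAULT_MODEL
    needs_extensions = "/" in model
    return model, needs_extensions


if __name__ == "__main__":
    model, needs_extensions = resolve_model()
    print(f"model={model} needs_extensions={needs_extensions}")
```

Passing the environment as a mapping keeps the helper easy to test without touching the real process environment.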
If Promptfoo is launched outside the activated virtual environment, point the Python provider at it explicitly:
PROMPTFOO_PYTHON=.venv/bin/python npx promptfoo@latest eval -c promptfooconfig.yaml --no-cache
The example contains these files:

- agent.py: ADK app builders, tools, callback, plugin, and workflow agent graph
- provider.py: Promptfoo Python provider plus ADK-to-Promptfoo trace propagation
- provider_test.py: focused tests for provider helpers
- promptfooconfig.yaml: conversational multi-turn eval with state, artifacts, and trajectory assertions
- promptfooconfig.workflow.yaml: workflow-agent eval with SequentialAgent
- requirements.txt: Python dependencies

The main config turns one Promptfoo row into a small multi-turn task.
The provider returns the user-visible answer plus an inspection payload:
- session_state from ADK state
- artifact_names and artifacts from InMemoryArtifactService
- plugin_events recorded by an ADK BasePlugin
- event_count from the ADK session

The eval asserts that:
- the agent calls get_weather and save_trip_note
- the trace contains invoke_agent, call_llm, and execute_tool spans

ADK 1.x already emits OpenTelemetry spans for the important framework steps:
- invocation
- invoke_agent <name>
- call_llm
- execute_tool <name>

provider.py keeps those spans inside Promptfoo's trace by using the traceparent from the Promptfoo Python provider context.

Because ADK records gen_ai.tool.name and tool-call arguments, Promptfoo can normalize those spans into trajectory:* assertions without a custom SDK span converter.
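To make the traceparent step more concrete, here is a minimal, standard-library-only sketch of validating a W3C Trace Context header before handing it to a tracer. It assumes the provider context is a dict carrying a `traceparent` key; the names `ParentSpan` and `parse_traceparent` are hypothetical and do not come from provider.py.

```python
import re
from typing import NamedTuple, Optional

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class ParentSpan(NamedTuple):
    trace_id: str
    span_id: str
    sampled: bool


def parse_traceparent(context: Optional[dict]) -> Optional[ParentSpan]:
    """Return the parent span described by context["traceparent"], if valid."""
    header = (context or {}).get("traceparent")
    if not isinstance(header, str):
        return None
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version "ff" and all-zero ids are invalid per the specification.
    if version == "ff" or set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    # Bit 0 of the flags byte is the "sampled" flag.
    return ParentSpan(trace_id, span_id, bool(int(flags, 16) & 0x01))
```

In a real provider, the OpenTelemetry Python propagator `TraceContextTextMapPropagator` can extract the same header into a context object, so a hand-written parser like this is mainly useful for validation and tests.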
After an eval, open the Trace Timeline for the row and inspect:
- invoke_agent weather_agent
- call_llm
- execute_tool get_weather
- execute_tool save_trip_note
- gen_ai.tool.name
- gcp.vertex.agent.tool_call_args

The older HTTP shape around adk api_server is fine when you need to test a deployed service boundary, but it hides useful framework details from Promptfoo. The in-process provider is the better default when you want Promptfoo to see those details.
Use an HTTP provider when the deployed API itself is what you want to validate.
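For reference, the answer-plus-inspection-payload shape returned by the in-process provider could look roughly like the sketch below. Promptfoo Python providers return a dict with an `output` key; placing the inspection fields under `metadata` is an assumption of this sketch, as are the function name `build_provider_response` and the sample values in the usage example.

```python
from typing import Any, Dict, List


def build_provider_response(
    answer: str,
    session_state: Dict[str, Any],
    artifacts: Dict[str, Any],
    plugin_events: List[Dict[str, Any]],
    event_count: int,
) -> Dict[str, Any]:
    """Bundle the user-visible answer with the fields used for inspection."""
    return {
        "output": answer,
        # Assumed location for the inspection payload.
        "metadata": {
            "session_state": dict(session_state),
            "artifact_names": sorted(artifacts),
            "artifacts": dict(artifacts),
            "plugin_events": list(plugin_events),
            "event_count": event_count,
        },
    }


if __name__ == "__main__":
    # Made-up values, for illustration only.
    response = build_provider_response(
        answer="Pack a light jacket.",
        session_state={"city": "Lisbon"},
        artifacts={"trip_note.txt": "Lisbon: mild and breezy"},
        plugin_events=[{"event": "before_tool", "tool": "get_weather"}],
        event_count=6,
    )
    print(response["metadata"]["artifact_names"])
```

Copying the inputs into fresh containers keeps later mutations of the session from changing a response that has already been returned.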