Agent provider conformance matrix

internal/docs/AGENT_CONFORMANCE.md

go test ./... includes TestAgentProviderConformanceMatrix, a shared agent scenario that runs against every registered chat provider. The scenario asks an agent to call a deterministic local tool, verifies the tool receives ai.RunInfo, and checks the final response carries the conformance marker. A fake provider path runs on every machine without network access so CI always exercises the harness.

Live providers are opt-in to avoid flaky unauthenticated PR checks and accidental API spend. To run the live matrix, set GO_MICRO_AGENT_CONFORMANCE_LIVE=1 plus the provider API keys you want to exercise:

| Provider | Required API key | Optional model override |
| --- | --- | --- |
| OpenAI | OPENAI_API_KEY | GO_MICRO_CONFORMANCE_OPENAI_MODEL |
| Anthropic | ANTHROPIC_API_KEY | GO_MICRO_CONFORMANCE_ANTHROPIC_MODEL |
| Atlas Cloud | ATLASCLOUD_API_KEY | GO_MICRO_CONFORMANCE_ATLASCLOUD_MODEL |
| Gemini | GEMINI_API_KEY | GO_MICRO_CONFORMANCE_GEMINI_MODEL |
| Groq | GROQ_API_KEY | GO_MICRO_CONFORMANCE_GROQ_MODEL |
| Mistral | MISTRAL_API_KEY | GO_MICRO_CONFORMANCE_MISTRAL_MODEL |
| Together | TOGETHER_API_KEY | GO_MICRO_CONFORMANCE_TOGETHER_MODEL |
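As a rough illustration of how the variables in the table might be consumed, the sketch below pairs each provider with its key and override variable and falls back to a default model when no override is set. The `providerEnv` type, the `resolveModel` helper, and the `"provider-default"` placeholder are assumptions made for this example; the real harness may resolve models differently.

```go
package main

import (
	"fmt"
	"os"
)

// providerEnv pairs a provider with the environment variables from the table.
// The type itself is hypothetical.
type providerEnv struct {
	Name     string
	KeyVar   string
	ModelVar string
}

var providers = []providerEnv{
	{"OpenAI", "OPENAI_API_KEY", "GO_MICRO_CONFORMANCE_OPENAI_MODEL"},
	{"Anthropic", "ANTHROPIC_API_KEY", "GO_MICRO_CONFORMANCE_ANTHROPIC_MODEL"},
	{"Atlas Cloud", "ATLASCLOUD_API_KEY", "GO_MICRO_CONFORMANCE_ATLASCLOUD_MODEL"},
	{"Gemini", "GEMINI_API_KEY", "GO_MICRO_CONFORMANCE_GEMINI_MODEL"},
	{"Groq", "GROQ_API_KEY", "GO_MICRO_CONFORMANCE_GROQ_MODEL"},
	{"Mistral", "MISTRAL_API_KEY", "GO_MICRO_CONFORMANCE_MISTRAL_MODEL"},
	{"Together", "TOGETHER_API_KEY", "GO_MICRO_CONFORMANCE_TOGETHER_MODEL"},
}

// resolveModel returns the override when it is set, otherwise the fallback.
// getenv is injected so the lookup can be exercised without touching the
// process environment.
func resolveModel(getenv func(string) string, p providerEnv, fallback string) string {
	if m := getenv(p.ModelVar); m != "" {
		return m
	}
	return fallback
}

func main() {
	for _, p := range providers {
		hasKey := os.Getenv(p.KeyVar) != ""
		// "provider-default" is a placeholder, not a real model name.
		model := resolveModel(os.Getenv, p, "provider-default")
		fmt.Printf("%-12s key set: %-5t model: %s\n", p.Name, hasKey, model)
	}
}
```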

When GO_MICRO_AGENT_CONFORMANCE_LIVE or a provider key is absent, the live provider subtest reports a deterministic skip. When both are present, a provider failure is a real test failure because drift in chat, tool calling, run metadata, or final-answer behavior means the services → agents lifecycle is no longer consistent across providers.
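The skip-or-run rule described above reduces to a two-step check. The following is a minimal sketch, assuming the flag is compared against the literal `1` (the real harness may accept other values); `liveDecision` is a hypothetical helper, not a function from the repository.

```go
package main

import (
	"fmt"
	"os"
)

const liveFlag = "GO_MICRO_AGENT_CONFORMANCE_LIVE"

// liveDecision reports whether a live subtest should run and, if not, a
// stable reason suitable for a skip message. Hypothetical helper.
func liveDecision(getenv func(string) string, keyVar string) (bool, string) {
	if getenv(liveFlag) != "1" {
		return false, liveFlag + " is not set to 1"
	}
	if getenv(keyVar) == "" {
		return false, keyVar + " is not set"
	}
	return true, ""
}

func main() {
	for _, keyVar := range []string{"OPENAI_API_KEY", "ANTHROPIC_API_KEY"} {
		ok, reason := liveDecision(os.Getenv, keyVar)
		if !ok {
			fmt.Printf("skip %s: %s\n", keyVar, reason)
			continue
		}
		fmt.Printf("run live subtest gated by %s\n", keyVar)
	}
}
```

Inside a test, the same decision would typically feed `t.Skip(reason)` within a `t.Run` subtest, so a missing flag or key shows up as a skip and anything after the gate counts as a real pass or failure.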

The companion TestAgentProviderConformanceFakeError keeps provider error propagation covered locally without relying on external credentials.

Scheduled CI

The daily/manual Harness (E2E) workflow runs the same matrix with GO_MICRO_AGENT_CONFORMANCE_LIVE=1 and the provider secrets exported. Providers whose keys are absent still skip cleanly, while any configured provider must pass the shared tool-calling scenario. This keeps scheduled conformance key-gated: PR checks stay deterministic and no-key environments remain green, but maintained provider credentials exercise the live matrix regularly.