.agents/skills/llmobs-testing/references/assertion-helpers.md
Complete guide to assertLlmObsSpanEvent() and mock matchers for validating LLMObs span events.
`assertLlmObsSpanEvent()` is the main assertion function for validating LLMObs span structure. Only the fields you specify are checked; unspecified fields are ignored.
See the docstring in packages/dd-trace/test/llmobs/util.js for the full type signature and parameter details.
Use the mock matchers for non-deterministic values (output text, token counts, errors).
| Matcher | Matches | Example Use Case |
|---|---|---|
| `MOCK_STRING` | Any non-empty string | Output message content (varies per run) |
| `MOCK_NOT_NULLISH` | Any truthy value | Token counts (exist but vary) |
| `MOCK_NUMBER` | Any number | Specific numeric metrics |
| `MOCK_OBJECT` | Any object | Error objects |


Usage:

    const { assertLlmObsSpanEvent, MOCK_STRING, MOCK_NOT_NULLISH, MOCK_NUMBER, MOCK_OBJECT } = require('../../util')

    // `span` is an LLMObs span event captured by the test
    assertLlmObsSpanEvent(span, {
      outputMessages: [{ content: MOCK_STRING, role: 'assistant' }],
      metrics: { input_tokens: MOCK_NOT_NULLISH }
    })


Basic LLM span:

    assertLlmObsSpanEvent(events[0], {
      spanKind: 'llm',
      name: 'openai.chat.completions',
      modelName: 'gpt-4',
      modelProvider: 'openai',
      inputMessages: [{ content: 'Hello', role: 'user' }],
      outputMessages: [{ content: MOCK_STRING, role: 'assistant' }],
      metrics: {
        input_tokens: MOCK_NOT_NULLISH,
        output_tokens: MOCK_NOT_NULLISH,
        total_tokens: MOCK_NOT_NULLISH
      },
      metadata: { temperature: 0.7 }
    })


Multi-turn conversation:

    assertLlmObsSpanEvent(events[0], {
      spanKind: 'llm',
      inputMessages: [
        { content: 'Hello', role: 'user' },
        { content: 'Hi!', role: 'assistant' },
        { content: 'How are you?', role: 'user' }
      ],
      outputMessages: [{ content: MOCK_STRING, role: 'assistant' }]
    })


Workflow span:

    assertLlmObsSpanEvent(events[0], {
      spanKind: 'workflow', // Not 'llm'!
      name: 'langgraph.graph.invoke'
      // Workflows may not have inputMessages/outputMessages
    })


Error span:

    assertLlmObsSpanEvent(events[0], {
      spanKind: 'llm',
      outputMessages: [{ content: '', role: '' }], // Empty on error
      error: MOCK_OBJECT
    })


Only specified fields are checked (others are ignored):

    assertLlmObsSpanEvent(events[0], {
      spanKind: 'llm',
      modelName: 'gpt-4'
      // inputMessages, outputMessages, metrics, metadata not validated
    })


Use `MOCK_*` matchers for non-deterministic values:

- `MOCK_STRING` for output text (real responses vary)
- `MOCK_NOT_NULLISH` for token counts (counts vary but should exist)
- `MOCK_OBJECT` for errors (error details vary)

Use exact values for inputs, such as the `inputMessages` content the test itself sends.

Always validate core fields:

- `spanKind` (required for every span)
- `name` (operation identifier)
- `modelName` and `modelProvider` (for LLM spans)

Validate message format:

- Each message has a `{content: string, role: string}` structure
- Roles are `'user'`, `'assistant'`, `'system'`, or `'tool'`

Test error paths:

- Expect `outputMessages: [{content: '', role: ''}]` on errors
- Check that the `error` field exists with `MOCK_OBJECT`

Match span kind to operation:

- `spanKind: 'llm'`
- `spanKind: 'workflow'`
- `spanKind: 'agent'`
- `spanKind: 'tool'`
- `spanKind: 'embedding'`

For a complete, real-world example of how tests using these helpers are structured, see:

- `packages/datadog-plugin-anthropic/test/llmobs.spec.js` (LLM_CLIENT / MULTI_PROVIDER pattern)
- `packages/datadog-plugin-google-genai/test/llmobs.spec.js` (LLM_CLIENT pattern)
- `packages/dd-trace/test/llmobs/plugins/langgraph/index.spec.js` (ORCHESTRATION pattern)

Required:

- `spanKind`: always required

LLM Spans:

- `name`, `modelName`, `modelProvider`, `inputMessages`, `outputMessages`, `metrics`, `metadata`

Workflow Spans:

- `name` (may not have messages/metrics)

Agent Spans:

- `name` (may have messages for agent I/O)

Tool Spans:

- `name` (may have input/output for tool calls)

Embedding Spans:

- `name`, `modelName`, `modelProvider`, `metrics` (input/output token counts)

Retrieval Spans:

- `name`, `metadata` (query, results count, etc.)