Assertion Helpers Reference

Complete guide to assertLlmObsSpanEvent() and mock matchers for validating LLMObs span events.

assertLlmObsSpanEvent

Main assertion function for validating LLMObs span structure. Only the fields you specify are checked — unspecified fields are ignored.

See the docstring in packages/dd-trace/test/llmobs/util.js for the full type signature and parameter details.
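
The partial-matching rule can be pictured with a small sketch. `assertSubset` below is a hypothetical helper written for illustration, not the implementation in util.js. It assumes only top-level fields are matched partially and that nested values are compared exactly; the real helper may behave differently.

```javascript
// Hypothetical illustration of partial matching; NOT the real assertLlmObsSpanEvent.
const { deepStrictEqual } = require('node:assert')

function assertSubset (actual, expected) {
  for (const [field, value] of Object.entries(expected)) {
    // Only fields present in `expected` are compared; everything else is skipped.
    deepStrictEqual(actual[field], value, `mismatch on field "${field}"`)
  }
}

// Made-up span event used only for this example.
const span = { spanKind: 'llm', modelName: 'gpt-4', metadata: { temperature: 0.7 } }

assertSubset(span, { spanKind: 'llm' }) // passes: modelName and metadata are not inspected
```

Listing a field with a different value, such as `spanKind: 'workflow'`, makes the sketch throw an AssertionError that names the field.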

Mock Matchers

Use these for non-deterministic values (output text, token counts, errors).

Matcher	Matches	Example Use Case
`MOCK_STRING`	Any non-empty string	Output message content (varies per run)
`MOCK_NOT_NULLISH`	Any value that is not null or undefined	Token counts (exist but vary)
`MOCK_NUMBER`	Any number	Specific numeric metrics
`MOCK_OBJECT`	Any object	Error objects

Usage:

javascript

const { assertLlmObsSpanEvent, MOCK_STRING, MOCK_NOT_NULLISH, MOCK_NUMBER, MOCK_OBJECT } = require('../../util')

assertLlmObsSpanEvent(span, {
  outputMessages: [{ content: MOCK_STRING, role: 'assistant' }],
  metrics: { input_tokens: MOCK_NOT_NULLISH }
})
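
To make the matcher table concrete, the following sketch models each matcher as a predicate. These predicates are hypothetical stand-ins written from the matcher names and descriptions, with "not nullish" assumed to mean neither null nor undefined. The real matchers are defined in util.js and may be implemented differently.

```javascript
// Hypothetical predicates that mirror the matcher descriptions; not the real matchers.
const matcherPredicates = {
  MOCK_STRING: value => typeof value === 'string' && value.length > 0,
  // Assumption: "not nullish" means neither null nor undefined, so 0 is accepted.
  MOCK_NOT_NULLISH: value => value !== null && value !== undefined,
  MOCK_NUMBER: value => typeof value === 'number',
  MOCK_OBJECT: value => typeof value === 'object' && value !== null
}

console.log(matcherPredicates.MOCK_STRING('Hello!'))          // true
console.log(matcherPredicates.MOCK_STRING(''))                // false
console.log(matcherPredicates.MOCK_NOT_NULLISH(0))            // true
console.log(matcherPredicates.MOCK_OBJECT(new Error('boom'))) // true
```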

Common Patterns

1. Basic Chat Completion

javascript

assertLlmObsSpanEvent(events[0], {
  spanKind: 'llm',
  name: 'openai.chat.completions',
  modelName: 'gpt-4',
  modelProvider: 'openai',
  inputMessages: [{ content: 'Hello', role: 'user' }],
  outputMessages: [{ content: MOCK_STRING, role: 'assistant' }],
  metrics: {
    input_tokens: MOCK_NOT_NULLISH,
    output_tokens: MOCK_NOT_NULLISH,
    total_tokens: MOCK_NOT_NULLISH
  },
  metadata: { temperature: 0.7 }
})

2. Multi-Turn Conversation

javascript

assertLlmObsSpanEvent(events[0], {
  spanKind: 'llm',
  inputMessages: [
    { content: 'Hello', role: 'user' },
    { content: 'Hi!', role: 'assistant' },
    { content: 'How are you?', role: 'user' }
  ],
  outputMessages: [{ content: MOCK_STRING, role: 'assistant' }]
})
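
Because input messages are under the test's control, one way to keep the expectation in sync with the request is to derive it from the same payload. The sketch below assumes an OpenAI-style `request` object and uses a hypothetical `toExpectedMessages` helper. A placeholder string stands in for MOCK_STRING so that the snippet runs on its own.

```javascript
// Stand-in so the sketch runs by itself; in a real test use MOCK_STRING from util.js.
const MOCK_STRING = '<any string>'

// Hypothetical request payload, shaped like an OpenAI chat completion call.
const request = {
  model: 'gpt-4',
  messages: [
    { role: 'user', content: 'Hello' },
    { role: 'assistant', content: 'Hi!' },
    { role: 'user', content: 'How are you?' }
  ]
}

// Hypothetical helper: keep only the two fields each expected message lists.
function toExpectedMessages (messages) {
  return messages.map(({ content, role }) => ({ content, role }))
}

const expected = {
  spanKind: 'llm',
  modelName: request.model,
  inputMessages: toExpectedMessages(request.messages),
  outputMessages: [{ content: MOCK_STRING, role: 'assistant' }]
}
```

`expected` can then be passed as the second argument to assertLlmObsSpanEvent, so a change to the request automatically updates the expectation.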

3. Workflow/Orchestration Span

javascript

assertLlmObsSpanEvent(events[0], {
  spanKind: 'workflow',  // Not 'llm'!
  name: 'langgraph.graph.invoke'
  // Workflows may not have inputMessages/outputMessages
})

4. Error Case

javascript

assertLlmObsSpanEvent(events[0], {
  spanKind: 'llm',
  outputMessages: [{ content: '', role: '' }],  // Empty on error
  error: MOCK_OBJECT
})

5. Partial Validation

Only specified fields are checked (others ignored):

javascript

assertLlmObsSpanEvent(events[0], {
  spanKind: 'llm',
  modelName: 'gpt-4'
  // inputMessages, outputMessages, metrics, metadata not validated
})

Best Practices

Use MOCK_* for non-deterministic values:
- Output text: MOCK_STRING (real responses vary)
- Token counts: MOCK_NOT_NULLISH (counts vary but should exist)
- Error objects: MOCK_OBJECT (error details vary)

Use exact values for inputs:
- Input messages: You control these in tests
- Model parameters: You set these (temperature, max_tokens)
- Model name: You specify this

Always validate core fields:
- spanKind (required for every span)
- name (operation identifier)
- modelName and modelProvider (for LLM spans)

Validate message format:
- Ensure {content: string, role: string} structure
- Check role values: 'user', 'assistant', 'system', 'tool'

Test error paths:
- Verify empty outputMessages: [{content: '', role: ''}] on errors
- Assert error field exists with MOCK_OBJECT

Match span kind to operation:
- Chat/completions → spanKind: 'llm'
- Workflow execution → spanKind: 'workflow'
- Agent runs → spanKind: 'agent'
- Tool calls → spanKind: 'tool'
- Embeddings → spanKind: 'embedding'
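
Two of the practices above, checking message shape and matching span kind to operation, lend themselves to small lookup helpers. The names `hasMessageShape`, `hasKnownRole`, and `SPAN_KIND_BY_OPERATION`, along with the operation labels used as keys, are hypothetical and only sketch the idea.

```javascript
// Roles listed in the best practices above.
const KNOWN_ROLES = new Set(['user', 'assistant', 'system', 'tool'])

// True when the message has the { content: string, role: string } structure.
function hasMessageShape (message) {
  return message != null &&
    typeof message.content === 'string' &&
    typeof message.role === 'string'
}

// True when the role is one of the known values. The error-path placeholder
// { content: '', role: '' } has the right shape but no known role.
function hasKnownRole (message) {
  return KNOWN_ROLES.has(message.role)
}

// Hypothetical operation labels mapped to the span kinds listed above.
const SPAN_KIND_BY_OPERATION = {
  chat: 'llm',
  completion: 'llm',
  workflow: 'workflow',
  agent: 'agent',
  tool: 'tool',
  embedding: 'embedding'
}

console.log(hasMessageShape({ content: 'Hello', role: 'user' })) // true
console.log(hasKnownRole({ content: '', role: '' }))             // false
console.log(SPAN_KIND_BY_OPERATION.embedding)                    // embedding
```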

Reference Test Implementation

For a complete, real-world example of how tests using these helpers are structured, see:

packages/datadog-plugin-anthropic/test/llmobs.spec.js (LLM_CLIENT / MULTI_PROVIDER pattern)
packages/datadog-plugin-google-genai/test/llmobs.spec.js (LLM_CLIENT pattern)
packages/dd-trace/test/llmobs/plugins/langgraph/index.spec.js (ORCHESTRATION pattern)

Field Reference Quick Lookup

Required:

spanKind - Always required

LLM Spans:

name, modelName, modelProvider, inputMessages, outputMessages, metrics, metadata

Workflow Spans:

name (may not have messages/metrics)

Agent Spans:

name (may have messages for agent I/O)

Tool Spans:

name (may have input/output for tool calls)

Embedding Spans:

name, modelName, modelProvider, metrics (input/output token counts)

Retrieval Spans:

name, metadata (query, results count, etc.)
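
The lookup above can also be kept as data, for example to spot which typical fields a test has not asserted yet. This is only a sketch: `TYPICAL_FIELDS` and `unassertedFields` are hypothetical names, and the lists restate the quick lookup rather than any schema enforced by the library.

```javascript
// Typical fields per span kind, copied from the quick lookup (spanKind itself is always required).
const TYPICAL_FIELDS = {
  llm: ['name', 'modelName', 'modelProvider', 'inputMessages', 'outputMessages', 'metrics', 'metadata'],
  workflow: ['name'],
  agent: ['name'],
  tool: ['name'],
  embedding: ['name', 'modelName', 'modelProvider', 'metrics'],
  retrieval: ['name', 'metadata']
}

// Hypothetical helper: list the typical fields that an expected object leaves unchecked.
function unassertedFields (expected) {
  const fields = TYPICAL_FIELDS[expected.spanKind] ?? []
  return fields.filter(field => !(field in expected))
}

console.log(unassertedFields({ spanKind: 'embedding', name: 'embed', metrics: {} }))
// [ 'modelName', 'modelProvider' ]
```

Since unspecified fields are ignored by the assertion, a helper like this is purely advisory; leaving fields out is a valid choice for partial validation.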