Confident AI is an LLM observability and evaluation platform for teams to build reliable AI applications in both development and production.
The `deepeval-ts` package integrates with the AI SDK's `experimental_telemetry` API to provide tracing, online evaluations, and session analytics.

To enable tracing, install `deepeval-ts`, configure your API key, and initialize a tracer using `configureAiSdkTracing`.
```bash
npm install deepeval-ts
```
Sign up or log in to Confident AI to get your API key, then set it as an environment variable:
```bash
CONFIDENT_API_KEY="YOUR-PROJECT-API-KEY"
```
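For local development, the same key can live in a `.env` file instead (this assumes your framework or a loader such as dotenv reads it; adjust to your setup):

```bash
# .env — assumes your setup loads this file at startup
CONFIDENT_API_KEY="YOUR-PROJECT-API-KEY"
```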
Import and call `configureAiSdkTracing` to create a tracer:

```ts
import { configureAiSdkTracing } from 'deepeval-ts';

const tracer = configureAiSdkTracing();
```
You can now pass the tracer object into the `experimental_telemetry` field of any AI SDK call to send your traces to the Confident AI platform.
Here are some examples of how to trace various AI SDK functions using Confident AI's tracer:
```ts
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { configureAiSdkTracing } from 'deepeval-ts';

const tracer = configureAiSdkTracing();

const { text } = await generateText({
  model: openai('gpt-4o'),
  prompt: 'What are LLMs?',
  experimental_telemetry: {
    isEnabled: true,
    tracer,
  },
});
```
```ts
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { configureAiSdkTracing } from 'deepeval-ts';

const tracer = configureAiSdkTracing();

const result = streamText({
  model: openai('gpt-4o'),
  prompt: 'Invent a new holiday and describe its traditions.',
  experimental_telemetry: {
    isEnabled: true,
    tracer,
  },
});

for await (const textPart of result.textStream) {
  console.log(textPart);
}
```
```ts
import { generateText, tool, stepCountIs } from 'ai';
import { openai } from '@ai-sdk/openai';
import { configureAiSdkTracing } from 'deepeval-ts';
import { z } from 'zod';

const tracer = configureAiSdkTracing();

const result = await generateText({
  model: openai('gpt-4o'),
  tools: {
    weather: tool({
      description: 'Get the weather in a location',
      inputSchema: z.object({
        location: z.string().describe('The location to get the weather for'),
      }),
      execute: async ({ location }) => ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
      }),
    }),
  },
  stopWhen: stepCountIs(5),
  prompt: 'What is the weather in San Francisco?',
  experimental_telemetry: {
    isEnabled: true,
    tracer,
  },
});
```
```ts
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { configureAiSdkTracing } from 'deepeval-ts';
import { z } from 'zod';

const tracer = configureAiSdkTracing();

const { object } = await generateObject({
  model: openai('gpt-4o'),
  schema: z.object({
    recipe: z.object({
      name: z.string(),
      ingredients: z.array(z.object({ name: z.string(), amount: z.string() })),
      steps: z.array(z.string()),
    }),
  }),
  prompt: 'Generate a lasagna recipe.',
  experimental_telemetry: {
    isEnabled: true,
    tracer,
  },
});
```
Running the snippets above generates traces that you can inspect on the Confident AI platform.
You can customize trace grouping and evaluation behavior by passing options to `configureAiSdkTracing`. This allows you to:

- attach attributes such as `name`, `threadId`, `userId`, and `environment` to your traces
- associate traces with a Confident AI prompt version
- run online evaluations against metric collections
You can pass attributes like `name`, `threadId`, `userId`, and `environment` to make it easier to find and filter your traces.
```ts
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { configureAiSdkTracing } from 'deepeval-ts';

const tracer = configureAiSdkTracing({
  name: 'AI SDK Confident AI Tracing',
  threadId: 'thread-123',
  userId: 'user-456',
  environment: 'production',
});

const { text } = await generateText({
  model: openai('gpt-4o'),
  prompt: 'How do you make the best coffee?',
  experimental_telemetry: {
    isEnabled: true,
    tracer,
  },
});
```
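In a server application these attributes usually vary per request. As a minimal sketch, a hypothetical helper (not part of `deepeval-ts`) could assemble the options object before it is handed to `configureAiSdkTracing`:

```ts
// Hypothetical helper: builds the per-request options object for
// configureAiSdkTracing. The field names mirror the example above.
interface TracingOptions {
  name: string;
  threadId: string;
  userId: string;
  environment: 'development' | 'production';
}

function tracingOptionsForRequest(
  userId: string,
  threadId: string,
  environment: 'development' | 'production' = 'development',
): TracingOptions {
  return {
    name: 'AI SDK Confident AI Tracing',
    threadId,
    userId,
    environment,
  };
}
```

You would then create the tracer with `configureAiSdkTracing(tracingOptionsForRequest(userId, threadId, 'production'))` at the start of each request.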
If you use Confident AI Prompt Management, you can associate traces with a specific prompt version by passing a `Prompt` object to `configureAiSdkTracing`.
```ts
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { configureAiSdkTracing, Prompt } from 'deepeval-ts';

const prompt = new Prompt({ alias: 'my-prompt-alias' });
await prompt.pull();

const tracer = configureAiSdkTracing({
  confidentPrompt: prompt,
});

const { text } = await generateText({
  model: openai('gpt-4o'),
  prompt: 'How do you make the best coffee?',
  experimental_telemetry: {
    isEnabled: true,
    tracer,
  },
});
```
Logging prompts allows you to monitor which prompts are running in production and which ones perform best over time.
<Note>
  Make sure to **`pull` the prompt** before passing it to
  `configureAiSdkTracing`. Without pulling first, the prompt version will not be
  visible on Confident AI.
</Note>

Confident AI supports automatic online evaluation of your traces through metric collections defined in your project. To enable online evaluations, pass the relevant metric collection names as `configureAiSdkTracing` options. Here's an example of how to attach metric collections to your traces:
```ts
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { configureAiSdkTracing } from 'deepeval-ts';

const tracer = configureAiSdkTracing({
  metricCollection: 'my-trace-metrics',
  llmMetricCollection: 'my-llm-metrics',
  toolMetricCollection: 'my-tool-metrics',
});

const { text } = await generateText({
  model: openai('gpt-4o'),
  prompt: 'How do you make the best coffee?',
  experimental_telemetry: {
    isEnabled: true,
    tracer,
  },
});
```
All incoming traces will now be evaluated automatically. Evaluation results are visible in the Confident AI Observatory alongside your traces.
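The options shown on this page can also be combined in a single `configureAiSdkTracing` call. The following configuration sketch reuses the placeholder attribute values and collection names from the examples above:

```ts
import { configureAiSdkTracing, Prompt } from 'deepeval-ts';

// Pull the prompt version first so it is visible on Confident AI.
const prompt = new Prompt({ alias: 'my-prompt-alias' });
await prompt.pull();

// Configuration sketch: attributes, prompt association, and metric
// collections combined in one tracer. All values are placeholders.
const tracer = configureAiSdkTracing({
  name: 'AI SDK Confident AI Tracing',
  threadId: 'thread-123',
  userId: 'user-456',
  environment: 'production',
  confidentPrompt: prompt,
  metricCollection: 'my-trace-metrics',
  llmMetricCollection: 'my-llm-metrics',
  toolMetricCollection: 'my-tool-metrics',
});
```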
You can find a more comprehensive guide on AI SDK tracing with `deepeval-ts` in the Confident AI documentation.