Observability for OpenAI Agents with Opik

OpenAI released an agentic framework aptly named Agents. What sets this framework apart from others is that it provides a rich set of core building blocks:

Models: Support for all OpenAI Models
Tools: Function calling similar to what is available when using the OpenAI models directly
Knowledge and Memory: Seamless integration with OpenAI's vector store and Embeddings APIs
Guardrails: Run guardrail checks in parallel with your agent execution, which allows for secure execution without slowing down the overall run.

Opik's integration with Agents takes only a couple of lines of code and lets you analyse and debug the agent execution flow in our open-source platform.

Account Setup

Comet provides a hosted version of the Opik platform; simply create an account and grab your API key.

You can also run the Opik platform locally, see the installation guide for more information.

Getting Started

Installation

First, ensure you have both the opik and openai-agents packages installed:

bash

pip install opik openai-agents

Configuring Opik

Configure the Opik Python SDK for your deployment type. See the Python SDK Configuration guide for detailed instructions on:

CLI configuration: opik configure
Code configuration: opik.configure()
Self-hosted vs Cloud vs Enterprise setup
Configuration files and environment variables
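
The environment-variable route listed above can be sketched as follows. This is a minimal illustration, not the SDK's own configuration logic: the helper `opik_env` is hypothetical, the local URL `http://localhost:5173/api` is an assumption about a default self-hosted install, and the variable names (`OPIK_API_KEY`, `OPIK_WORKSPACE`, `OPIK_URL_OVERRIDE`) should be checked against the Python SDK Configuration guide for your version.

```python
import os


def opik_env(api_key=None, workspace=None, url_override=None):
    """Build the environment variables for an Opik deployment (hypothetical helper).

    Only values that are actually provided are included, so a self-hosted
    setup can omit the API key and a cloud setup can omit the URL override.
    """
    candidates = {
        "OPIK_API_KEY": api_key,
        "OPIK_WORKSPACE": workspace,
        "OPIK_URL_OVERRIDE": url_override,
    }
    return {name: value for name, value in candidates.items() if value}


# Cloud-style setup: API key plus workspace (placeholder values).
cloud_env = opik_env(api_key="YOUR_OPIK_API_KEY", workspace="YOUR_WORKSPACE")

# Self-hosted setup: only the URL of the local instance (assumed default).
local_env = opik_env(url_override="http://localhost:5173/api")

# Apply one of them before importing or using the Opik SDK.
os.environ.update(local_env)
```

Setting the variables in the process environment keeps credentials out of the source file; `opik configure` or `opik.configure()` remain the documented alternatives.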

Configuring OpenAI Agents

In order to use OpenAI Agents, you will need to configure your OpenAI API key. You can find or create API keys in your OpenAI platform account.

You can set it as an environment variable:

bash

export OPENAI_API_KEY="YOUR_API_KEY"

Or set it programmatically:

python

import os
import getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")
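
In non-interactive environments such as CI, an interactive prompt is not available, so it can help to fail fast when the key is missing. The sketch below is one possible way to do that; `missing_env_vars` and `require_env_vars` are hypothetical helpers, not part of Opik or the Agents SDK.

```python
import os


def missing_env_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def require_env_vars(required, env=None):
    """Raise a clear error instead of prompting when a variable is missing."""
    missing = missing_env_vars(required, env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))


# Explicit mappings are used here so the example does not depend on the real environment.
print(missing_env_vars(["OPENAI_API_KEY"], env={"OPENAI_API_KEY": "YOUR_API_KEY"}))  # prints []
print(missing_env_vars(["OPENAI_API_KEY"], env={}))  # prints ['OPENAI_API_KEY']
```

Calling `require_env_vars(["OPENAI_API_KEY"])` at the top of a script surfaces a missing key before any agent run starts.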

Enabling logging to Opik

To enable logging to Opik, simply add the following two lines of code to your existing OpenAI Agents code:

python

import os
from agents import Agent, Runner
from agents import set_trace_processors
from opik.integrations.openai.agents import OpikTracingProcessor

# Set project name for better organization
os.environ["OPIK_PROJECT_NAME"] = "openai-agents-demo"

set_trace_processors(processors=[OpikTracingProcessor()])

agent = Agent(name="Assistant", instructions="You are a helpful assistant")

result = Runner.run_sync(agent, "Write a haiku about recursion in programming.")
print(result.final_output)

<Tip> The Opik integration will automatically track both the token usage and overall cost of each LLM call that is being made. You can also view this information aggregated for the entire agent execution. </Tip>

Example: Agents with Function Tools

You can create agents with custom function tools. The OpikTracingProcessor automatically captures all tool calls as well:

python

from agents import Agent, Runner, function_tool, set_trace_processors
from opik.integrations.openai.agents import OpikTracingProcessor

set_trace_processors(processors=[OpikTracingProcessor()])

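@function_tool
def calculate_average(numbers: list[float]) -> float:
    """Calculate the arithmetic mean of a list of numbers."""
    # Guard against an empty list, which would otherwise raise ZeroDivisionError
    if not numbers:
        raise ValueError("numbers must not be empty")
    return sum(numbers) / len(numbers)

@function_tool
def get_recommendation(topic: str, user_level: str) -> str:
    """Return a learning recommendation for a topic and experience level."""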
    recommendations = {
        "python": {
            "beginner": "Start with Python.org's tutorial, then try Python Crash Course book. Practice with simple scripts and built-in functions.",
            "intermediate": "Explore frameworks like Flask/Django, learn about decorators, context managers, and dive into Python's data structures.",
            "advanced": "Study Python internals, contribute to open source, learn about metaclasses, and explore performance optimization."
        },
        "machine learning": {
            "beginner": "Start with Andrew Ng's Coursera course, learn basic statistics, and try scikit-learn with simple datasets.",
            "intermediate": "Dive into deep learning with TensorFlow/PyTorch, study different algorithms, and work on real projects.",
            "advanced": "Research latest papers, implement algorithms from scratch, and contribute to ML frameworks."
        }
    }
    
    topic_lower = topic.lower()
    level_lower = user_level.lower()
    
    if topic_lower in recommendations and level_lower in recommendations[topic_lower]:
        return recommendations[topic_lower][level_lower]
    else:
        return f"For {topic} at {user_level} level: Focus on fundamentals, practice regularly, and build projects to apply your knowledge."

def create_advanced_agent():
    """Create an advanced agent with tools and comprehensive instructions."""
    instructions = """
    You are an expert programming tutor and learning advisor. You have access to tools that help you:
    1. Calculate averages for performance metrics, grades, or other numerical data
    2. Provide personalized learning recommendations based on topics and user experience levels
    
    Your role:
    - Help users learn programming concepts effectively
    - Provide clear, beginner-friendly explanations when needed
    - Use your tools when appropriate to give concrete help
    - Offer structured learning paths and resources
    - Be encouraging and supportive
    
    When users ask about:
    - Programming languages: Use get_recommendation to provide tailored advice
    - Performance or scores: Use calculate_average if numbers are involved
    - Learning paths: Combine your knowledge with tool-based recommendations
    
    Always explain your reasoning and make your responses educational.
    """
    
    return Agent(
        name="AdvancedProgrammingTutor",
        instructions=instructions,
        model="gpt-4o-mini",
        tools=[calculate_average, get_recommendation]
    )

# Create and use the advanced agent
advanced_agent = create_advanced_agent()

# Example queries
queries = [
    "I'm new to Python programming. Can you tell me about it?",
    "I got these test scores: 85, 92, 78, 96, 88. What's my average and how am I doing?",
    "I know some Python basics but want to learn machine learning. What should I do next?",
]

for i, query in enumerate(queries, 1):
    print(f"\n📝 Query {i}: {query}")
    result = Runner.run_sync(advanced_agent, query)
    print(f"🤖 Response: {result.final_output}")
    print("=" * 80)
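
Because tool bodies are ordinary Python, the lookup-with-fallback logic used by `get_recommendation` can be checked on its own, without calling a model. The sketch below is a simplified stand-in under stated assumptions: `CATALOG` and `lookup_recommendation` are hypothetical names, the catalog is shortened, and the extra whitespace stripping is an addition not present in the tool above.

```python
# Hypothetical catalog, shaped like the nested dict inside get_recommendation.
CATALOG = {
    "python": {"beginner": "Start with the official tutorial."},
}


def lookup_recommendation(topic, user_level, catalog=CATALOG):
    """Return the catalog entry for (topic, level), or a generic fallback."""
    by_level = catalog.get(topic.strip().lower(), {})
    fallback = f"For {topic} at {user_level} level: focus on fundamentals."
    return by_level.get(user_level.strip().lower(), fallback)


# Case and surrounding whitespace are normalized before the lookup.
assert lookup_recommendation(" Python ", "BEGINNER") == "Start with the official tutorial."
# Unknown topics fall through to the generic message.
assert lookup_recommendation("Rust", "beginner").startswith("For Rust at beginner level")
```

Once the plain function behaves as expected, it can be wrapped with `@function_tool` and passed to the agent in the same way as the tools above.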

Adding granularity with the `@track` decorator

If you need more visibility into what happens inside your tool functions, you can use the @track decorator to trace specific steps within the tool execution:

python

from agents import Agent, Runner, function_tool, set_trace_processors
from opik.integrations.openai.agents import OpikTracingProcessor
from opik import track

set_trace_processors(processors=[OpikTracingProcessor()])

@track(name="fetch_user_data")
def fetch_user_data(user_id: str) -> dict:
    # This step will be traced separately
    return {"user_id": user_id, "preferences": ["python", "ml"]}

@track(name="generate_recommendations")
def generate_recommendations(preferences: list) -> str:
    # This step will also be traced separately
    return f"Based on your interests in {', '.join(preferences)}, we recommend..."

@function_tool
def get_personalized_advice(user_id: str) -> str:
    """Get personalized learning advice for a user."""
    # Each tracked function call inside the tool will appear as a separate span
    user_data = fetch_user_data(user_id)
    recommendations = generate_recommendations(user_data["preferences"])
    return recommendations

agent = Agent(
    name="PersonalizedTutor",
    instructions="Help users with personalized learning advice.",
    model="gpt-4o-mini",
    tools=[get_personalized_advice]
)

result = Runner.run_sync(agent, "Give me learning advice for user_123")
print(result.final_output)

Logging threads

When you run multi-turn conversations with OpenAI Agents using the OpenAI Agents trace API, the Opik integration automatically uses the trace group_id as the Thread ID, so you can easily review the conversation inside Opik. Here is an example:

python

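import asyncio
import uuid

from agents import Agent, Runner, set_trace_processors, trace
from opik.integrations.openai.agents import OpikTracingProcessor

set_trace_processors(processors=[OpikTracingProcessor()])

async def main():
    agent = Agent(name="Assistant", instructions="Reply very concisely.")

    # One ID shared by every turn; Opik uses it as the Thread ID
    thread_id = str(uuid.uuid4())

    with trace(workflow_name="Conversation", group_id=thread_id):
        # First turn
        result = await Runner.run(agent, "What city is the Golden Gate Bridge in?")
        print(result.final_output)
        # San Francisco

        # Second turn: previous history plus the new user message
        new_input = result.to_input_list() + [{"role": "user", "content": "What state is it in?"}]
        result = await Runner.run(agent, new_input)
        print(result.final_output)
        # California

if __name__ == "__main__":
    asyncio.run(main())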
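
The two mechanics in this example, one shared identifier for all turns and a follow-up input built from the previous history plus a new user message, can be sketched with plain data. `new_thread_id` and `append_user_turn` are hypothetical helpers, and the `history` list is a simplified, assumed stand-in for what `result.to_input_list()` returns.

```python
import uuid


def new_thread_id():
    """Generate a fresh identifier to pass as the trace group_id."""
    return str(uuid.uuid4())


def append_user_turn(history, text):
    """Return a new input list: prior items followed by one user message."""
    return list(history) + [{"role": "user", "content": text}]


# Simplified stand-in for result.to_input_list() after the first turn.
history = [
    {"role": "user", "content": "What city is the Golden Gate Bridge in?"},
    {"role": "assistant", "content": "San Francisco"},
]

thread_id = new_thread_id()
next_input = append_user_turn(history, "What state is it in?")
print(len(next_input))         # prints 3
print(next_input[-1]["role"])  # prints user
```

Building a new list rather than mutating `history` keeps each turn's input explicit, and reusing the same `thread_id` for every `trace(...)` call is what groups the turns into one thread.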

Further improvements

OpenAI Agents is still a relatively new framework and we are working on a few improvements:

Improving the rendering of inputs and outputs for LLM calls as part of our Pretty Mode functionality
Improving the naming conventions for spans
Adding the agent execution input and output at the trace level

If there are any additional improvements you would like us to make, feel free to open an issue on our GitHub repository.
