Simulation

The Opik simulation module provides tools for creating multi-turn conversation simulations between simulated users and your applications. This is particularly useful for evaluating agent behavior over multiple conversation turns.

.. toctree:: :maxdepth: 1

SimulatedUser run_simulation

Overview

Multi-turn simulation allows you to:

Simulate realistic user interactions with your agent over multiple conversation turns
Generate context-aware user responses based on conversation history
Evaluate agent behavior across extended conversations
Test different user personas and scenarios systematically

Key Components

SimulatedUser: A class that generates realistic user responses using LLMs or predefined responses.

run_simulation: A function that orchestrates multi-turn conversations between a simulated user and your application.

Basic Usage

Here's a simple example of how to use the simulation module:

.. code-block:: python

from opik.simulation import SimulatedUser, run_simulation from opik import track

Create a simulated user

user_simulator = SimulatedUser( persona="You are a frustrated customer who wants a refund", model="openai/gpt-5-nano" )

Define your agent

@track def my_agent(user_message: str, *, thread_id: str, **kwargs): # Your agent logic here return {"role": "assistant", "content": "I can help you with that..."}

Run the simulation

simulation = run_simulation( app=my_agent, user_simulator=user_simulator, max_turns=5 )

print(f"Thread ID: {simulation['thread_id']}") print(f"Conversation: {simulation['conversation_history']}")

Integration with Evaluation

Simulations work seamlessly with Opik's evaluation framework:

.. code-block:: python

from opik.evaluation import evaluate_threads from opik.evaluation.metrics import ConversationThreadMetric

Run multiple simulations

simulations = [] for persona in ["frustrated_user", "happy_customer", "confused_user"]: simulator = SimulatedUser(persona=f"You are a {persona}") simulation = run_simulation( app=my_agent, user_simulator=simulator, max_turns=5 ) simulations.append(simulation)

Evaluate the threads

results = evaluate_threads( project_name="my_project", filter_string='tags contains "simulation"', metrics=[ConversationThreadMetric()] )

For more detailed examples and advanced usage patterns, see the individual component documentation.