.. currentmodule:: opik.simulation

.. autofunction:: run_simulation

The ``run_simulation`` function orchestrates a multi-turn conversation between a simulated user and your application. It manages the conversation flow, groups the resulting traces under one thread ID, and returns the conversation history for evaluation.

.. code-block:: python

   run_simulation(
       app: Callable,
       user_simulator: SimulatedUser,
       initial_message: Optional[str] = None,
       max_turns: int = 5,
       thread_id: Optional[str] = None,
       project_name: Optional[str] = None,
       **app_kwargs: Any
   ) -> Dict[str, Any]

app (Callable)
   Your application function that processes messages. It must have the signature
   ``app(message: str, *, thread_id: str, **kwargs) -> Dict[str, str]``.
   The function is automatically decorated with ``@track`` if it is not already decorated.

user_simulator (SimulatedUser)
   Instance of ``SimulatedUser`` that generates user responses.

initial_message (str, optional)
   Optional initial message from the user. If ``None``, the simulator generates one.

max_turns (int, optional)
   Maximum number of conversation turns. Defaults to 5.

thread_id (str, optional)
   Thread ID for grouping traces. If ``None``, a new ID is generated.

project_name (str, optional)
   Project name for trace logging. Included in trace metadata.

app_kwargs (Any)
   Additional keyword arguments passed to the app function.

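
Dict[str, Any]
   Dictionary containing the simulation results. The examples on this page read its
   ``thread_id`` and ``conversation_history`` keys.
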

Your app function must follow this signature:

.. code-block:: python

   from typing import Dict

   def my_app(user_message: str, *, thread_id: str, **kwargs) -> Dict[str, str]:
       # Process the user message
       # Manage conversation history internally using thread_id
       # Return the assistant response as a message dict
       return {"role": "assistant", "content": "Your response"}

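Key Requirements:

- Accept the user message as the first argument and ``thread_id`` as a keyword-only argument.
- Manage conversation history internally, keyed by ``thread_id``.
- Return the assistant response as a message dictionary with ``role`` and ``content`` keys.
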
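
The return format can be checked before an app is handed to a simulation. The following is a minimal sketch, assuming a hypothetical helper named ``is_valid_message`` that is not part of ``opik``; it only verifies that a reply is a dict with string ``role`` and ``content`` values.

```python
from typing import Any, Dict


def is_valid_message(message: Any) -> bool:
    """Return True if `message` is a dict with string 'role' and 'content' values.

    Hypothetical helper for illustration; it is not part of the opik API.
    """
    if not isinstance(message, dict):
        return False
    return all(isinstance(message.get(key), str) for key in ("role", "content"))


def my_app(user_message: str, *, thread_id: str, **kwargs) -> Dict[str, str]:
    # Stand-in app that follows the signature described above.
    return {"role": "assistant", "content": f"You said: {user_message}"}


reply = my_app("hello", thread_id="t-1")
print(is_valid_message(reply))
print(is_valid_message({"role": "assistant"}))  # missing 'content'
```

A check like this fits naturally in a unit test for the app, independent of any simulation run.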

Basic Usage
~~~~~~~~~~~

.. code-block:: python

   from opik.simulation import SimulatedUser, run_simulation
   from opik import track

   # Create a simulated user
   user_simulator = SimulatedUser(
       persona="You are a customer who wants help with a product",
       model="openai/gpt-5-nano"
   )

   # Define your agent with conversation history management
   agent_history = {}

   @track
   def customer_service_agent(user_message: str, *, thread_id: str, **kwargs):
       if thread_id not in agent_history:
           agent_history[thread_id] = []

       # Add user message to history
       agent_history[thread_id].append({"role": "user", "content": user_message})

       # Process with full conversation context
       messages = agent_history[thread_id]

       # Your agent logic here (e.g., call LLM)
       response = "I can help you with that. What specific issue are you experiencing?"

       # Add assistant response to history
       agent_history[thread_id].append({"role": "assistant", "content": response})

       return {"role": "assistant", "content": response}

   # Run the simulation
   simulation = run_simulation(
       app=customer_service_agent,
       user_simulator=user_simulator,
       max_turns=5,
       project_name="customer_service_evaluation"
   )

   print(f"Thread ID: {simulation['thread_id']}")
   print(f"Conversation length: {len(simulation['conversation_history'])}")

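
Once a simulation finishes, the returned history can be rendered as a readable transcript. This is a small sketch, assuming each entry is a message dictionary with ``role`` and ``content`` keys; ``format_transcript`` and ``sample_history`` are hypothetical names, and the hand-written list stands in for a real ``simulation['conversation_history']``.

```python
from typing import Dict, List


def format_transcript(history: List[Dict[str, str]]) -> str:
    """Render a list of message dicts as one 'role: content' line per message."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in history)


# Hand-written stand-in for simulation['conversation_history'].
sample_history = [
    {"role": "user", "content": "My order has not arrived."},
    {"role": "assistant", "content": "I can help you with that."},
]

print(format_transcript(sample_history))
```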

Custom Initial Message
~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   simulation = run_simulation(
       app=customer_service_agent,
       user_simulator=user_simulator,
       initial_message="I'm having trouble with my order",
       max_turns=3
   )


Custom Thread ID
~~~~~~~~~~~~~~~~

.. code-block:: python

   # Use a custom thread ID for easier tracking
   custom_thread_id = "simulation_test_001"

   simulation = run_simulation(
       app=customer_service_agent,
       user_simulator=user_simulator,
       thread_id=custom_thread_id,
       max_turns=5
   )

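
When many runs share a project, a hard-coded thread ID is easy to reuse by accident. One possible approach is to combine a readable prefix with a random suffix; ``make_thread_id`` below is a hypothetical helper built on the standard ``uuid`` module, not something ``opik`` provides.

```python
import uuid


def make_thread_id(prefix: str = "simulation") -> str:
    """Build a readable thread ID such as 'simulation_1a2b3c4d'.

    Hypothetical helper; the 8-character random suffix makes collisions unlikely.
    """
    return f"{prefix}_{uuid.uuid4().hex[:8]}"


first_id = make_thread_id("refund_flow")
second_id = make_thread_id("refund_flow")
print(first_id, second_id)
```

The resulting string can be passed as the ``thread_id`` argument shown above.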

Multiple Simulations
~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   personas = [
       "You are a frustrated customer who wants a refund",
       "You are a happy customer who wants to buy more",
       "You are a confused user who needs help with setup"
   ]

   simulations = []
   for i, persona in enumerate(personas):
       simulator = SimulatedUser(persona=persona)
       simulation = run_simulation(
           app=customer_service_agent,
           user_simulator=simulator,
           max_turns=5,
           project_name="multi_persona_evaluation"
       )
       simulations.append(simulation)
       print(f"Simulation {i+1} completed: {simulation['thread_id']}")

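
After a batch of runs, a quick per-thread summary helps spot conversations that ended early. The sketch below assumes each result has the ``thread_id`` and ``conversation_history`` keys used in the examples; ``summarize`` and ``fake_simulations`` are hypothetical names, with hand-written data standing in for real results.

```python
from collections import Counter
from typing import Any, Dict, List


def summarize(simulations: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Count user and assistant messages for each simulation result."""
    summary = []
    for sim in simulations:
        roles = Counter(m["role"] for m in sim["conversation_history"])
        summary.append(
            {
                "thread_id": sim["thread_id"],
                "user_messages": roles["user"],
                "assistant_messages": roles["assistant"],
            }
        )
    return summary


# Hand-written stand-ins for the dictionaries returned by run_simulation.
fake_simulations = [
    {
        "thread_id": "t-1",
        "conversation_history": [
            {"role": "user", "content": "I want a refund"},
            {"role": "assistant", "content": "Sure, let me check."},
        ],
    },
    {"thread_id": "t-2", "conversation_history": []},
]

for row in summarize(fake_simulations):
    print(row)
```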

Integration with Evaluation
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from opik.evaluation import evaluate_threads
   from opik.evaluation.metrics import ConversationThreadMetric

   # Run simulations
   simulation = run_simulation(
       app=customer_service_agent,
       user_simulator=user_simulator,
       max_turns=5,
       project_name="evaluation_test"
   )

   # Evaluate the simulation thread
   results = evaluate_threads(
       project_name="evaluation_test",
       filter_string=f'thread_id = "{simulation["thread_id"]}"',
       metrics=[ConversationThreadMetric()]
   )


Advanced Usage with Tags
~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from typing import List, Optional

   # Your app can accept custom parameters forwarded by run_simulation
   @track
   def tagged_agent(
       user_message: str,
       *,
       thread_id: str,
       simulation_id: Optional[str] = None,
       tags: Optional[List[str]] = None,
       **kwargs,
   ):
       # Use simulation_id and tags for custom logic
       if simulation_id:
           print(f"Running simulation: {simulation_id}")
       return {"role": "assistant", "content": "Response"}

   # Extra keyword arguments are passed through to the app
   simulation = run_simulation(
       app=tagged_agent,
       user_simulator=user_simulator,
       max_turns=5,
       project_name="tagged_simulation",
       simulation_id="test_001",  # Custom parameter
       tags=["simulation", "customer_service"]  # Custom parameter
   )


Error Handling
~~~~~~~~~~~~~~

.. code-block:: python

   @track
   def error_prone_agent(user_message: str, *, thread_id: str, **kwargs):
       # This might raise an exception
       if "error" in user_message.lower():
           raise ValueError("Simulated error")
       return {"role": "assistant", "content": "Normal response"}

   # run_simulation handles errors gracefully
   simulation = run_simulation(
       app=error_prone_agent,
       user_simulator=user_simulator,
       max_turns=3
   )

   # Errors are captured in the conversation history
   for message in simulation['conversation_history']:
       if "Error processing message" in message.get('content', ''):
           print(f"Error occurred: {message['content']}")

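
If you would rather recover inside the app than rely on the captured error message, one option is to wrap the app so that exceptions turn into a fallback reply. This is a sketch under that assumption; ``with_fallback`` is a hypothetical decorator, not part of ``opik``, and the agents below are plain functions without ``@track``.

```python
import functools
from typing import Callable, Dict


def with_fallback(
    app: Callable[..., Dict[str, str]],
    fallback: str = "Sorry, something went wrong. Could you rephrase that?",
) -> Callable[..., Dict[str, str]]:
    """Wrap an app so exceptions become a well-formed assistant message."""

    @functools.wraps(app)
    def wrapper(user_message: str, *, thread_id: str, **kwargs) -> Dict[str, str]:
        try:
            return app(user_message, thread_id=thread_id, **kwargs)
        except Exception:
            # Keep the required return format even when the agent fails.
            return {"role": "assistant", "content": fallback}

    return wrapper


def flaky_agent(user_message: str, *, thread_id: str, **kwargs) -> Dict[str, str]:
    if "error" in user_message.lower():
        raise ValueError("Simulated error")
    return {"role": "assistant", "content": "Normal response"}


safe_agent = with_fallback(flaky_agent)
print(safe_agent("hello", thread_id="t-1")["content"])
print(safe_agent("trigger an error", thread_id="t-1")["content"])
```

The trade-off is visibility: a fallback reply hides the failure from the conversation history, so you may still want to log the exception inside the ``except`` branch.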

Best Practices
--------------

1. **Thread Management**: Always use the provided ``thread_id`` to manage conversation history.
2. **Error Handling**: Implement proper error handling in your app function.
3. **Return Format**: Always return message dictionaries with ``role`` and ``content`` keys.
4. **History Management**: Keep conversation history in a thread-safe way if you run simulations concurrently.
5. **Resource Management**: Be mindful of token usage in long conversations.
6. **Testing**: Use fixed responses in ``SimulatedUser`` for deterministic testing.

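
For the history management point, a module-level dictionary works for a single run but can be fragile when simulations run concurrently. A minimal sketch of a lock-protected store follows; ``ThreadSafeHistory`` is a hypothetical class written for illustration, using only the standard library.

```python
import threading
from collections import defaultdict
from typing import Dict, List


class ThreadSafeHistory:
    """Conversation history keyed by thread_id, guarded by a single lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._messages: Dict[str, List[Dict[str, str]]] = defaultdict(list)

    def append(self, thread_id: str, role: str, content: str) -> None:
        with self._lock:
            self._messages[thread_id].append({"role": role, "content": content})

    def get(self, thread_id: str) -> List[Dict[str, str]]:
        with self._lock:
            # Return a copy so callers cannot mutate the stored list.
            return list(self._messages.get(thread_id, []))


store = ThreadSafeHistory()


def worker() -> None:
    for i in range(50):
        store.append("shared", "user", f"message {i}")


workers = [threading.Thread(target=worker) for _ in range(4)]
for t in workers:
    t.start()
for t in workers:
    t.join()

print(len(store.get("shared")))  # 4 workers x 50 messages
```

An app function would call ``store.append`` and ``store.get`` with the ``thread_id`` it receives instead of touching a shared dictionary directly.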

Notes
-----

- The function automatically decorates your app with ``@track`` if it is not already decorated.
- All traces from a simulation are grouped under the same thread ID.
- If your app raises an exception, the error is captured as a message in the conversation history and the simulation continues.
- Conversation history is returned as a list of message dictionaries.
- Custom parameters passed via ``**app_kwargs`` are forwarded to your app function.
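
The behaviour described in these notes can be pictured with a toy loop. The code below is a simplified mental model only, not the implementation in ``opik``: ``toy_simulation``, ``scripted_user``, and ``echo_app`` are hypothetical names, tracing is omitted, and the error message format is assumed from the error-handling example.

```python
import uuid
from typing import Any, Callable, Dict, List, Optional

Message = Dict[str, str]


def toy_simulation(
    app: Callable[..., Message],
    next_user_message: Callable[[List[Message]], str],
    max_turns: int = 5,
    thread_id: Optional[str] = None,
    **app_kwargs: Any,
) -> Dict[str, Any]:
    """Alternate user and app turns, sharing one thread_id across the run."""
    thread_id = thread_id or str(uuid.uuid4())
    history: List[Message] = []
    for _ in range(max_turns):
        user_message = next_user_message(history)
        history.append({"role": "user", "content": user_message})
        try:
            # Extra keyword arguments are forwarded to the app unchanged.
            reply = app(user_message, thread_id=thread_id, **app_kwargs)
        except Exception as exc:
            # Assumed format: record the failure and keep going.
            reply = {"role": "assistant", "content": f"Error processing message: {exc}"}
        history.append(reply)
    return {"thread_id": thread_id, "conversation_history": history}


def scripted_user(history: List[Message]) -> str:
    # Stand-in for a user simulator: replays fixed lines in order.
    script = ["Hi", "Please raise an error", "Thanks"]
    return script[(len(history) // 2) % len(script)]


def echo_app(user_message: str, *, thread_id: str, tone: str = "neutral", **kwargs) -> Message:
    if "error" in user_message.lower():
        raise ValueError("Simulated error")
    return {"role": "assistant", "content": f"[{tone}] {user_message}"}


result = toy_simulation(echo_app, scripted_user, max_turns=3, thread_id="demo", tone="friendly")
for message in result["conversation_history"]:
    print(f"{message['role']}: {message['content']}")
```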