apps/opik-documentation/documentation/fern/docs/evaluation/update_existing_experiment.mdx
You can update existing experiments in several ways: you can change an experiment's name and configuration from either the Opik UI or the SDKs, and you can add new scores or update existing scores on an experiment's items.

## Updating an experiment's name and configuration

You can update an experiment's name and configuration directly from the experiment page in the Opik UI. The configuration is stored as JSON and is useful for tracking parameters like model names, temperatures, prompt templates, or any other metadata relevant to your experiment.
In the Python SDK, use the `update_experiment` method to update an experiment's name and configuration:
```python
import opik

client = opik.Opik()

# Update experiment name
client.update_experiment(
    id="experiment-id",
    name="Updated Experiment Name"
)

# Update experiment configuration
client.update_experiment(
    id="experiment-id",
    experiment_config={
        "model": "gpt-4",
        "temperature": 0.7,
        "prompt_template": "Answer the following question: {question}"
    }
)

# Update both name and configuration
client.update_experiment(
    id="experiment-id",
    name="Updated Experiment Name",
    experiment_config={
        "model": "gpt-4",
        "temperature": 0.7
    }
)
```
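The `id` argument is the experiment id, which you can copy from the Opik UI. If you only have the experiment's name, here is a minimal sketch of one way to look the id up first, assuming your SDK version exposes `get_experiment_by_name` (experiment names are not guaranteed to be unique, so prefer the id when you have it):

```python
import opik

client = opik.Opik()

# Look up the experiment by name to retrieve its id
# (assumes get_experiment_by_name is available in your SDK version)
experiment = client.get_experiment_by_name("Updated Experiment Name")

client.update_experiment(
    id=experiment.id,
    experiment_config={"temperature": 0.2},
)
```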
In the TypeScript SDK, use the `updateExperiment` method to update an experiment's name and configuration:
```typescript
import { Opik } from "opik";

const opik = new Opik();

// Update experiment name
await opik.updateExperiment("experiment-id", {
  name: "Updated Experiment Name"
});

// Update experiment configuration
await opik.updateExperiment("experiment-id", {
  experimentConfig: {
    model: "gpt-4",
    temperature: 0.7,
    promptTemplate: "Answer the following question: {question}"
  }
});

// Update both name and configuration
await opik.updateExperiment("experiment-id", {
  name: "Updated Experiment Name",
  experimentConfig: {
    model: "gpt-4",
    temperature: 0.7
  }
});
```
## Updating experiment scores

Sometimes you may want to add new scores to an existing experiment, or update the scores it already has. You can do this with the `evaluate_experiment` function, which re-runs the scoring metrics on the existing experiment items and updates the scores:
```python
from opik.evaluation import evaluate_experiment
from opik.evaluation.metrics import Hallucination

hallucination_metric = Hallucination()

# Replace "my-experiment" with the name of your experiment, which can be found in the Opik UI
evaluate_experiment(experiment_name="my-experiment", scoring_metrics=[hallucination_metric])
```
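You can also pass several metrics in a single call; each one is scored against the existing experiment items. A short sketch using the built-in `Hallucination` and `Moderation` metrics:

```python
from opik.evaluation import evaluate_experiment
from opik.evaluation.metrics import Hallucination, Moderation

# Both metrics are computed for every item in the existing experiment
evaluate_experiment(
    experiment_name="my-experiment",
    scoring_metrics=[Hallucination(), Moderation()],
)
```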
### Example

Suppose you are building a chatbot and want to compute the hallucination scores for a set of example conversations. For this, you would first create an experiment with the `evaluate` function:
```python
import opik
from opik import Opik, track
from opik.evaluation import evaluate
from opik.evaluation.metrics import Hallucination
from opik.integrations.openai import track_openai
import openai

# Configure the Opik client (credentials and workspace)
opik.configure()

# Define the task to evaluate
openai_client = track_openai(openai.OpenAI())

MODEL = "gpt-3.5-turbo"

@track(project_name="my-project")
def your_llm_application(input: str) -> str:
    response = openai_client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": input}],
    )
    return response.choices[0].message.content

# Define the evaluation task
def evaluation_task(x):
    return {
        "output": your_llm_application(x["input"])
    }

# Create a simple dataset
client = Opik()
dataset = client.get_or_create_dataset(name="Existing experiment dataset")
dataset.insert([
    {"input": "What is the capital of France?"},
    {"input": "What is the capital of Germany?"},
])

# Define the metrics
hallucination_metric = Hallucination()

evaluation = evaluate(
    experiment_name="Existing experiment example",
    dataset=dataset,
    task=evaluation_task,
    scoring_metrics=[hallucination_metric],
    project_name="my-project",
    experiment_config={
        "model": MODEL
    }
)

experiment_name = evaluation.experiment_name
print(f"Experiment name: {experiment_name}")
```
<Tip>Learn more about the `evaluate` function in our LLM evaluation guide.</Tip>
Once the first experiment is created, you realise that you also want to compute a moderation score for each example. You could re-run the experiment with the new scoring metric, but that would mean re-computing every LLM output. Instead, you can simply update the existing experiment with the additional metric:
```python
from opik.evaluation import evaluate_experiment
from opik.evaluation.metrics import Moderation

moderation_metric = Moderation()

# Use the experiment name from the first run, printed above as "Existing experiment example"
evaluate_experiment(experiment_name="Existing experiment example", scoring_metrics=[moderation_metric])
```
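Once this call completes, the moderation scores are attached to the existing experiment items alongside the hallucination scores from the first run, without re-generating any of the LLM outputs.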