Optimization Studio helps you improve prompts without writing code. You bring a prompt, define what “good” looks like, and Opik tests variations to find a better version you can ship with confidence. Teams like it because it shortens the loop from idea to evidence: you see scores and examples, not just a hunch. If you prefer a programmatic workflow, use the Optimize prompts guide.
<video src="/img/agent_optimization/optimization_studio_walkthrough.mp4" width="854" height="480" autoPlay muted loop playsInline preload="auto" />
An optimization run is a structured way to improve a prompt. Opik takes your current prompt, tries small variations, and scores each one so you can pick the best-performing version with evidence instead of guesswork.
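Conceptually, a run is a search loop: propose a prompt variation, score it against your dataset, and keep the winner. The sketch below is purely illustrative (a greedy random-free search with toy scoring); Optimization Studio's algorithms are far more sophisticated, and none of these names are part of the Opik API:

```python
def optimize(prompt, dataset, score, propose, n_trials=10):
    """Illustrative sketch of an optimization run: propose prompt
    variations, score each one, keep the best. Not the Opik
    implementation."""
    best_prompt, best_score = prompt, score(prompt, dataset)
    for _ in range(n_trials):
        candidate = propose(best_prompt)
        candidate_score = score(candidate, dataset)
        if candidate_score > best_score:
            best_prompt, best_score = candidate, candidate_score
    return best_prompt, best_score

# Toy demo: pretend longer prompts score higher, purely to show the loop.
toy_score = lambda p, d: len(p)
toy_propose = lambda p: p + "!"
best, s = optimize("Classify intent.", None, toy_score, toy_propose, n_trials=3)
# -> ('Classify intent.!!!', 19)
```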
Give the run a descriptive name so you can find it later. A good pattern is goal + dataset + date, for example “Support intent v1 - Jan 2026”.
Choose the model that will generate responses, then set the message roles (System, User, and so on). If your dataset has fields like `question` or `answer`, insert them with `{{variable}}` placeholders so each example's values flow into the prompt correctly. Start with the prompt you already use in production so improvements are easy to compare.
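As a mental model, placeholder substitution works like a simple template render: each dataset field replaces the matching `{{...}}` marker before the message is sent to the model. The helper below is an illustrative sketch, not part of the Opik API:

```python
import re

def render_prompt(template: str, example: dict) -> str:
    """Replace {{field}} placeholders with values from a dataset example.
    Illustrative sketch of how each row is merged into the prompt."""
    def substitute(match):
        key = match.group(1).strip()
        return str(example[key])
    return re.sub(r"\{\{(.*?)\}\}", substitute, template)

template = "Answer the question: {{question}}"
example = {"question": "How do I reset my password?"}
render_prompt(template, example)
# -> 'Answer the question: How do I reset my password?'
```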
Choose how Opik should search for better prompts. GEPA works well for single-turn prompts and quick improvements, while HRPO is better when you need deeper analysis of why a prompt fails. If you are new, start with GEPA to get a quick baseline, then switch to HRPO if you need deeper insight. For technical details, see Optimization algorithms.
Pick an existing dataset to supply examples. Aim for diverse, real-world cases rather than edge cases only, and keep the first run small so you can iterate quickly. If you need to create or upload data first, see Manage datasets.
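For a sense of what "diverse, real-world cases" means in practice, here is a hypothetical handful of rows for an intent-classification task. The field names (`question`, `answer`) are illustrative; use whatever fields your own dataset defines:

```python
# Hypothetical dataset rows: each example pairs a realistic user
# question with the expected label. Vary phrasing, topic, and length
# rather than collecting only edge cases.
examples = [
    {"question": "How do I reset my password?", "answer": "account_access"},
    {"question": "My invoice shows the wrong amount.", "answer": "billing"},
    {"question": "The app crashes when I upload a file.", "answer": "bug_report"},
]
```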
Pick how Opik should score each prompt. Use Equals if the output should match exactly, or G-Eval if you want a model to grade quality. When using G-Eval, make sure the grading prompt reflects what “good” means for your task.
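As a mental model, Equals behaves like a strict exact-match comparison that returns 1 or 0 per example (a sketch of the behavior, not Opik's implementation):

```python
def equals_score(expected: str, output: str) -> float:
    """Exact-match scoring sketch: 1.0 when the model output matches
    the reference exactly (ignoring surrounding whitespace), else 0.0."""
    return 1.0 if output.strip() == expected.strip() else 0.0

equals_score("billing", "billing")     # -> 1.0
equals_score("billing", "bug_report")  # -> 0.0
```

If outputs legitimately vary in wording, exact match will under-score good answers; that is when a judge-based metric like G-Eval is the better choice.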
Once the run starts, Optimization Studio shows the best score so far and a progress chart for each trial.
The Trials tab is where you compare prompt variations and their scores. Click a specific trial to view the individual trial items that were evaluated.
You can rerun the same setup, cancel a run to change inputs, or select multiple runs to compare outcomes.
If you want to automate optimizations in code later, follow Optimize prompts and reuse the same dataset and metric from this run. For a deeper breakdown of trials and traces, visit Dashboard results, and to fine-tune your strategy, explore Optimization algorithms.