Tracing is the foundation of observability in Opik. It lets you monitor, debug, and optimize your LLM applications by capturing detailed information about their execution.

When working with LLM applications, understanding what happens under the hood is crucial for debugging issues, optimizing performance, and ensuring reliability. Opik's tracing system provides this visibility by capturing detailed execution information at multiple levels.

To use Opik's tracing capabilities effectively, it's important to understand these key concepts:
A trace represents a complete execution path for a single interaction with an LLM or agent. Think of it as a detailed record of everything that happened during one request-response cycle. Each trace captures the full context of the interaction, including inputs, outputs, timing, and any intermediate steps.
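For illustration, here is a minimal sketch of logging a trace with the Python SDK's low-level client; the trace name and the input/output payloads are placeholders:

```python
from opik import Opik

client = Opik()

# Log one complete request-response cycle as a trace
client.trace(
    name="answer-question",
    input={"question": "What is tracing?"},
    output={"answer": "A record of one request-response cycle."},
)

client.flush()  # make sure the trace is sent before the script exits
```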
A span represents an individual operation or step within a trace. While a trace shows the complete picture, spans break down the execution into granular, measurable components. This hierarchical structure allows you to understand both the high-level flow and the detailed operations within your LLM application.
Trace: "Customer Support Chat"
├── Span: "Parse User Intent"
├── Span: "Query Knowledge Base"
│ ├── Span: "Search Vector Database"
│ └── Span: "Rank Results"
├── Span: "Generate Response"
│ ├── Span: "LLM Call: GPT-4"
│ └── Span: "Post-process Response"
└── Span: "Log Interaction"
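With the Python SDK, a hierarchy like the one above can be captured by nesting `@track`-decorated functions: the top-level call becomes the trace, and each nested call is recorded as a span under it. The function names and bodies below are illustrative:

```python
from opik import track

@track  # nested call: recorded as a span inside the caller's trace
def search_vector_database(query: str) -> list[str]:
    return ["doc-1", "doc-2"]  # placeholder retrieval results

@track
def query_knowledge_base(query: str) -> list[str]:
    return search_vector_database(query)

@track  # top-level call: creates the trace
def customer_support_chat(message: str) -> str:
    docs = query_knowledge_base(message)
    return f"Answer based on {len(docs)} documents"

customer_support_chat("How do I reset my password?")
```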
A thread is a collection of related traces that form a coherent conversation or workflow. Threads are essential for understanding multi-turn interactions and maintaining context across multiple LLM calls. They provide a way to group related traces together, making it easier to analyze conversational patterns and user journeys.
Threads are created by defining a `thread_id` and referencing it in your traces. This allows you to group the traces of a multi-turn conversation together and analyze the interaction as a whole.
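For example, here is a sketch that groups two traces into one thread by reusing the same `thread_id`; the names and payloads are placeholders:

```python
from opik import Opik

client = Opik()

# Traces that share a thread_id are grouped into one conversation thread
thread_id = "conversation-1234"

client.trace(
    name="turn-1",
    input={"message": "What plans do you offer?"},
    output={"response": "We offer Free and Pro plans."},
    thread_id=thread_id,
)

client.trace(
    name="turn-2",
    input={"message": "How much is Pro?"},
    output={"response": "Pro is $20 per month."},
    thread_id=thread_id,
)

client.flush()
```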
Metrics provide quantitative assessments of your AI models' outputs, enabling objective comparisons and performance tracking over time. They are essential for understanding how well your LLM applications are performing and identifying areas for improvement.
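As a small example, the SDK ships heuristic metrics such as `Equals`, which score an output deterministically. A minimal sketch, assuming the `score(output=..., reference=...)` signature of the heuristic metrics; the strings are placeholders:

```python
from opik.evaluation.metrics import Equals

# Deterministic heuristic metric: does the output match the reference exactly?
metric = Equals()
result = metric.score(output="Paris", reference="Paris")
print(result.value)  # 1.0 on a match, 0.0 otherwise
```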
Optimization is the systematic process of refining and evaluating LLM prompts and configurations to improve performance. It involves iteratively testing different approaches and using data-driven insights to make improvements.
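The underlying loop can be sketched in plain Python. Everything below, including the `score_prompt` helper, is a hypothetical illustration of the idea rather than an Opik API:

```python
# Try several prompt variants, score each against the same data, keep the best
candidate_prompts = [
    "Answer the question concisely.",
    "Think step by step, then give a short final answer.",
]

def score_prompt(prompt: str) -> float:
    # Hypothetical helper: run an evaluation of this prompt over a dataset
    # and return an aggregate metric score. Stubbed out for illustration.
    return 0.0

best_prompt = max(candidate_prompts, key=score_prompt)
print(best_prompt)
```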
Evaluation provides a framework for systematically testing your prompts and models against datasets using various metrics to measure performance. It's the foundation for making data-driven decisions about your LLM applications.
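A minimal sketch using the SDK's `evaluate` entry point; the dataset name, the items, and the `task` function are placeholders, and the task returns the fields the metric scores against:

```python
from opik import Opik
from opik.evaluation import evaluate
from opik.evaluation.metrics import Equals

client = Opik()

# A tiny dataset; in practice this would hold your test cases
dataset = client.get_or_create_dataset(name="capitals")
dataset.insert([
    {"question": "Capital of France?", "expected_output": "Paris"},
])

def task(item: dict) -> dict:
    # Stand-in for your LLM application: map a dataset item to an output
    return {"output": "Paris", "reference": item["expected_output"]}

evaluate(
    dataset=dataset,
    task=task,
    scoring_metrics=[Equals()],
)
```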
Now that you understand the core concepts, explore these resources to dive deeper: