When working with Opik, it is important to be able to export traces, spans, and threads so that you can use them to fine-tune your models or run deeper analysis.
You can export the data you have logged to the Opik platform using:

- The SDKs: the Python SDK methods (Opik.search_traces, Opik.search_spans, and Opik.search_threads) or the TypeScript SDK method (client.searchTraces()) to export traces, spans, and threads.
- The REST API: the /traces and /spans endpoints to export traces and spans.
- The UI: the Export CSV button in the Actions dropdown.

The Python SDK Opik.search_traces method and the TypeScript SDK client.searchTraces() method let you either export all the traces in a project or search for specific traces and export them.
To export all traces, you will need to specify a max_results / maxResults value that is higher than the total number of traces in your project:
```python
import opik

client = opik.Opik()
traces = client.search_traces(project_name="Default project", max_results=1000000)
```

```typescript
import { Opik } from "opik";

const client = new Opik();
const traces = await client.searchTraces({
  projectName: "Default project",
  maxResults: 1000000,
});
```
You can use the filter_string (Python) / filterString (TypeScript) parameter to search for specific traces:
```python
import opik

client = opik.Opik()
traces = client.search_traces(
    project_name="Default project",
    filter_string='input contains "Opik"',
)

# Convert to dictionaries if required
traces = [trace.dict() for trace in traces]
```

```typescript
import { Opik } from "opik";

const client = new Opik();
const traces = await client.searchTraces({
  projectName: "Default project",
  filterString: 'input contains "Opik"',
});
```
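
Exported traces are often written to disk for fine-tuning or offline analysis. The following is a minimal sketch and not part of the Opik SDK: `to_record` and `write_jsonl` are hypothetical helper names, and the code assumes each exported object is either a dictionary or exposes a `.dict()` method, as in the Python example above.

```python
import json
from typing import Any, Dict, Iterable


def to_record(item: Any) -> Dict[str, Any]:
    """Return a plain dictionary for an exported object."""
    if isinstance(item, dict):
        return item
    # Assumption: SDK objects expose .dict(), as in the example above.
    return item.dict()


def write_jsonl(items: Iterable[Any], path: str) -> int:
    """Write one JSON object per line and return the number of records written."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for item in items:
            # default=str turns datetimes and other non-JSON values into strings.
            handle.write(json.dumps(to_record(item), default=str) + "\n")
            count += 1
    return count
```

With the `traces` list from the example above, `write_jsonl(traces, "traces.jsonl")` would produce a file with one trace per line.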
All search methods (search_traces, search_spans, and search_threads) accept a filter_string (Python) / filterString (TypeScript) parameter that uses Opik Query Language (OQL):
"<COLUMN> <OPERATOR> <VALUE> [AND <COLUMN> <OPERATOR> <VALUE>]*"
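Rules:

- Conditions can be combined with AND (OR is not supported).
- DateTime values are written in ISO 8601 format (for example, "2024-01-01T00:00:00Z").
- Nested keys use dot notation, for example metadata.model or feedback_scores.accuracy.

Each entity type supports a different set of filter columns. The tables below list the available columns for each.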

Trace columns:

| Column | Type | Operators |
|---|---|---|
| id | String | =, !=, contains, not_contains, starts_with, ends_with, >, < |
| name | String | =, !=, contains, not_contains, starts_with, ends_with, >, < |
| input, output | String | =, !=, contains, not_contains, starts_with, ends_with, >, < |
| thread_id | String | =, !=, contains, not_contains, starts_with, ends_with, >, < |
| guardrails | String | =, !=, contains, not_contains, starts_with, ends_with, >, < |
| experiment_id | String | =, !=, contains, not_contains, starts_with, ends_with, >, < |
| start_time, end_time | DateTime | =, !=, >, >=, <, <= |
| created_at, last_updated_at | DateTime | =, !=, >, >=, <, <= |
| metadata | Dictionary | =, !=, contains, not_contains, starts_with, ends_with, >, >=, <, <= |
| input_json, output_json | Dictionary | =, !=, contains, not_contains, starts_with, ends_with, >, >=, <, <= |
| feedback_scores | Numeric | =, !=, >, >=, <, <=, is_empty, is_not_empty |
| span_feedback_scores | Numeric | =, !=, >, >=, <, <=, is_empty, is_not_empty |
| tags | List | =, !=, contains, not_contains, is_empty, is_not_empty |
| annotation_queue_ids | List | =, !=, contains, not_contains, is_empty, is_not_empty |
| usage.total_tokens, usage.prompt_tokens, usage.completion_tokens | Numeric | =, !=, >, >=, <, <= |
| duration, total_estimated_cost, llm_span_count | Numeric | =, !=, >, >=, <, <= |
| error_info | Container | is_empty, is_not_empty |

Span columns:

| Column | Type | Operators |
|---|---|---|
| id | String | =, !=, contains, not_contains, starts_with, ends_with, >, < |
| name | String | =, !=, contains, not_contains, starts_with, ends_with, >, < |
| input, output | String | =, !=, contains, not_contains, starts_with, ends_with, >, < |
| model | String | =, !=, contains, not_contains, starts_with, ends_with, >, < |
| provider | String | =, !=, contains, not_contains, starts_with, ends_with, >, < |
| trace_id | String | =, !=, contains, not_contains, starts_with, ends_with, >, < |
| type | Enum | =, != |
| start_time, end_time | DateTime | =, !=, >, >=, <, <= |
| metadata | Dictionary | =, !=, contains, not_contains, starts_with, ends_with, >, >=, <, <= |
| input_json, output_json | Dictionary | =, !=, contains, not_contains, starts_with, ends_with, >, >=, <, <= |
| feedback_scores | Numeric | =, !=, >, >=, <, <=, is_empty, is_not_empty |
| tags | List | =, !=, contains, not_contains, is_empty, is_not_empty |
| usage.total_tokens, usage.prompt_tokens, usage.completion_tokens | Numeric | =, !=, >, >=, <, <= |
| duration, total_estimated_cost | Numeric | =, !=, >, >=, <, <= |
| error_info | Container | is_empty, is_not_empty |

Thread columns:

| Column | Type | Operators |
|---|---|---|
| id | String | =, !=, contains, not_contains, starts_with, ends_with, >, < |
| first_message, last_message | String | =, !=, contains, not_contains, starts_with, ends_with, >, < |
| status | Enum | =, != |
| start_time, end_time | DateTime | =, !=, >, >=, <, <= |
| created_at, last_updated_at | DateTime | =, !=, >, >=, <, <= |
| feedback_scores | Numeric | =, !=, >, >=, <, <=, is_empty, is_not_empty |
| tags | List | =, !=, contains, not_contains, is_empty, is_not_empty |
| annotation_queue_ids | List | =, !=, contains, not_contains, is_empty, is_not_empty |
| duration, number_of_messages | Numeric | =, !=, >, >=, <, <= |

```python
import opik

client = opik.Opik(project_name="Default project")

# Trace filters
traces = client.search_traces(filter_string='input contains "Opik"')
traces = client.search_traces(filter_string='start_time >= "2024-01-01T00:00:00Z"')
traces = client.search_traces(filter_string='usage.total_tokens > 1000')
traces = client.search_traces(filter_string='metadata.model = "gpt-4o"')
traces = client.search_traces(filter_string='feedback_scores.user_rating is_not_empty')
traces = client.search_traces(filter_string='tags contains "production"')

# Thread filters
threads = client.search_threads(filter_string='number_of_messages >= 5')
threads = client.search_threads(filter_string='first_message contains "hello"')
threads = client.search_threads(filter_string='status = "active"')
```

```typescript
import { Opik } from "opik";

const client = new Opik({ projectName: "Default project" });

// Trace filters
const t1 = await client.searchTraces({ filterString: 'input contains "Opik"' });
const t2 = await client.searchTraces({ filterString: 'start_time >= "2024-01-01T00:00:00Z"' });
const t3 = await client.searchTraces({ filterString: 'usage.total_tokens > 1000' });
const t4 = await client.searchTraces({ filterString: 'metadata.model = "gpt-4o"' });
const t5 = await client.searchTraces({ filterString: 'feedback_scores.user_rating is_not_empty' });
const t6 = await client.searchTraces({ filterString: 'tags contains "production"' });
```
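
The examples above each use a single condition. Filters with several conditions joined by AND can be assembled programmatically. The sketch below is illustrative only: `format_value`, `condition`, and `build_filter` are hypothetical helpers, not SDK functions, and the sketch assumes string values contain no double quotes.

```python
from datetime import datetime, timezone
from typing import Optional, Union

Value = Union[str, int, float, datetime]


def format_value(value: Value) -> str:
    """Render a Python value as an OQL literal."""
    if isinstance(value, datetime):
        if value.tzinfo is None:
            raise ValueError("pass a timezone-aware datetime")
        # ISO 8601 in UTC, matching the "2024-01-01T00:00:00Z" form used above.
        utc = value.astimezone(timezone.utc)
        return '"' + utc.strftime("%Y-%m-%dT%H:%M:%SZ") + '"'
    if isinstance(value, str):
        # Assumption: the string contains no double quotes.
        return '"' + value + '"'
    return str(value)


def condition(column: str, operator: str, value: Optional[Value] = None) -> str:
    """Build one '<COLUMN> <OPERATOR> <VALUE>' condition."""
    if value is None:
        # Operators such as is_empty and is_not_empty take no value.
        return f"{column} {operator}"
    return f"{column} {operator} {format_value(value)}"


def build_filter(*conditions: str) -> str:
    """Join conditions with AND, the only supported connective."""
    return " AND ".join(conditions)


filter_string = build_filter(
    condition("input", "contains", "Opik"),
    condition("usage.total_tokens", ">", 1000),
    condition("start_time", ">=", datetime(2024, 1, 1, tzinfo=timezone.utc)),
)
```

The resulting `filter_string` can then be passed as the filter_string argument of any of the search methods.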

If a feedback_scores key contains spaces, wrap the key in double quotes:
'feedback_scores."My Score" > 0'
If the feedback_scores key contains both spaces and double quotes, you will need to escape the double quotes as "":
'feedback_scores."Score ""with"" Quotes" > 0'
Alternatively, wrap the key in single quotes and surround the whole filter string in triple quotes, like this:
'''feedback_scores.'Accuracy "Happy Index"' < 0.8'''
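
The double-quote escaping rule above can be applied mechanically. The following is a small sketch with hypothetical helpers (`quote_key` and `feedback_score_filter` are not SDK functions). It quotes any key containing a space or a double quote. Quoting a key that contains double quotes but no spaces is an assumption, since the text above only describes keys with both.

```python
def quote_key(key: str) -> str:
    """Quote a feedback score name for use after 'feedback_scores.'."""
    if " " not in key and '"' not in key:
        return key
    # Double each embedded double quote, then wrap the key in double quotes.
    return '"' + key.replace('"', '""') + '"'


def feedback_score_filter(key: str, operator: str, value: float) -> str:
    """Build a filter condition on a single feedback score."""
    return f"feedback_scores.{quote_key(key)} {operator} {value}"


simple = feedback_score_filter("My Score", ">", 0)
escaped = feedback_score_filter('Score "with" Quotes', ">", 0)
```

Here `simple` and `escaped` reproduce the two double-quoted filter strings shown above.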
You can export spans using the Opik.search_spans method. This method allows you to search for spans based on trace_id or based on a filter string.
To export all the spans associated with a specific trace, you can use the trace_id parameter:
```python
import opik

client = opik.Opik()
spans = client.search_spans(
    project_name="Default project",
    trace_id="067092dc-e639-73ff-8000-e1c40172450f",
)
```
You can use the filter_string parameter to search for specific spans:
```python
import opik

client = opik.Opik()
spans = client.search_spans(
    project_name="Default project",
    filter_string='input contains "Opik"',
)
```
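
A filter string can match spans from many different traces, so it is often useful to regroup the results by trace. The sketch below is illustrative only. It assumes each span has already been converted to a dictionary with a trace_id key (trace_id is listed among the span filter columns, but the dictionary layout is an assumption), and `group_spans_by_trace` is a hypothetical helper, not an SDK function.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def group_spans_by_trace(
    spans: Iterable[Dict[str, Any]],
) -> Dict[str, List[Dict[str, Any]]]:
    """Group span dictionaries by their trace_id, preserving input order."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for span in spans:
        # Assumption: every span dictionary carries a "trace_id" key.
        grouped[span["trace_id"]].append(span)
    return dict(grouped)
```

If span objects expose a `.dict()` method like the traces shown earlier (an assumption), the call could look like `group_spans_by_trace(span.dict() for span in spans)`.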
You can export threads using the Opik.search_threads method. This method allows you to search for conversational threads in a project.
To export all threads, you will need to specify a max_results value that is higher than the total number of threads in your project:
```python
import opik

client = opik.Opik()
threads = client.search_threads(project_name="Default project", max_results=1000000)
```
You can use the filter_string parameter to search for specific threads:
```python
import opik

client = opik.Opik()

# Search for a specific thread by ID
threads = client.search_threads(
    project_name="Default project",
    filter_string='id = "thread_123"',
)

# Search for threads with many messages
threads = client.search_threads(
    project_name="Default project",
    filter_string='number_of_messages >= 5',
)

# Search for threads with a specific feedback score
threads = client.search_threads(
    project_name="Default project",
    filter_string='feedback_scores.user_satisfaction > 0.8',
)

# Search for threads by tag
threads = client.search_threads(
    project_name="Default project",
    filter_string='tags contains "important"',
)
```
To export traces using the Opik REST API, you can use the /traces endpoint and the /spans endpoint. These endpoints are paginated so you will need to make multiple requests to retrieve all the traces or spans you want.
To search for specific traces or spans, you can use the filter parameter. Although this is a string parameter, it does not follow the same format as the filter_string parameter in the Opik SDK. Instead, it is a list of JSON objects with the following format:
```json
[
  {
    "field": "name",
    "type": "string",
    "operator": "=",
    "value": "Opik"
  }
]
```
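
Because the filter is sent as a string and the endpoints are paginated, a client has to serialize the filter list and loop over pages. The following is a hedged sketch of both steps, not the documented API: the query parameter name `filter` follows the wording above, `page` and `size` are assumed names, and `fetch_page` is a hypothetical callable that performs the HTTP request and returns one page of items. Check the REST API reference for the actual parameter names and response shape.

```python
import json
from typing import Callable, Dict, Iterator, List
from urllib.parse import urlencode


def encode_filter(conditions: List[Dict[str, str]]) -> str:
    """Serialize the list of filter objects into a JSON string."""
    return json.dumps(conditions)


def build_query_string(conditions: List[Dict[str, str]], page: int, size: int) -> str:
    """URL-encode the filter and paging values (parameter names are assumptions)."""
    return urlencode({"filter": encode_filter(conditions), "page": page, "size": size})


def fetch_all(
    fetch_page: Callable[[int, int], List[dict]], size: int = 100
) -> Iterator[dict]:
    """Yield items page by page until a page comes back short."""
    page = 1
    while True:
        items = fetch_page(page, size)
        yield from items
        if len(items) < size:
            # A short or empty page means there is nothing left to fetch.
            break
        page += 1


name_filter = [{"field": "name", "type": "string", "operator": "=", "value": "Opik"}]
query = build_query_string(name_filter, page=1, size=100)
```

Keeping the HTTP call behind `fetch_page` lets the paging loop be tested without a running server.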
To export traces or spans as a CSV file from the UI, select the traces or spans you wish to export and click Export CSV in the Actions dropdown.