Observability for LangChain (Python) with Opik

<Note> In Opik 2.0, datasets and experiments are project-scoped. Make sure to specify a `project_name` when creating datasets and running experiments so they are associated with the correct project. </Note>

Opik integrates with LangChain through the OpikTracer callback, which logs and traces your LangChain-based applications. The callback automatically captures detailed information about each run, including inputs, outputs, metadata, and cost tracking for each step in your chain.

Key Features

Automatic cost tracking for supported LLM providers (OpenAI, Anthropic, Google AI, AWS Bedrock, and more)
Full compatibility with the @opik.track decorator for hybrid tracing approaches
Thread support for conversational applications with thread_id parameter
Distributed tracing support for multi-service applications
LangGraph compatibility for complex graph-based workflows
Evaluation and testing support for automated LLM application testing

Account Setup

Comet provides a hosted version of the Opik platform. Simply create an account and grab your API key.

You can also run the Opik platform locally; see the installation guide for more information.

Getting Started

Installation

To use the OpikTracer with LangChain, you'll need the opik and langchain packages installed. The examples on this page also use langchain_openai. You can install all three using pip:

bash

pip install opik langchain langchain_openai
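After installing, a quick import check can confirm that the packages are visible to the interpreter you plan to run. The following is a minimal sketch using only the standard library; `missing_packages` is a hypothetical helper name, not part of Opik or LangChain.

```python
from importlib.util import find_spec

# Top-level module names installed by the pip command above
REQUIRED = ["opik", "langchain", "langchain_openai"]


def missing_packages(names):
    """Return the top-level module names that the interpreter cannot find."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable")
```

If anything is reported as missing, check that pip and your script are using the same virtual environment.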

Configuring Opik

Configure the Opik Python SDK for your deployment type. See the Python SDK Configuration guide for detailed instructions on:

CLI configuration: opik configure
Code configuration: opik.configure()
Self-hosted vs Cloud vs Enterprise setup
Configuration files and environment variables
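As an illustration of the environment-variable route, the sketch below builds the variables for a given deployment type and applies them before the SDK is used. `OPIK_PROJECT_NAME` appears later on this page; `OPIK_API_KEY`, `OPIK_WORKSPACE`, `OPIK_URL_OVERRIDE`, and the local URL are assumptions here and should be checked against the Python SDK Configuration guide. `opik_env` is a hypothetical helper, not an SDK function.

```python
import os


def opik_env(deployment: str, project_name: str,
             api_key: str = "", workspace: str = "") -> dict:
    """Build Opik environment variables for a deployment type (sketch)."""
    env = {"OPIK_PROJECT_NAME": project_name}
    if deployment == "local":
        # Assumed default address of a locally running Opik instance
        env["OPIK_URL_OVERRIDE"] = "http://localhost:5173/api"
    elif deployment == "cloud":
        # Assumed variable names for hosted deployments
        env["OPIK_API_KEY"] = api_key
        env["OPIK_WORKSPACE"] = workspace
    else:
        raise ValueError(f"unknown deployment type: {deployment!r}")
    return env


# Apply the settings before creating any tracer or client
os.environ.update(opik_env("local", "langchain-examples"))
```

Setting variables in code is mainly useful in notebooks and tests; for long-lived services, the configuration file or the process environment is usually a better fit.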

Using OpikTracer

Here's a basic example of how to use the OpikTracer callback with a LangChain chain:

python

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from opik.integrations.langchain import OpikTracer

# Initialize the tracer
opik_tracer = OpikTracer(project_name="langchain-examples")

llm = ChatOpenAI(model="gpt-4o", temperature=0)
prompt = ChatPromptTemplate.from_messages([
    ("human", "Translate the following text to French: {text}")
])
chain = prompt | llm

result = chain.invoke(
    {"text": "Hello, how are you?"},
    config={"callbacks": [opik_tracer]}
)
print(result.content)

The OpikTracer will automatically log the run and its details to Opik, including the input prompt, the output, and metadata for each step in the chain.

For detailed parameter information, see the OpikTracer SDK reference.

Practical Example: Text-to-SQL with Evaluation

Let's walk through a real-world example of using LangChain with Opik for a text-to-SQL query generation task. This example demonstrates how to create synthetic datasets, build LangChain chains, and evaluate your application.

Setting up the Environment

First, let's set up our environment with the necessary dependencies:

python

import os
import getpass
import opik
from opik.integrations.openai import track_openai
from openai import OpenAI

# Configure Opik
opik.configure(use_local=False)
os.environ["OPIK_PROJECT_NAME"] = "langchain-integration-demo"

# Set up API keys
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

Creating a Synthetic Dataset

We'll create a synthetic dataset of questions for our text-to-SQL task:

python

import json
from langchain_community.utilities import SQLDatabase

# Download and set up the Chinook database
import requests

url = "https://github.com/lerocha/chinook-database/raw/master/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite"
filename = "./data/chinook/Chinook_Sqlite.sqlite"

folder = os.path.dirname(filename)
if not os.path.exists(folder):
    os.makedirs(folder)

if not os.path.exists(filename):
    response = requests.get(url)
    with open(filename, "wb") as file:
        file.write(response.content)
    print("Chinook database downloaded")

db = SQLDatabase.from_uri(f"sqlite:///{filename}")

# Create synthetic questions using OpenAI
client = OpenAI()
openai_client = track_openai(client)

prompt = """
Create 20 different example questions a user might ask based on the Chinook Database.
These questions should be complex and require the model to think. They should include complex joins and window functions to answer.
Return the response as a json object with a "result" key and an array of strings with the question.
"""

completion = openai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}]
)

synthetic_questions = json.loads(completion.choices[0].message.content)["result"]

# Create dataset in Opik
opik_client = opik.Opik()
dataset = opik_client.get_or_create_dataset(name="synthetic_questions", project_name="my-project")
dataset.insert([{"question": question} for question in synthetic_questions])
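The `json.loads` call above assumes the model replies with a clean JSON object, which is not guaranteed. A defensive parser along the following lines may help; this is a sketch using only the standard library, and `parse_questions` is a hypothetical helper name.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker


def parse_questions(raw: str) -> list[str]:
    """Extract the "result" array of question strings from a model reply."""
    text = raw.strip()
    # Remove a Markdown code fence if the model wrapped its JSON in one
    if text.startswith(FENCE):
        text = text.strip("`")
        if text.lower().startswith("json"):
            text = text[4:]
    data = json.loads(text)
    result = data.get("result") if isinstance(data, dict) else None
    if not isinstance(result, list):
        raise ValueError('expected a JSON object with a "result" array')
    # Keep only non-empty strings so the dataset has no blank questions
    return [q.strip() for q in result if isinstance(q, str) and q.strip()]
```

With this helper, the parsing line could become `synthetic_questions = parse_questions(completion.choices[0].message.content)`, and a malformed reply fails with a clear error before anything is inserted into the dataset.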

Building the LangChain Chain

Now let's create a LangChain chain for SQL query generation:

python

from langchain.chains import create_sql_query_chain
from langchain_openai import ChatOpenAI
from opik.integrations.langchain import OpikTracer

# Create the LangChain chain with OpikTracer
opik_tracer = OpikTracer(tags=["sql_generation"])

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
chain = create_sql_query_chain(llm, db).with_config({"callbacks": [opik_tracer]})

# Test the chain
response = chain.invoke({"question": "How many employees are there?"})
print(response)

Evaluating the Application

Let's create a custom evaluation metric and test our application:

python

import opik
from opik import track
from opik.evaluation import evaluate
from opik.evaluation.metrics import base_metric, score_result
from typing import Any

opik.configure(project_name="my-project")

class ValidSQLQuery(base_metric.BaseMetric):
    def __init__(self, name: str, db: Any):
        self.name = name
        self.db = db

    def score(self, output: str, **ignored_kwargs: Any):
        try:
            # Use the database handle stored on the metric, not the module-level global
            self.db.run(output)
            return score_result.ScoreResult(
                name=self.name, value=1, reason="Query ran successfully"
            )
        except Exception as e:
            return score_result.ScoreResult(name=self.name, value=0, reason=str(e))

# Set up evaluation
valid_sql_query = ValidSQLQuery(name="valid_sql_query", db=db)
dataset = opik_client.get_dataset("synthetic_questions")

@track()
def llm_chain(input: str) -> str:
    response = chain.invoke({"question": input})
    return response

def evaluation_task(item):
    response = llm_chain(item["question"])
    return {"output": response}

# Run evaluation
res = evaluate(
    experiment_name="SQL question answering",
    dataset=dataset,
    task=evaluation_task,
    scoring_metrics=[valid_sql_query],
    nb_samples=20,
    project_name="my-project",
)

The evaluation results are now uploaded to the Opik platform and can be viewed in the UI.
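To see what the ValidSQLQuery metric measures without calling an LLM or Opik, the same try/except scoring logic can be sketched against an in-memory SQLite database. The table and the `score_sql` helper below are illustrative stand-ins, not part of the Chinook schema setup or the Opik API.

```python
import sqlite3


def score_sql(conn: sqlite3.Connection, query: str) -> dict:
    """Score 1 if the query executes, 0 with the error message otherwise."""
    try:
        conn.execute(query).fetchall()
        return {"value": 1, "reason": "Query ran successfully"}
    except sqlite3.Error as e:
        return {"value": 0, "reason": str(e)}


# Tiny stand-in database with a single table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (EmployeeId INTEGER PRIMARY KEY, LastName TEXT)")
conn.execute("INSERT INTO Employee (LastName) VALUES ('Adams'), ('Edwards')")

good = score_sql(conn, "SELECT COUNT(*) FROM Employee")
bad = score_sql(conn, "SELECT COUNT(*) FROM Employees")  # misspelled table name
print(good)
print(bad)
```

Note that this kind of metric only checks that a query executes, not that it answers the question. It also runs the generated SQL for real, so pointing it at a disposable or read-only copy of the database is the safer choice.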

Cost Tracking

The OpikTracer automatically tracks token usage and cost for all supported LLM models used within LangChain applications.

Cost information is automatically captured and displayed in the Opik UI, including:

Token usage details
Cost per request based on model pricing
Total trace cost
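The relationship between these three figures can be sketched as simple arithmetic: each request's cost is its token counts multiplied by the model's per-token prices, and the trace cost is the sum over its LLM calls. The prices below are made-up placeholders, and this illustrates the idea only; it is not Opik's internal implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-million-token prices in USD (placeholder values, not real pricing)."""
    input_per_million: float
    output_per_million: float


def request_cost(prompt_tokens: int, completion_tokens: int, pricing: Pricing) -> float:
    """Cost of one LLM call given its token usage and the model's pricing."""
    total = (prompt_tokens * pricing.input_per_million
             + completion_tokens * pricing.output_per_million)
    return total / 1_000_000


EXAMPLE_PRICING = Pricing(input_per_million=2.0, output_per_million=8.0)  # hypothetical

# (prompt_tokens, completion_tokens) for each LLM call in one trace
calls = [(1200, 300), (800, 150)]
call_costs = [request_cost(p, c, EXAMPLE_PRICING) for p, c in calls]
trace_cost = sum(call_costs)
print(f"per-call costs: {call_costs}, trace total: {trace_cost:.4f} USD")
```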

<Tip> View the complete list of supported models and providers on the [Supported Models](/tracing/supported_models) page. </Tip>

For streaming with cost tracking, ensure stream_usage=True is set:

python

from langchain_openai import ChatOpenAI
from opik.integrations.langchain import OpikTracer

llm = ChatOpenAI(
    model="gpt-4o",
    streaming=True,
    stream_usage=True,  # Required for cost tracking with streaming
)

opik_tracer = OpikTracer()

for chunk in llm.stream("Hello", config={"callbacks": [opik_tracer]}):
    print(chunk.content, end="")


Setting tags and metadata

You can customize the OpikTracer callback to include additional metadata, logging options, and conversation threading:

python

from opik.integrations.langchain import OpikTracer

opik_tracer = OpikTracer(
    tags=["langchain", "production"],
    metadata={"use-case": "customer-support", "version": "1.0"},
    thread_id="conversation-123",  # For conversational applications
    project_name="my-langchain-project"
)

Accessing logged traces

You can use the created_traces method to access the traces collected by the OpikTracer callback:

python

from opik.integrations.langchain import OpikTracer

opik_tracer = OpikTracer()

# ... call your LangChain object here with config={"callbacks": [opik_tracer]}
traces = opik_tracer.created_traces()
print([trace.id for trace in traces])

The traces returned by the created_traces method are instances of the Trace class, which you can use to update the metadata, feedback scores and tags for the traces.

Accessing the content of logged traces

To access the content of logged traces, use the Opik.get_trace_content method:

python

import opik
from opik.integrations.langchain import OpikTracer
opik_client = opik.Opik()

opik_tracer = OpikTracer()

# ... call your LangChain object here with config={"callbacks": [opik_tracer]}

# Getting the content of the logged traces
traces = opik_tracer.created_traces()
for trace in traces:
    content = opik_client.get_trace_content(trace.id)
    print(content)

Updating and scoring logged traces

You can update the metadata, feedback scores, and tags of traces after they are created. Use the created_traces method to access the traces, then call the update and log_feedback_score methods on each one:

python

from opik.integrations.langchain import OpikTracer

opik_tracer = OpikTracer(project_name="langchain-examples")

# ... calling Langchain object

traces = opik_tracer.created_traces()

for trace in traces:
    trace.update(tags=["my-tag"])
    trace.log_feedback_score(name="user-feedback", value=0.5)

Compatibility with @track Decorator

The OpikTracer is fully compatible with the @track decorator, allowing you to create hybrid tracing approaches:

python

import opik
from langchain_openai import ChatOpenAI
from opik.integrations.langchain import OpikTracer

@opik.track
def my_langchain_workflow(user_input: str) -> str:
    llm = ChatOpenAI(model="gpt-4o")
    opik_tracer = OpikTracer()

    # The LangChain call will create spans within the existing trace
    response = llm.invoke(user_input, config={"callbacks": [opik_tracer]})
    return response.content

result = my_langchain_workflow("What is machine learning?")

Thread Support

Use the thread_id parameter to group related conversations or interactions:

python

from opik.integrations.langchain import OpikTracer

# All traces with the same thread_id will be grouped together
opik_tracer = OpikTracer(thread_id="user-session-123")
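In practice the thread ID is usually generated once per conversation and reused for every turn. The sketch below shows one way to do that; `new_thread_id` and `tracer_for_thread` are hypothetical helper names, and the ID format is an arbitrary choice.

```python
import uuid


def new_thread_id(user_id: str) -> str:
    """Build a unique, readable thread ID for one conversation."""
    return f"{user_id}-{uuid.uuid4().hex[:12]}"


def tracer_for_thread(thread_id: str):
    """Create an OpikTracer bound to one conversation thread."""
    # Imported lazily so the ID helper above works without opik installed
    from opik.integrations.langchain import OpikTracer

    return OpikTracer(thread_id=thread_id)


thread_id = new_thread_id("user-42")
print(thread_id)

# Reuse the same thread_id for every turn of this conversation, for example:
# tracer = tracer_for_thread(thread_id)
# chain.invoke(inputs, config={"callbacks": [tracer]})
```

Storing the thread ID alongside the session (for example, in a cookie or session store) keeps later turns grouped with earlier ones.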

Distributed Tracing

For applications that span multiple services, threads, or processes, you can use distributed tracing headers to connect traces across them:

python

from opik import opik_context
from opik.integrations.langchain import OpikTracer
from opik.types import DistributedTraceHeadersDict

# In your service that receives distributed trace headers.
# The distributed_headers dict can be obtained in the "parent" service via `opik_context.get_distributed_trace_headers()`
distributed_headers = DistributedTraceHeadersDict(
    opik_trace_id="trace-id-from-upstream",
    opik_parent_span_id="parent-span-id-from-upstream"
)

opik_tracer = OpikTracer(distributed_headers=distributed_headers)

# LangChain operations will be attached to the existing distributed trace
chain.invoke(input_data, config={"callbacks": [opik_tracer]})

<Tip>Learn more about distributed tracing in the Distributed Tracing guide.</Tip>
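When the parent service forwards the trace headers over HTTP, the receiving service has to pick them out of the incoming request. The following is a minimal sketch of that step using only the standard library. It assumes the two keys shown above (`opik_trace_id`, `opik_parent_span_id`) are sent as HTTP header names; `extract_trace_headers` is a hypothetical helper.

```python
from typing import Mapping, Optional

HEADER_KEYS = ("opik_trace_id", "opik_parent_span_id")


def extract_trace_headers(request_headers: Mapping[str, str]) -> Optional[dict]:
    """Pick the two Opik trace fields out of incoming request headers.

    Returns None unless both fields are present and non-empty.
    """
    # HTTP header names are case-insensitive, so normalise before lookup
    lowered = {k.lower(): v for k, v in request_headers.items()}
    found = {key: lowered[key] for key in HEADER_KEYS if lowered.get(key)}
    return found if len(found) == len(HEADER_KEYS) else None


incoming = {
    "Content-Type": "application/json",
    "Opik_Trace_Id": "trace-id-from-upstream",
    "Opik_Parent_Span_Id": "parent-span-id-from-upstream",
}
print(extract_trace_headers(incoming))
```

The returned dictionary has the same keys as DistributedTraceHeadersDict, so it can be used to build the `distributed_headers` argument. When the function returns None, creating the tracer without `distributed_headers` starts a fresh trace instead.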

LangGraph Integration

For LangGraph applications, Opik provides specialized support. The OpikTracer works seamlessly with LangGraph, and you can also visualize graph definitions:

python

from langgraph.graph import StateGraph
from opik.integrations.langchain import OpikTracer

# Your LangGraph setup
graph = StateGraph(...)
compiled_graph = graph.compile()

opik_tracer = OpikTracer()
result = compiled_graph.invoke(
    input_data,
    config={"callbacks": [opik_tracer]}
)

<Tip>For detailed LangGraph integration examples, see the LangGraph Integration guide.</Tip>

Advanced usage

The OpikTracer object has a flush method that can be used to make sure that all traces are logged to the Opik platform before you exit a script. This method returns once all traces have been logged or the timeout is reached, whichever comes first.

python

from opik.integrations.langchain import OpikTracer

opik_tracer = OpikTracer()
opik_tracer.flush()

Important notes

Asynchronous streaming: If you are using asynchronous streaming mode (calling .astream() method), the input field in the trace UI may be empty due to a LangChain limitation for this mode. However, you can find the input data inside the nested spans of this chain.
Streaming with cost tracking: If you use streaming with LLM calls and want token and cost calculation, you need to explicitly set stream_usage=True on the model, as shown in the Cost Tracking section.