
🚀 Gretel to Opik Integration: Creating Q&A Datasets for Model Evaluation

The Story: You need high-quality Q&A datasets to evaluate your AI models, but creating them manually is time-consuming and expensive. This cookbook shows you how to use Gretel's synthetic data generation to create diverse, realistic Q&A datasets and import them into Opik for model evaluation and optimization.

What you'll accomplish:

  1. Generate synthetic Q&A data using Gretel Data Designer
  2. Convert it to Opik format
  3. Import into Opik for model evaluation
  4. See your dataset in the Opik UI

📋 Prerequisites

  • Gretel Account: Sign up at gretel.ai and get your API key
  • Comet Account: Sign up at comet.com for Opik access

Let's get started! 🎯

๐Ÿ› ๏ธ Two Approaches Available

This cookbook demonstrates two methods for generating synthetic data with Gretel:

  1. Data Designer (recommended for custom datasets): Create datasets from scratch with precise control
  2. Safe Synthetics (recommended for existing data): Generate synthetic versions of existing datasets

We'll start with Data Designer, then show Safe Synthetics as an alternative.

💾 Step 1: Install Required Packages

We'll install the Gretel client and Opik SDK:

python
%pip install gretel-client opik pandas --upgrade --quiet

๐Ÿ” Step 2: Authentication Setup

Let's authenticate with both Gretel and Opik:

python
import os
import getpass
import opik
import pandas as pd

print("๐Ÿ” Setting up authentication...")

# Set up Gretel API key
if "GRETEL_API_KEY" not in os.environ:
    os.environ["GRETEL_API_KEY"] = getpass.getpass("Enter your Gretel API key: ")

# Set up Opik (will prompt for API key if not configured)
opik.configure()

print("โœ… Authentication completed!")

📊 Step 3: Generate Q&A Dataset with Gretel Data Designer

Now we'll use Gretel Data Designer to generate synthetic Q&A data. We'll create questions and answers about AI and machine learning:

python
from gretel_client.navigator_client import Gretel  # Data Designer is accessed via the navigator client
from gretel_client.data_designer import columns as C
from gretel_client.data_designer import params as P

print("๐Ÿค– Setting up Q&A dataset generation with Gretel Data Designer...")

# Initialize Data Designer using the navigator_client and factory method
gretel_navigator = Gretel()  # This creates the navigator client
dd = gretel_navigator.data_designer.new(model_suite="apache-2.0")

# Add topic column (categorical sampler)
dd.add_column(
    C.SamplerColumn(
        name="topic",
        type=P.SamplerType.CATEGORY,
        params=P.CategorySamplerParams(
            values=[
                "neural networks", "deep learning", "machine learning", "NLP", 
                "computer vision", "reinforcement learning", "AI ethics", "data science"
            ]
        )
    )
)

# Add difficulty column
dd.add_column(
    C.SamplerColumn(
        name="difficulty",
        type=P.SamplerType.CATEGORY,
        params=P.CategorySamplerParams(
            values=["beginner", "intermediate", "advanced"]
        )
    )
)

# Add question column (LLM-generated)
dd.add_column(
    C.LLMTextColumn(
        name="question",
        prompt=(
            "Generate a challenging, specific question about {{ topic }} "
            "at {{ difficulty }} level. The question should be clear, focused, "
            "and something a student or practitioner might actually ask."
        )
    )
)

# Add answer column (LLM-generated)
dd.add_column(
    C.LLMTextColumn(
        name="answer",
        prompt=(
            "Provide a clear, accurate, and comprehensive answer to this {{ difficulty }}-level "
            "question about {{ topic }}: '{{ question }}'. The answer should be educational "
            "and directly address all aspects of the question."
        )
    )
)

print("๐Ÿ“Š Generating Q&A dataset...")

# Generate the dataset
workflow_run = dd.create(num_records=20, wait_until_done=True)
synthetic_df = workflow_run.dataset.df

print(f"โœ… Generated {len(synthetic_df)} Q&A pairs!")
print(f"\n๐Ÿ“Š Dataset shape: {synthetic_df.shape}")
print(f"๐Ÿ“‹ Columns: {list(synthetic_df.columns)}")

# Display first few rows
print("\n๐Ÿ“„ Sample data:")
synthetic_df.head(3)

🔄 Step 4: Convert to Opik Format

Let's convert our Gretel-generated data to the format Opik expects:

python
def convert_to_opik_format(df):
    """Convert Gretel Q&A data to Opik dataset format"""
    opik_items = []
    
    for _, row in df.iterrows():
        # Create Opik dataset item
        item = {
            "input": {
                "question": row["question"]
            },
            "expected_output": row["answer"],
            "metadata": {
                "topic": row.get("topic", "AI/ML"),
                "difficulty": row.get("difficulty", "unknown"),
                "source": "gretel_navigator"
            }
        }
        opik_items.append(item)
    
    return opik_items

print("๐Ÿ”„ Converting to Opik format...")

opik_data = convert_to_opik_format(synthetic_df)

print(f"โœ… Converted {len(opik_data)} items to Opik format!")
print("\n๐Ÿ“‹ Sample converted item:")
import json
print(json.dumps(opik_data[0], indent=2))

📤 Step 5: Push Dataset to Opik

Now let's upload our dataset to Opik where it can be used for model evaluation:

python
print("๐Ÿ“ค Pushing dataset to Opik...")

# Initialize Opik client
opik_client = opik.Opik()

# Create the dataset
dataset_name = "gretel-ai-qa-dataset"
dataset = opik_client.get_or_create_dataset(
    name=dataset_name,
    description="Synthetic Q&A dataset generated using Gretel Data Designer for AI/ML evaluation"
)

# Insert the data
dataset.insert(opik_data)

print(f"โœ… Successfully created dataset: {dataset.name}")
print(f"๐Ÿ†” Dataset ID: {dataset.id}")
print(f"๐Ÿ“Š Total items: {len(opik_data)}")

The dataset can now be viewed in the Opik UI.

✅ Step 6: Verify Your Dataset

Let's confirm the dataset was created successfully and see how to use it:

python
print("๐Ÿ” Verifying dataset creation...")

# Try to retrieve the dataset
try:
    retrieved_dataset = opik_client.get_dataset(dataset_name)
    print(f"โœ… Dataset verified: {retrieved_dataset.name}")
    print(f"๐Ÿ†” Dataset ID: {retrieved_dataset.id}")
    
    print(f"\n๐ŸŽฏ Next steps:")
    print(f"1. Go to https://www.comet.com")
    print(f"2. Navigate to Opik โ†’ Datasets")
    print(f"3. Find your dataset: {dataset_name}")
    print(f"4. Use it to evaluate your AI models!")
    
except Exception as e:
    print(f"โŒ Could not verify dataset: {e}")
    print("Please check your Opik configuration and try again.")

🧪 Step 7: Example Model Evaluation

Here's how you can use your new dataset to evaluate a model with Opik:

python
# Example: Simple Q&A model evaluation
@opik.track
def simple_qa_model(input_data):
    """A simple example model that generates responses to questions"""
    question = input_data.get('question', '')
    
    # This is just an example - replace with your actual model
    if 'neural network' in question.lower():
        return "A neural network is a computational model inspired by biological neural networks."
    elif 'machine learning' in question.lower():
        return "Machine learning is a subset of AI that enables systems to learn from data."
    else:
        return "This is a complex AI/ML topic that requires detailed explanation."

print("๐Ÿงช Example model evaluation setup:")
print(f"Dataset: {dataset_name}")
print("Model: simple_qa_model (replace with your actual model)")
print("\n๐Ÿ’ก To run evaluation, uncomment and run the following code:")
print("\n๐ŸŽ‰ Integration complete! Your Gretel-generated dataset is ready for model evaluation in Opik.")

Congratulations! 🎉 You've successfully:

  1. Generated synthetic Q&A data using Gretel Data Designer's advanced column types
  2. Converted the data to Opik's expected format
  3. Created a dataset in Opik for model evaluation
  4. Set up the foundation for AI model testing and optimization

The key advantage of Gretel Data Designer is its modular approach: you define exactly the data you want using samplers (for categorical fields) and LLM columns (for generated text), giving you precise control over your synthetic dataset. The sketch below shows how the same pattern extends to additional columns.
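
For example, a short sketch that reuses only the column types demonstrated in Step 3; the `audience` and `hint` columns are hypothetical additions:

python
# Hypothetical extension of the Step 3 design: one more categorical sampler
# plus an LLM column that references earlier columns via the same template syntax.
dd.add_column(
    C.SamplerColumn(
        name="audience",  # hypothetical column
        type=P.SamplerType.CATEGORY,
        params=P.CategorySamplerParams(values=["student", "engineer", "researcher"]),
    )
)

dd.add_column(
    C.LLMTextColumn(
        name="hint",  # hypothetical column
        prompt=(
            "Write a one-sentence hint that helps a {{ audience }} begin answering "
            "this {{ difficulty }}-level question about {{ topic }}: '{{ question }}'."
        ),
    )
)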


🔗 Next Steps

  • View your dataset: Go to your Comet workspace → Opik → Datasets
  • Evaluate models: Use the dataset to test your Q&A models
  • Optimize prompts: Use Opik's Agent Optimizer with your synthetic data
  • Scale up: Generate larger datasets for more comprehensive testing


Happy evaluating! 🚀

🔄 Alternative: Using Gretel Safe Synthetics

If you have an existing Q&A dataset and want to create a synthetic version, you can use Gretel Safe Synthetics instead:

python
%%capture
%pip install -U gretel-client

Step A: Prepare Sample Data

python
import pandas as pd
from gretel_client.navigator_client import Gretel

# Initialize Gretel client
gretel = Gretel(api_key="prompt")

# Option 1: Use Gretel's sample ecommerce dataset (200+ records; shown for reference, not used below)
my_data_source = "https://gretel-datasets.s3.us-west-2.amazonaws.com/ecommerce_customers.csv"

# Option 2: Create your own Q&A dataset (needs 200+ records for holdout)
# For demonstration, we'll create a larger dataset
sample_questions = [
    'What is machine learning?',
    'How do neural networks work?',
    'What is the difference between AI and ML?',
    'Explain deep learning concepts',
    'What are the applications of NLP?'
] * 50  # Repeat to get 250 records

sample_answers = [
    'Machine learning is a subset of AI that enables systems to learn from data.',
    'Neural networks are computational models inspired by biological neural networks.',
    'AI is the broader concept while ML is a specific approach to achieve AI.',
    'Deep learning uses multi-layer neural networks to model complex patterns.',
    'NLP applications include chatbots, translation, sentiment analysis, and text generation.'
] * 50  # Repeat to get 250 records

sample_data = {
    'question': sample_questions,
    'answer': sample_answers,
    'topic': (['ML', 'Neural Networks', 'AI/ML', 'Deep Learning', 'NLP'] * 50),
    'difficulty': (['beginner', 'intermediate', 'beginner', 'advanced', 'intermediate'] * 50)
}

original_df = pd.DataFrame(sample_data)
print(f"๐Ÿ“„ Original dataset: {len(original_df)} records")
print(original_df.head())

# Important: Gretel requires at least 200 records to use a holdout
if len(original_df) < 200:
    print("⚠️ Warning: Dataset has fewer than 200 records. Holdout will be disabled.")

Step B: Generate Synthetic Version

python
# For a quick demo, disable the holdout and generate just a handful of records
synthetic_dataset = gretel.safe_synthetic_dataset \
    .from_data_source(original_df, holdout=None) \
    .synthesize(num_records=5) \
    .create()

# Wait for completion and get results
synthetic_dataset.wait_until_done()
synthetic_df_safe = synthetic_dataset.dataset.df

print(f"โœ… Generated {len(synthetic_df_safe)} synthetic Q&A pairs using Safe Synthetics!")
print(synthetic_df_safe.head())

Step C: View Results and Quality Report

python
# Preview synthetic data
print("๐Ÿ” Synthetic dataset preview:")
print(synthetic_dataset.dataset.df.head())

# View quality report table
print("๐Ÿ“Š Quality Report Summary:")
print(synthetic_dataset.report.table)

# View detailed HTML report in notebook
# synthetic_dataset.report.display_in_notebook()

# Access workflow details
print("\n๐Ÿ”ง Workflow Configuration:")
print(synthetic_dataset.config_yaml)

# List all workflow steps
print("\n๐Ÿ“‹ Workflow Steps:")
for step in synthetic_dataset.steps:
    print(f"- {step.name}")

Step D: Convert to Opik and Upload

python
# Same helper as in Step 4, repeated so this alternative section runs standalone
def convert_to_opik_format(df):
    """Convert Gretel Q&A data to Opik dataset format"""
    opik_items = []
    
    for _, row in df.iterrows():
        # Create Opik dataset item
        item = {
            "input": {
                "question": row["question"]
            },
            "expected_output": row["answer"],
            "metadata": {
                "topic": row.get("topic", "AI/ML"),
                "difficulty": row.get("difficulty", "unknown"),
                "source": "gretel_navigator"
            }
        }
        opik_items.append(item)
    
    return opik_items

# Initialize Opik client if not already defined
opik_client = opik.Opik()
# Convert and upload to Opik (same process as before)
opik_data_safe = convert_to_opik_format(synthetic_df_safe)

# Create dataset in Opik
dataset_safe = opik_client.get_or_create_dataset(
    name="gretel-safe-synthetics-qa-dataset",
    description="Synthetic Q&A dataset generated using Gretel Safe Synthetics"
)

dataset_safe.insert(opik_data_safe)
print(f"โœ… Safe Synthetics dataset created: {dataset_safe.name}")

The dataset can now be viewed in the Opik UI.

🚨 Important: Dataset Size Requirements

| Dataset Size | Holdout Setting | Example |
|---|---|---|
| < 200 records | holdout=None | from_data_source(df, holdout=None) |
| 200+ records | Default (5%) or custom | from_data_source(df) or from_data_source(df, holdout=0.1) |
| Large datasets | Custom percentage or count | from_data_source(df, holdout=250) |
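
The same settings expressed as code, as a sketch (`df` stands for your source DataFrame; each call returns a builder you would chain with .synthesize() and .create() as in Step B):

python
# Sketch of the holdout settings from the table above (df = your DataFrame).

# Fewer than 200 records: the holdout must be disabled explicitly.
gretel.safe_synthetic_dataset.from_data_source(df, holdout=None)

# 200+ records: the default 5% holdout applies.
gretel.safe_synthetic_dataset.from_data_source(df)

# Custom holdout: a fraction (10%) or a fixed record count.
gretel.safe_synthetic_dataset.from_data_source(df, holdout=0.1)
gretel.safe_synthetic_dataset.from_data_source(df, holdout=250)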

🤔 When to Use Which Approach?

| Use Case | Recommended Approach | Why |
|---|---|---|
| Creating new datasets from scratch | Data Designer | More control, custom column types, guided generation |
| Synthesizing existing datasets | Safe Synthetics | Preserves statistical relationships, privacy-safe |
| Custom data structures | Data Designer | Flexible column definitions, template system |
| Production data replication | Safe Synthetics | Maintains data utility while ensuring privacy |

Both approaches integrate seamlessly with Opik for model evaluation! 🎯