integration-e2b (E2B Code Evaluation)

examples/integration-e2b/README.md

What This Example Demonstrates

This example shows a complete prompt→LLM→sandboxed-execution→metric pipeline using:

  • promptfoo to run LLM prompts and manage evaluation cases.
  • An LLM provider to generate Python functions from a short problem prompt.
  • e2b sandboxes (via e2b-code-interpreter) to run generated code safely.
  • An OpenAI step that generates small verification unit tests and runs them in the sandbox.
  • Per-run JSON metrics written to .promptfoo_results/ and a human-friendly markdown report produced by report.py.
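The last two stages of that pipeline (sandboxed execution, then a per-run metrics file) can be sketched roughly as follows. This is a minimal illustration, not the example's actual code: `evaluate_generated_code`, the `run_in_sandbox` callable, the result keys (`stdout`, `error`), and the file naming scheme are all assumptions made for this sketch. In the real example the runner would wrap an e2b sandbox.

```python
import json
import time
from pathlib import Path
from typing import Callable

# Hypothetical result shape: a dict with "stdout" and "error" keys.
# The example's real sandbox wrapper may return something different.
SandboxRunner = Callable[[str], dict]


def evaluate_generated_code(
    code: str,
    run_in_sandbox: SandboxRunner,
    results_dir: str = ".promptfoo_results",
) -> dict:
    """Run generated code through a sandbox runner and write one metrics file."""
    started = time.perf_counter()
    result = run_in_sandbox(code)  # e.g. a thin wrapper around an e2b sandbox
    metrics = {
        "passed": result.get("error") is None,
        "stdout": result.get("stdout", ""),
        "error": result.get("error"),
        "duration_s": round(time.perf_counter() - started, 3),
    }
    out_dir = Path(results_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # One JSON file per run; this naming scheme is an assumption.
    out_path = out_dir / f"run_{time.time_ns()}.json"
    out_path.write_text(json.dumps(metrics, indent=2))
    return metrics
```

Passing the runner in as a callable keeps the metric logic testable without an e2b key: a stub such as `lambda code: {"stdout": "", "error": None}` is enough to exercise it locally.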

You can run this example with:

bash
npx promptfoo@latest init --example integration-e2b
cd integration-e2b

Environment Variables

Set these in your shell before running the example.

bash
# Required
export E2B_API_KEY="e2b_xxx_your_key_here"        # e2b sandbox API key
export OPENAI_API_KEY="sk_xxx_your_key_here"     # OpenAI key (or your chosen LLM provider)

# Recommended
export PROMPTFOO_PYTHON="$(pwd)/.venv/bin/python"  # tell promptfoo which Python/venv to use

  • If you use a different provider in promptfooconfig.yaml, set that provider's API key instead of OPENAI_API_KEY.
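
A small preflight check can catch a missing key before any sandbox or LLM call is made. The sketch below is not part of the example; `missing_env_vars` and the two tuples are hypothetical names, and the variable lists simply mirror the ones above.

```python
import os

REQUIRED_VARS = ("E2B_API_KEY", "OPENAI_API_KEY")
RECOMMENDED_VARS = ("PROMPTFOO_PYTHON",)


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


def preflight_messages(environ=None):
    """Build human-readable messages for missing variables (empty list if all set)."""
    messages = []
    for name in missing_env_vars(REQUIRED_VARS, environ):
        messages.append(f"ERROR: required variable {name} is not set")
    for name in missing_env_vars(RECOMMENDED_VARS, environ):
        messages.append(f"NOTE: recommended variable {name} is not set")
    return messages
```

Printing `preflight_messages()` before `promptfoo eval` gives a quick summary. If you swap in another LLM provider, adjust `REQUIRED_VARS` to match.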

Prerequisites

Create and activate a Python virtual environment, then install the required packages.

bash
# create & activate venv
python -m venv .venv
source .venv/bin/activate

# install the Python package used to talk to e2b sandboxes
pip install --upgrade pip
pip install e2b-code-interpreter

# install the promptfoo CLI (requires Node.js)
npm i -g promptfoo
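
To confirm that the package landed in the interpreter promptfoo will use, a check along these lines can help. It is a sketch only: `is_installed` and `describe_install` are hypothetical helpers, and the one assumption about e2b is that `e2b_code_interpreter` is the import name of the `e2b-code-interpreter` package.

```python
import importlib.util
import sys


def is_installed(module_name: str) -> bool:
    """Return True if the current interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


def describe_install(module_name: str = "e2b_code_interpreter") -> str:
    """Report which interpreter was checked and whether the module was found."""
    status = "found" if is_installed(module_name) else "NOT found"
    return f"{sys.executable}: {module_name} {status}"
```

Run it with the same interpreter that `PROMPTFOO_PYTHON` points to, for example by calling `print(describe_install())` from inside the activated venv.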

Running the Example

Activate the virtual environment and make sure the environment variables are set:

bash
source .venv/bin/activate
export E2B_API_KEY="e2b_xxx"
export OPENAI_API_KEY="sk_xxx"
export PROMPTFOO_PYTHON="$(pwd)/.venv/bin/python"

Run the evaluation:

bash
promptfoo eval

Open the interactive viewer:

bash
promptfoo view