tests_load/README.md
This folder contains two things:
- `tests/`: standalone CLI scripts for ad-hoc, one-off performance probes.
- `suite/<target>/`: pytest-driven load-test suites that run on a schedule
  in CI and can also be triggered manually. Each subdirectory targets a
  different system under test (currently just `python_sdk/`); future
  targets (TypeScript SDK, backend-only, etc.) would be peers.

Install Opik locally using the docker-compose deployment (docs).
The Python SDK reads its configuration from environment variables or
~/.opik.config. For a local docker-compose install:

    export OPIK_URL_OVERRIDE=http://localhost:5173/api/

## `suite/python_sdk/`

The suite covers four ingestion shapes against a local Opik installation.
Defaults are sized for a scheduled (weekly) run — not for PR checks — so
each scenario produces meaningful load. Use --load-scale to dial them
down for local smoke runs.
| File / scenario | Default volume |
|---|---|
| `test_ingestion_rate.py::test_many_traces_one_span_each` | 100k traces × 1 nested span, ~100 B payloads |
| `test_ingestion_rate.py::test_many_spans_per_trace` | 5k traces × 50 spans = 250k spans, ~100 B payloads |
| `test_heavy_payload.py::test_traces_with_one_megabyte_payload` | 500 traces × (1 MB in + 1 MB out) ≈ 1 GB |
| `test_heavy_payload.py::test_spans_with_heavy_payload` | 200 traces × 5 spans × (500 KB in + 500 KB out) ≈ 1 GB |
| `test_attachments.py::test_traces_with_explicit_attachments` | 500 traces × 2 × 50 KB attachments ≈ 50 MB, 1k uploads |
| `test_attachments.py::test_traces_with_implicit_attachments` | 500 traces × 400 KB base64 blob extracted as attachment |
| `test_bursts.py::test_burst_single_loop` | 50k traces, tight loop, no think-time |
| `test_bursts.py::test_spread_over_time` | 10k traces evenly spread over 10 minutes |
| `test_bursts.py::test_concurrent_writers_share_one_client` | 30 threads × 1k traces = 30k traces sharing one client |
| `test_dataset_items.py::test_dataset_insert_many_versions` | 50 sequential `Dataset.insert()` calls × 50 items × ~4 KB payload = 2.5k items across 50 versions. Verifies that `dataset.get_items()` round-trips the full count; this catches the multi-replica ClickHouse `COPY_VERSION_ITEMS` short-read truncation that drops items on prod (won't reproduce against single-replica localhost) |
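
As an illustration of the "evenly spread" shape in the table, here is a minimal pacing sketch. It is not the suite's actual implementation: `run_evenly_spread` and its `emit` callback are hypothetical names, and the clock and sleep functions are injectable only so the logic can be checked without waiting.

```python
import time
from typing import Callable


def run_evenly_spread(
    emit: Callable[[int], None],
    total: int,
    duration_s: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Call emit(i) for i in range(total), spaced evenly across duration_s.

    Each call is scheduled against the start time rather than the previous
    call, so a slow emit() does not push every later call further back.
    """
    interval = duration_s / total
    start = clock()
    for i in range(total):
        delay = (start + i * interval) - clock()
        if delay > 0:
            sleep(delay)
        emit(i)
```

With the default volume from the table (10k traces over 10 minutes), the interval works out to 600 s / 10,000 = 0.06 s between traces.
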
Every test:

- Logs its items, then flushes the SDK with `opik.flush_tracker()`.
- Polls `search_traces` / `search_spans` / `attachments.attachment_list`
  until the expected number of items is visible, so the run only passes if
  every logged item lands.
- Writes a per-phase record (`logging`, `flush`, `verify`, `total`) to
  `tests_load/.last_run/<test_name>.json`.

To run the suite locally, install the dependencies and point the SDK at an Opik instance:

    # Install the Opik SDK (from this repo, or `pip install opik` for released)
    pip install -e sdks/python
    # Install suite-specific deps
    pip install -r tests_load/suite/python_sdk/requirements.txt
    # Point the SDK at whichever Opik install you want to hit
    export OPIK_URL_OVERRIDE=http://localhost:5173/api/   # full local stack
    # export OPIK_URL_OVERRIDE=http://localhost:8080      # backend-only (./opik.sh --backend)
    # export OPIK_URL_OVERRIDE=https://www.comet.com/opik/api/ OPIK_API_KEY=... OPIK_WORKSPACE=...

The suite is environment-agnostic — it reads whatever OPIK_* variables are set in the shell. Configuring those is the caller's responsibility.
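
The verification step that every test performs (polling until the expected number of items is visible) can be pictured roughly as follows. This is a sketch under assumptions, not the suite's code: `wait_until_visible`, its `count_visible` callback, and the default timeout and interval are all hypothetical.

```python
import time
from typing import Callable


def wait_until_visible(
    count_visible: Callable[[], int],
    expected: int,
    timeout_s: float = 300.0,   # assumed default, not taken from the suite
    interval_s: float = 2.0,    # assumed default, not taken from the suite
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll count_visible() until it reports at least `expected` items.

    Returns the observed count on success and raises TimeoutError if the
    deadline passes first, so a run with missing items fails loudly.
    """
    deadline = clock() + timeout_s
    while True:
        seen = count_visible()
        if seen >= expected:
            return seen
        if clock() >= deadline:
            raise TimeoutError(
                f"only {seen} of {expected} items visible after {timeout_s}s"
            )
        sleep(interval_s)
```

In a real test, `count_visible` would wrap whichever query fits the scenario, for example a callable that counts the results of a `search_traces` call for the scenario's project.
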

    cd tests_load
    pytest suite/python_sdk                               # serial
    pytest suite/python_sdk -n auto --dist=worksteal      # parallel via pytest-xdist

The scheduled workflow runs with `-n 2 --dist=worksteal`. With `-n auto`
(4 workers on `ubuntu-latest`), the highest-volume ingestion-rate scenarios
were reliably OOM-killed when they coincided with other heavy tests against
the same docker-compose Opik stack on the 7 GB runner. Each scenario uses a
unique project name so worker isolation holds; the shared backend will see
meaningful concurrent load, which is itself useful coverage.
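
One simple way to get a unique project name per scenario is to append a random suffix to the test name. The helper below is a hypothetical sketch; the suite's real naming scheme may differ.

```python
import uuid


def unique_project_name(test_name: str, prefix: str = "load-test") -> str:
    """Build a project name that will not collide across xdist workers or reruns.

    The `prefix` default and the 8-character suffix length are illustrative
    choices, not values taken from the suite.
    """
    return f"{prefix}-{test_name}-{uuid.uuid4().hex[:8]}"
```

Because each worker then reads and writes only its own project, a count-based verification in one scenario is not inflated by traces logged in another.
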
Every scenario accepts a scale multiplier. Use it for quick smoke runs or heavy bake-offs:

    # Quick smoke (~10% of default volumes)
    pytest suite/python_sdk --load-scale 0.1
    # Heavy run (5x default)
    OPIK_LOAD_SCALE=5 pytest suite/python_sdk

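
To make the effect of the multiplier concrete, here is a minimal sketch of how a scale value could be resolved and applied to a default volume. `resolve_scale` and `scaled` are hypothetical helpers, and both the precedence (explicit `--load-scale` value over `OPIK_LOAD_SCALE`) and the floor of one item are assumptions rather than documented behavior.

```python
import os
from typing import Mapping, Optional


def resolve_scale(
    cli_value: Optional[float] = None,
    env: Mapping[str, str] = os.environ,
) -> float:
    """Return the load multiplier: CLI value if given, else OPIK_LOAD_SCALE, else 1.0."""
    raw = cli_value if cli_value is not None else env.get("OPIK_LOAD_SCALE", "1")
    scale = float(raw)
    if scale <= 0:
        raise ValueError(f"load scale must be positive, got {scale}")
    return scale


def scaled(default_count: int, scale: float) -> int:
    """Apply the multiplier to a default volume, never dropping below one item."""
    return max(1, round(default_count * scale))
```

Under these assumptions, a scale of 0.1 would turn the 100k-trace scenario into 10k traces, and a scale of 5 would turn the 500-trace heavy-payload scenario into 2,500 traces.
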
The GitHub Actions workflow `.github/workflows/load_tests.yml`:

- Runs on a schedule and can be triggered manually via `workflow_dispatch`.
- Uploads `.last_run/` as a build artifact.

## `tests/`

These are kept for ad-hoc experiments; they are not picked up by pytest.
Their dependencies are pinned in requirements.txt.