`x-pack/platform/packages/shared/kbn-evals-suite-significant-events/scripts/README.md`
This folder contains a small CLI + helper modules for producing repeatable Elasticsearch snapshots that can be used as datasets for Streams Significant Events evaluation and experimentation.
The snapshots are written to GCS (bucket: `significant-events-datasets`) under a run-specific base path, and each scenario snapshot contains:

- `logs*` - demo app log data generated during the run
- `sigevents-streams-features-<scenario>` - extracted features produced by Streams feature extraction (copied out of system indices so they can be snapshotted)

For each selected scenario, the `capture_sigevents_otel_demo_snapshots` script:

1. Deploys the demo app and waits for baseline traffic (`--baseline-wait`).
2. Applies the failure scenario and waits for it to generate data (`--failure-wait`).
3. Runs Streams feature extraction and copies the extracted features out of system indices.
4. Snapshots the indices, then tears down the demo app and deletes the `logs` stream.

The per-scenario deploy/teardown and data cleanup is intentional: each snapshot should be isolated and not contaminated by previous scenarios.
The demo app runs with `--cpus=4 --memory=8g`. Ensure your system can spare these resources before running the script.

Start Elasticsearch with a trial license and GCS snapshot credentials:

```
yarn es snapshot --license trial --secure-files gcs.client.default.credentials_file=/path/to/creds.json
```

Make sure an LLM connector is configured in `config/kibana.dev.yml` for feature extraction.

| Flag | Description | Default |
|---|---|---|
| `--connector-id` | **Required.** LLM connector ID used for feature extraction. | none |
| `--run-id` | Run identifier used for the snapshot repo name and GCS base path. | Today's date `YYYY-MM-DD` (local time) |
| `--scenario` | Limit to specific scenario(s). Can be repeated. Omit to run all scenarios for the selected demo app. | All scenarios |
| `--demo-app` | Demo app to use. Must be a registered demo type (see `listAvailableDemos()`). | `otel-demo` |
| `--baseline-wait` | Duration to wait for baseline traffic. Accepts `s`, `m`, `h`, `d` suffixes (e.g. `3m`, `90s`, `1h`). | `3m` |
| `--failure-wait` | Duration to wait after applying a failure scenario. Same format as `--baseline-wait`. | `5m` |
| `--dry-run` | Print what would happen without deploying, extracting, or snapshotting. | `false` |
| `--es-url` | Elasticsearch URL. | from `config/kibana.dev.yml` |
| `--es-username` | Elasticsearch username. | from `config/kibana.dev.yml` |
| `--es-password` | Elasticsearch password. | from `config/kibana.dev.yml` |
| `--kibana-url` | Kibana base URL (if omitted, built from Kibana config with basePath resolution). | from `config/kibana.dev.yml` |
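When trying flags out, `--dry-run` (from the table above) can be combined with any of the other options to preview a run without deploying, extracting, or snapshotting. The connector ID is a placeholder:

```shell
node scripts/capture_sigevents_otel_demo_snapshots.js \
  --connector-id <connectorId> \
  --dry-run
```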
Run all scenarios for the default demo app:

```
node scripts/capture_sigevents_otel_demo_snapshots.js --connector-id <connectorId>
```
Run all scenarios for a specific demo app:

```
node scripts/capture_sigevents_otel_demo_snapshots.js \
  --connector-id <connectorId> \
  --demo-app online-boutique
```
Run only selected scenarios:

```
node scripts/capture_sigevents_otel_demo_snapshots.js \
  --connector-id <connectorId> \
  --scenario healthy-baseline \
  --scenario payment-unreachable
```
What `--run-id` does: it namespaces where snapshots are stored in GCS and what the ES snapshot repository is called:

- GCS base path: `gs://significant-events-datasets/<run-id>/<demo-app>/`
- Snapshot repository: `sigevents-<run-id>`

Default: if omitted, `--run-id` defaults to today's date in `YYYY-MM-DD` format (local time).

Note: use a unique run ID (e.g. `test-run-...`) to keep multiple runs side-by-side without colliding with previous snapshots.
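As a sketch of the naming scheme, here is how a run ID and demo app map to the repository name and GCS base path (the values below are made up, not produced by the script):

```shell
# Derive the snapshot repository name and GCS base path for a run.
# RUN_ID and DEMO_APP are illustrative values.
RUN_ID="test-run-2026-02-23"
DEMO_APP="otel-demo"

REPO="sigevents-${RUN_ID}"
BASE_PATH="gs://significant-events-datasets/${RUN_ID}/${DEMO_APP}"

echo "${REPO}"       # sigevents-test-run-2026-02-23
echo "${BASE_PATH}"  # gs://significant-events-datasets/test-run-2026-02-23/otel-demo
```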
Capture with a custom run ID:

```
node scripts/capture_sigevents_otel_demo_snapshots.js \
  --connector-id <connectorId> \
  --run-id test-run-2026-02-23
```
Capture with longer baseline and failure windows:

```
node scripts/capture_sigevents_otel_demo_snapshots.js \
  --connector-id <connectorId> \
  --baseline-wait 5m \
  --failure-wait 15m
```
At the end of a run, the script logs the snapshot created for each scenario. Those snapshot names are what you will replay later in the eval setup.
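As a rough sketch, replaying a snapshot by hand would mean registering the run's GCS repository and restoring a scenario snapshot via the standard Elasticsearch snapshot APIs. The host, run ID, demo app, and snapshot name below are illustrative, and the eval setup may handle this for you:

```shell
# Illustrative only: register the GCS repo for a run, then restore one scenario.
curl -X PUT "http://localhost:9200/_snapshot/sigevents-test-run-2026-02-23" \
  -H 'Content-Type: application/json' \
  -d '{
    "type": "gcs",
    "settings": {
      "bucket": "significant-events-datasets",
      "base_path": "test-run-2026-02-23/otel-demo"
    }
  }'

# Restore a scenario snapshot (name assumed to match the scenario).
curl -X POST "http://localhost:9200/_snapshot/sigevents-test-run-2026-02-23/payment-unreachable/_restore?wait_for_completion=true"
```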
This script is designed for disposable dev data and assumes it owns the `logs*` space.

- Between scenarios it deletes `logs*` and `.kibana_streams_features`, and it disables Streams. Do not point this at a shared cluster with important logs.
- System indices such as `.kibana_streams_features` cannot be selected via the snapshot `indices` parameter, so we copy extracted features into `sigevents-streams-features-<scenario>` before snapshotting.
- Feature extraction runs over a lookback window (`from = now - 24h`), so any unrelated `logs*` data in that time range can affect extracted features and snapshots.
- `--run-id` determines the snapshot repository name (`sigevents-<run-id>`) and the GCS base path. Reusing a run ID can collide with existing snapshots (same scenario snapshot names).
- When iterating, limit runs with `--scenario` first; use `--dry-run` to verify what will be created.
- If a run fails partway, tear down the demo app (`node scripts/otel_demo.js --teardown`) and/or clean up remaining `logs*` / streams state before retrying.
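The manual cleanup after a failed run might look like the following. The delete calls are ordinary Elasticsearch APIs, but which one applies depends on whether `logs*` is backed by data streams or plain indices in your setup, so treat this as a sketch:

```shell
# Illustrative cleanup after a failed run (assumes a local dev cluster).
node scripts/otel_demo.js --teardown

# If logs* is managed as data streams:
curl -X DELETE "http://localhost:9200/_data_stream/logs*"

# Remove copied-out feature indices from earlier scenarios. Wildcard index
# deletion may require action.destructive_requires_name=false on the cluster.
curl -X DELETE "http://localhost:9200/sigevents-streams-features-*"
```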