@kbn/es-snapshot-loader

Load Elasticsearch snapshots for testing environments. Provides three operations:

create - Create a snapshot in a writable Elasticsearch repository (gcs or fs)
restore - Basic snapshot restore directly to Elasticsearch
replay - Restore with timestamp transformation for data streams, making historical data appear fresh

Repository Types

@kbn/es-snapshot-loader supports three repository strategies:

url (default): read-only URL repositories backed by file:// paths
gcs: Google Cloud Storage repositories
fs: File system repositories backed by a shared/local filesystem path

Repository support matrix:

Repository Type	Restore/Replay	Create
`url`	Yes	No (read-only in ES)
`gcs`	Yes	Yes
`fs`	Yes	Yes

URL Repository (`file://`)

For URL repositories, Elasticsearch must be started with path.repo configured to allow URL-based repository registration.

When starting Elasticsearch for development, configure the snapshot repository path:

bash

yarn es snapshot --E path.repo="/tmp/es-snapshots"

GCS Repository

For GCS repositories, Elasticsearch must be configured with GCS credentials in the keystore using a credentials file setting such as gcs.client.default.credentials_file.

With the Scout evals_tracing server config, this is handled automatically when GCS_CREDENTIALS is set. The config writes the env var value to a temp file and passes gcs.client.default.credentials_file=<temp-file> so kbn-es routes it through keystore add-file.

FS Repository

For FS repositories, Elasticsearch must have path.repo configured in elasticsearch.yml, and the configured repository location must be under one of the allowed path.repo directories.
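The containment rule above can be checked before registering a repository. The following is a minimal sketch, assuming a hypothetical helper named isUnderPathRepo that is not part of @kbn/es-snapshot-loader. It compares paths lexically only, so it does not account for symlinks or for any extra checks Elasticsearch performs on the server.

```typescript
import * as path from 'node:path';

// Hypothetical helper, not part of @kbn/es-snapshot-loader: returns true when
// `location` is equal to, or nested under, one of the `pathRepo` directories.
function isUnderPathRepo(location: string, pathRepo: string[]): boolean {
  const target = path.resolve(location);
  return pathRepo.some((dir) => {
    const rel = path.relative(path.resolve(dir), target);
    // '' means the same directory. A result of '..', one starting with '../',
    // or an absolute path means the target sits outside the allowed directory.
    const escapes = rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel);
    return !escapes;
  });
}

console.log(isUnderPathRepo('/tmp/es-snapshots/my-repository', ['/tmp/es-snapshots']));
console.log(isUnderPathRepo('/tmp/es-backups', ['/tmp/es-snapshots']));
```

A check like this only catches obvious misconfiguration early. Elasticsearch remains the authority on whether a location is accepted.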

Creating Snapshots

Before using restore or replay, you need a snapshot. This section covers how to create snapshots compatible with @kbn/es-snapshot-loader.

1. Register a Snapshot Repository

First, register a file system repository. The location must be within the path.repo directory configured when starting Elasticsearch:

bash

curl -X PUT "http://elastic:changeme@localhost:9200/_snapshot/my-repository" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "fs",
    "settings": {
      "location": "/tmp/es-snapshots/my-repository"
    }
  }'

2. Create the Snapshot

Create a snapshot with only the indices you need. Avoid snapshotting global state or all indices unless your dataset requires it:

bash

curl -X PUT "http://elastic:changeme@localhost:9200/_snapshot/my-repository/snapshot_1?wait_for_completion=true" \
  -H "Content-Type: application/json" \
  -d '{
    "indices": "logs-*,metrics-*,traces-*",
    "include_global_state": false
  }'

How This Fits with Restore/Replay

Once you have a snapshot:

For restore: Use --snapshot-url file:///tmp/es-snapshots/my-repository pointing to your repository directory. The loader will register this as a read-only URL repository and restore indices directly.
For replay: Same as restore, but replay is designed for data streams (logs-*, metrics-*, traces-*). It shifts timestamps so that historical data appears recent, which is useful for testing time-sensitive features.

The snapshot repository path you used when creating the snapshot becomes the --snapshot-url (as a file:// URL) when restoring or replaying.
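As a small illustration of that mapping, the sketch below converts a repository directory into a file:// URL using Node's built-in pathToFileURL. The helper name toSnapshotUrl is hypothetical, and the directory is only an example that assumes a POSIX-style absolute path.

```typescript
import { pathToFileURL } from 'node:url';

// Hypothetical helper: turn the repository directory used when the snapshot
// was created into the value passed to --snapshot-url.
function toSnapshotUrl(repositoryDir: string): string {
  return pathToFileURL(repositoryDir).href;
}

// Example directory only; substitute the location you registered.
console.log(toSnapshotUrl('/tmp/es-snapshots/my-repository'));
```

Using pathToFileURL rather than string concatenation keeps characters such as spaces correctly percent-encoded.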

CLI Usage

Create

Create a snapshot in a writable repository:

bash

# Create to GCS using API key auth
node scripts/es_snapshot_loader create \
  --repo-type gcs \
  --gcs-bucket my-snapshots \
  --gcs-base-path evals/2026-03 \
  --snapshot-name my-snapshot-2026-03-02 \
  --es-url https://my-cluster.elastic.cloud:443 \
  --es-api-key <base64-encoded-api-key> \
  --indices "logs-*,metrics-*"

# Create to a local/shared filesystem repository
node scripts/es_snapshot_loader create \
  --repo-type fs \
  --fs-location /mount/backups/my-snapshot \
  --snapshot-name local-snapshot-001 \
  --es-url http://elastic:changeme@localhost:9200

Restore

Restore a snapshot directly to Elasticsearch:

bash

node scripts/es_snapshot_loader restore \
  --snapshot-url file:///path/to/snapshot \
  --es-url http://elastic:changeme@localhost:9200

# Restore from GCS repository
node scripts/es_snapshot_loader restore \
  --repo-type gcs \
  --gcs-bucket obs-ai-datasets \
  --gcs-base-path otel-demo/payment-service-failures \
  --es-url http://elastic:changeme@localhost:9200

# Restore a specific snapshot by name
node scripts/es_snapshot_loader restore \
  --snapshot-url file:///path/to/snapshot \
  --snapshot-name my-snapshot-2025-12-01 \
  --es-url http://elastic:changeme@localhost:9200

# With index filtering
node scripts/es_snapshot_loader restore \
  --snapshot-url file:///path/to/snapshot \
  --es-url http://elastic:changeme@localhost:9200 \
  --indices "my-index-*,other-index-*"

# Restore into temporary indices
node scripts/es_snapshot_loader restore \
  --repo-type gcs \
  --gcs-bucket obs-ai-datasets \
  --gcs-base-path otel-demo/snapshots \
  --es-url http://elastic:changeme@localhost:9200 \
  --indices "my-features-*" \
  --rename-pattern "(.+)" \
  --rename-replacement "tmp-$1"

# Restore with allow-no-matches (succeed even if no indices match)
node scripts/es_snapshot_loader restore \
  --snapshot-url file:///path/to/snapshot \
  --es-url http://elastic:changeme@localhost:9200 \
  --indices "optional-index-*" \
  --allow-no-matches

Replay

Restore data streams with timestamp transformation. The most recent record in the snapshot will appear as "now", with all other timestamps adjusted relative to it:

bash

node scripts/es_snapshot_loader replay \
  --snapshot-url file:///path/to/snapshot \
  --es-url http://elastic:changeme@localhost:9200 \
  --patterns "logs-*,metrics-*,traces-*"

# Replay from GCS repository
node scripts/es_snapshot_loader replay \
  --repo-type gcs \
  --gcs-bucket obs-ai-datasets \
  --gcs-base-path otel-demo/payment-service-failures \
  --es-url http://elastic:changeme@localhost:9200 \
  --patterns "logs-*,metrics-*,traces-*"

# Replay a specific snapshot by name
node scripts/es_snapshot_loader replay \
  --snapshot-url file:///path/to/snapshot \
  --snapshot-name my-snapshot-2025-12-01 \
  --es-url http://elastic:changeme@localhost:9200 \
  --patterns "logs-*,metrics-*,traces-*"

# Proxy Elasticsearch requests through Kibana instead of connecting directly
node scripts/es_snapshot_loader replay \
  --snapshot-url file:///path/to/snapshot \
  --kibana-url http://localhost:5601 \
  --patterns "logs-*,metrics-*,traces-*"

Common Options

Flag	Description
`--repo-type`	Repository type (`url`, `gcs`, or `fs`; default: `url`)
`--snapshot-url`	URL snapshot directory for `url` repositories (`file://...`)
`--gcs-bucket`	GCS bucket name (required when using `gcs`)
`--gcs-base-path`	Optional base path in the GCS bucket
`--gcs-client`	Optional Elasticsearch GCS client name
`--fs-location`	FS repository location (required when using `fs`)
`--fs-compress`	Enable compression for FS repository snapshots
`--snapshot-name`	Snapshot name. For `create`, the name of the snapshot to create. For `restore` and `replay`, the snapshot to load (default: latest SUCCESS snapshot in the repository)
`--es-url`	Elasticsearch URL with credentials (e.g., `http://elastic:changeme@localhost:9200`)
`--es-api-key`	Base64-encoded Elasticsearch API key. Overrides credentials embedded in `--es-url`
`--kibana-url`	Kibana URL for ES requests proxied through Kibana (e.g., `http://localhost:5601`)

Notes:

--es-api-key can be used with create, restore, and replay.
Auth precedence is --es-api-key > credentials in --es-url > no auth.
--snapshot-url, --gcs-*, and --fs-* repository flags are mutually exclusive.
--repo-type url requires --snapshot-url.
--repo-type gcs requires --gcs-bucket.
--repo-type fs requires --fs-location.
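The rules in these notes can be sketched as two small functions. This is an illustration only: validateRepoFlags, resolveAuth, and the RepoFlags shape are hypothetical names, and the real CLI may check flags in a different order or word its errors differently.

```typescript
type RepoType = 'url' | 'gcs' | 'fs';

// Hypothetical parsed-flag shape; field names are assumptions.
interface RepoFlags {
  repoType?: RepoType;
  snapshotUrl?: string;
  gcsBucket?: string;
  gcsBasePath?: string;
  gcsClient?: string;
  fsLocation?: string;
  fsCompress?: boolean;
}

// Returns a list of problems; an empty list means the flags are consistent.
function validateRepoFlags(flags: RepoFlags): string[] {
  const errors: string[] = [];
  const repoType: RepoType = flags.repoType ?? 'url';

  // Which flag families were supplied at all.
  const supplied: Record<RepoType, boolean> = {
    url: flags.snapshotUrl !== undefined,
    gcs: [flags.gcsBucket, flags.gcsBasePath, flags.gcsClient].some((v) => v !== undefined),
    fs: flags.fsLocation !== undefined || flags.fsCompress !== undefined,
  };
  const used = (Object.keys(supplied) as RepoType[]).filter((k) => supplied[k]);
  if (used.length > 1) {
    errors.push(`Repository flags are mutually exclusive, got: ${used.join(', ')}`);
  }

  if (repoType === 'url' && !flags.snapshotUrl) errors.push('--repo-type url requires --snapshot-url');
  if (repoType === 'gcs' && !flags.gcsBucket) errors.push('--repo-type gcs requires --gcs-bucket');
  if (repoType === 'fs' && !flags.fsLocation) errors.push('--repo-type fs requires --fs-location');
  return errors;
}

type Auth =
  | { kind: 'apiKey'; apiKey: string }
  | { kind: 'basic'; username: string; password: string }
  | { kind: 'none' };

// Precedence: --es-api-key > credentials in --es-url > no auth.
function resolveAuth(esUrl: string, esApiKey?: string): Auth {
  if (esApiKey) return { kind: 'apiKey', apiKey: esApiKey };
  const { username, password } = new URL(esUrl);
  if (username) {
    return {
      kind: 'basic',
      username: decodeURIComponent(username),
      password: decodeURIComponent(password),
    };
  }
  return { kind: 'none' };
}

console.log(validateRepoFlags({ repoType: 'fs' }));
console.log(resolveAuth('http://elastic:changeme@localhost:9200').kind);
```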

Restore-specific Options

Flag	Description
`--indices`	Comma-separated index patterns to restore
`--rename-pattern`	Regex applied to index names during restore (ES `rename_pattern`). Must pair with `--rename-replacement`
`--rename-replacement`	Replacement string for renamed indices (ES `rename_replacement`). Must pair with `--rename-pattern`
`--allow-no-matches`	When set, a restore that matches no indices succeeds silently instead of throwing an error
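To preview what a rename pair would do to index names before running a restore, something like the sketch below may help. The previewRename helper is hypothetical. Elasticsearch applies rename_pattern on the server with Java regular expressions, so this JavaScript approximation should only be trusted for simple patterns such as the one shown.

```typescript
// Hypothetical preview helper, not part of @kbn/es-snapshot-loader.
// Maps each original index name to the name it would have after the rename.
function previewRename(
  indexNames: string[],
  renamePattern: string,
  renameReplacement: string
): Record<string, string> {
  const regex = new RegExp(renamePattern, 'g');
  const preview: Record<string, string> = {};
  for (const name of indexNames) {
    // $1, $2, ... in the replacement refer to capture groups, as in ES.
    preview[name] = name.replace(regex, renameReplacement);
  }
  return preview;
}

console.log(previewRename(['my-features-a', 'my-features-b'], '(.+)', 'tmp-$1'));
```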

Replay-specific Options

Flag	Description
`--patterns`	Comma-separated data stream patterns to replay (required)
`--concurrency`	Number of indices to reindex in parallel (default: all at once)

Programmatic API

Basic Restore

typescript

import { createUrlRepository, restoreSnapshot } from '@kbn/es-snapshot-loader';

const result = await restoreSnapshot({
  esClient,
  log,
  repository: createUrlRepository('file:///path/to/snapshot'),
  snapshotName: 'my-snapshot-2025-12-01',
  indices: ['my-index-*'],
});

if (result.success) {
  console.log(`Restored ${result.restoredIndices.length} indices`);
}

Restore with Rename and Allow No Matches

typescript

import { createGcsRepository, restoreSnapshot } from '@kbn/es-snapshot-loader';

const result = await restoreSnapshot({
  esClient,
  log,
  repository: createGcsRepository({ bucket: 'my-bucket', basePath: 'snapshots' }),
  snapshotName: 'my-snapshot',
  indices: ['features-*'],
  renamePattern: '(.+)',
  renameReplacement: 'tmp-$1',
  allowNoMatches: true,
});

if (result.success) {
  console.log(`Restored ${result.restoredIndices.length} indices to temp location`);
}

Replay with Timestamp Transformation

typescript

import { createUrlRepository, replaySnapshot } from '@kbn/es-snapshot-loader';

const result = await replaySnapshot({
  esClient,
  log,
  repository: createUrlRepository('file:///path/to/snapshot'),
  snapshotName: 'my-snapshot-2025-12-01',
  patterns: ['logs-*', 'metrics-*', 'traces-*'],
  concurrency: 5, // optional: limit parallel reindex operations
});

if (result.success) {
  console.log(`Reindexed ${result.reindexedIndices?.length} indices`);
  console.log(`Max timestamp: ${result.maxTimestamp}`);
}

Replay from GCS Programmatically

typescript

import { createGcsRepository, replaySnapshot } from '@kbn/es-snapshot-loader';

const result = await replaySnapshot({
  esClient,
  log,
  repository: createGcsRepository({
    bucket: 'obs-ai-datasets',
    basePath: 'otel-demo/payment-service-failures',
  }),
  snapshotName: 'my-snapshot-2025-12-01',
  patterns: ['logs-*', 'metrics-*', 'traces-*'],
});

Using in Test Hooks

typescript

import { Client } from '@elastic/elasticsearch';
import { createUrlRepository, replaySnapshot } from '@kbn/es-snapshot-loader';

describe('my test suite', () => {
  beforeAll(async () => {
    const esClient = new Client({
      node: 'http://localhost:9200',
      auth: { username: 'elastic', password: 'changeme' },
    });

    await replaySnapshot({
      esClient,
      log: console, // or your test logger
      repository: createUrlRepository('file:///fixtures/otel-demo-snapshot'),
      snapshotName: 'otel-demo-snapshot-2025-12-01',
      patterns: ['logs-*', 'metrics-*', 'traces-*'],
    });
  });

  // ... tests that use the replayed data
});

Prerequisites

For Restore

URL repositories:
- Elasticsearch must have path.repo configured to allow URL-based repository registration
- The snapshot must be accessible at the specified file:// URL
GCS repositories:
- Elasticsearch must be configured with GCS credentials in keystore (for example gcs.client.default.credentials_file)
- The configured GCS client must be able to access the snapshot bucket/base path
FS repositories:
- Elasticsearch must have path.repo configured in elasticsearch.yml
- The configured --fs-location must be included under the allowed path.repo paths

For Create

GCS repositories:
- Elasticsearch must be configured with GCS credentials in keystore (for example gcs.client.default.credentials_file)
- The configured GCS client must have write access to the target bucket/base path
FS repositories:
- Elasticsearch must have path.repo configured in elasticsearch.yml
- The configured --fs-location must be included under the allowed path.repo paths
URL repositories are not supported for create because URL repositories are read-only in Elasticsearch

For Replay

All prerequisites for restore, plus:
Index templates for the target data streams must exist. Either pre-create them, or install Fleet or the required integrations in Kibana.
Without these templates, replay may create regular indices instead of data streams
To check templates:
- GET _index_template/*?filter_path=index_templates.name,index_templates.index_template.index_patterns,index_templates.index_template.data_stream
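Once you have the response from that request, a short script can separate templates that declare a data stream from those that would produce regular indices. The sketch below assumes the response's index_templates array has already been fetched and parsed. The partitionTemplates helper and the sample entries are hypothetical.

```typescript
// Shape of one entry in the filtered GET _index_template response.
interface IndexTemplateEntry {
  name: string;
  index_template: {
    index_patterns: string[];
    data_stream?: Record<string, unknown>;
  };
}

// Hypothetical helper: split template names by whether the template
// declares a data_stream section.
function partitionTemplates(entries: IndexTemplateEntry[]): {
  dataStream: string[];
  regular: string[];
} {
  const dataStream: string[] = [];
  const regular: string[] = [];
  for (const { name, index_template: tpl } of entries) {
    (tpl.data_stream !== undefined ? dataStream : regular).push(name);
  }
  return { dataStream, regular };
}

// Sample entries for illustration only.
const sampleTemplates: IndexTemplateEntry[] = [
  { name: 'logs', index_template: { index_patterns: ['logs-*-*'], data_stream: {} } },
  { name: 'my-plain-template', index_template: { index_patterns: ['plain-*'] } },
];
console.log(partitionTemplates(sampleTemplates));
```

If a pattern you plan to replay is only covered by templates in the regular group, replay is likely to create plain indices for it.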

How Replay Works

1. Register the configured snapshot repository (URL, GCS, or FS)
2. Retrieve snapshot metadata (uses --snapshot-name if provided; otherwise picks the latest SUCCESS snapshot)
3. Restore indices to temporary locations (prefixed with snapshot-loader-temp-)
4. Query restored indices to find the latest @timestamp in the data
5. Create an ingest pipeline that transforms @timestamp fields:
   - The latest timestamp from the data becomes "now"
   - All other timestamps are adjusted by the same offset, preserving relative timing
6. Reindex through the pipeline to the target data streams
7. Clean up temporary indices, pipeline, and repository
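The timestamp adjustment in the steps above amounts to adding one constant offset to every document. The sketch below shows that arithmetic in isolation. It is not the package's implementation, which runs inside an Elasticsearch ingest pipeline, and shiftTimestamps is a hypothetical name.

```typescript
// Hypothetical re-implementation of the offset logic, for illustration only.
// Shifts every timestamp so that the latest one lands exactly on `now`.
function shiftTimestamps(timestamps: string[], now: Date): string[] {
  const maxMs = Math.max(...timestamps.map((t) => Date.parse(t)));
  const offsetMs = now.getTime() - maxMs;
  // The same offset is applied to every value, so gaps between events are preserved.
  return timestamps.map((t) => new Date(Date.parse(t) + offsetMs).toISOString());
}

const shifted = shiftTimestamps(
  ['2025-12-01T10:00:00.000Z', '2025-12-01T10:05:00.000Z'],
  new Date('2026-03-02T12:00:00.000Z')
);
console.log(shifted);
// → [ '2026-03-02T11:55:00.000Z', '2026-03-02T12:00:00.000Z' ]
```

The five-minute gap between the two sample events is unchanged after the shift, which is what keeps rate and latency calculations meaningful on replayed data.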
