keeper-bench benchmarks ClickHouse Keeper (or any ZooKeeper-compatible service) in two modes:
- Generated mode (config with a `generator` section).
- Replay mode (`--input-request-log`).

Replace placeholders in the commands below with paths in your environment.
# Generated workload from config
clickhouse keeper-bench \
--config <config_file>
# Replay workload from a request log
clickhouse keeper-bench \
-h localhost:9181 \
--input-request-log <request_log_file>
An example config is available at programs/keeper-bench/example.yaml.
| Flag | Short | Description |
|---|---|---|
| `--help` | | Print help and exit |
| `--config` | | YAML/XML config file |
| `--input-request-log` | | Replay requests from a request log file |
| `--setup-nodes-snapshot-path` | | Directory containing Keeper snapshots used to build initial node state for replay |
| `--concurrency` | `-c` | Number of parallel worker threads |
| `--report-delay` | `-d` | Delay between periodic reports in seconds (0 disables periodic reports) |
| `--iterations` | `-i` | Total number of requests to execute (0 means unlimited) |
| `--time-limit` | `-t` | Stop producing new requests after this many seconds |
| `--hosts` | `-h` | Host list, e.g. `-h host1:9181 host2:9181` |
| `--continue_on_errors` | | Continue running after request exceptions |

Rules:
- At least one of `--config` or `--input-request-log` must be provided.
- Hosts are taken from `--hosts` or from `connections` in config.
- If the config has no `generator`, execution uses replay mode, so `--input-request-log` must be provided.

Generated mode (`generator`)

Use when you want synthetic workload generation.

Required:
- `generator.requests` in config.
- `--hosts` or `connections`.

Optional:

- `setup` for creating the initial Keeper tree.
- `output` for JSON output.

Replay mode (`--input-request-log`)

Use when you want to replay previously recorded Keeper traffic.

Required:
- `--input-request-log`.
- `--hosts` or `connections`.

Optional:

- `--setup-nodes-snapshot-path` to build or update the initial snapshot state inferred from expected replay outcomes.

Config can be YAML or XML.

| Key | Type | Default | Notes |
|---|---|---|---|
| `concurrency` | integer | 1 | Worker threads |
| `iterations` | integer | 0 | 0 means unlimited |
| `report_delay` | float | 1.0 | Seconds between periodic reports; 0 disables |
| `timelimit` | float | 0 | Seconds to stop producing new requests; 0 disables |
| `continue_on_error` | bool | false | Continue after request exceptions |
| `queue_depth` | integer | 1 | Producer queue depth per thread (>= 1) |
| `pipeline_depth` | integer | 1 | In-flight async requests per worker (>= 1) |
| `warmup_seconds` | float | 0 | Measurement warmup window |
| `enable_tracing` | bool | false | Attach OpenTelemetry trace context |
| `connections` | object | required if `--hosts` absent | Keeper endpoints and connection settings |
| `setup` | object | optional | Data tree created before run |
| `generator` | object | optional | Request generator (enables generated mode) |
| `output` | object | optional | JSON output controls |

These reusable types are used in setup and generator sections.
`IntegerGetter`

A constant integer, or a random value drawn uniformly from a range on each use.

# constant
key: 42
# random from [10, 20]
key:
  min_value: 10
  max_value: 20

`StringGetter`

A constant string, or a random string whose length is an `IntegerGetter`.

# constant
key: "hello"
# random string with length drawn from [10, 20]
key:
  random_string:
    size:
      min_value: 10
      max_value: 20
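
The two getter forms above can be modelled in a few lines of Python. This is only an illustrative sketch of the documented semantics, not keeper-bench's implementation: the function names `resolve_integer` and `resolve_string` are hypothetical, the range is assumed to be inclusive on both ends (as the `[10, 20]` comment suggests), and the alphabet used for random strings is an assumption.

```python
import random
import string


def resolve_integer(getter, rng):
    """Resolve an IntegerGetter-style value: a constant or a {min_value, max_value} range."""
    if isinstance(getter, int):
        return getter
    # Assumption: both bounds are inclusive.
    return rng.randint(getter["min_value"], getter["max_value"])


def resolve_string(getter, rng):
    """Resolve a StringGetter-style value: a constant or {random_string: {size: IntegerGetter}}."""
    if isinstance(getter, str):
        return getter
    size = resolve_integer(getter["random_string"]["size"], rng)
    # Assumption: ASCII letters; the real tool's alphabet may differ.
    return "".join(rng.choice(string.ascii_letters) for _ in range(size))


rng = random.Random(42)
constant = resolve_string("hello", rng)
generated = resolve_string(
    {"random_string": {"size": {"min_value": 10, "max_value": 20}}}, rng
)
```

Because the range is re-sampled on each use, repeated calls with the same getter may return strings of different lengths.
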

`PathGetter`

One or more ZooKeeper paths. Paths can be explicit or expanded from children of a parent.

# explicit paths
path:
- "/path1"
- "/path2"
# children of a parent node
path:
  children_of: "/path3"

Both forms can be used together and merged into one candidate set.
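
A rough sketch of how the two forms could be merged into one candidate set. `list_children` stands in for a Keeper `list` call and is hypothetical, as is the `resolve_paths` name. The checks reflect the exceptions this README lists (paths must start with `/`, and the resulting set must not be empty); the real tool may differ in detail.

```python
def resolve_paths(explicit, children_of, list_children):
    """Merge explicit paths with the children of an optional parent node."""
    for path in explicit:
        if not path.startswith("/"):
            raise ValueError(f"Invalid path for request generator: {path}")
    candidates = list(explicit)
    if children_of is not None:
        # Children are looked up once, at startup.
        for child in list_children(children_of):
            candidates.append(f"{children_of.rstrip('/')}/{child}")
    if not candidates:
        raise ValueError("PathGetter has no paths after initialization")
    return candidates


# Stand-in for a Keeper `list` request (hypothetical data).
fake_tree = {"/path3": ["a", "b"]}
paths = resolve_paths(["/path1", "/path2"], "/path3", lambda p: fake_tree.get(p, []))
```
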
Notes:
- All paths must start with `/`.
- `children_of` is resolved at startup; if it has no children and no explicit paths are provided, an exception is raised.
- Multiple `path` keys in one section are supported when parsed by the ClickHouse config loader (Poco-style key indexing).

# number of worker threads (default: 1)
concurrency: 20
# total requests to execute; 0 = unlimited (default: 0)
iterations: 10000
# periodic report interval in seconds; 0 = disable (default: 1.0)
report_delay: 4
# stop producing new requests after this many seconds; 0 = no limit (default: 0)
timelimit: 300
# continue on request exceptions (default: false)
continue_on_error: true
# producer queue capacity multiplier per worker; must be >= 1 (default: 1)
queue_depth: 4
# max in-flight requests per worker; must be >= 1 (default: 1)
pipeline_depth: 8
# ignore stats during first N seconds, then reset counters (default: 0)
warmup_seconds: 5
# attach OpenTelemetry tracing context to requests (default: false)
enable_tracing: false
Connections are defined under top-level connections.
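- Settings placed directly under `connections` are defaults.
- `connections.host` and `connections.connection` entries define concrete endpoints.
- The number of sessions opened per endpoint is set with `sessions`.

secure: boolean # use TLS (default: false)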
operation_timeout_ms: integer # operation timeout (default: Keeper client default)
session_timeout_ms: integer # session timeout (default: Keeper client default)
connection_timeout_ms: integer # connect timeout (default: Keeper client default)
use_compression: boolean # protocol compression (default: false)
use_xid_64: boolean # use 64-bit xid (default: false)
sessions: integer # sessions per endpoint (default: 1)
connections:
  # defaults for all endpoints
  secure: true
  operation_timeout_ms: 3000
  # one session
  host: "localhost:9181"
  # two sessions with per-endpoint overrides
  connection:
    host: "localhost:9182"
    sessions: 2
    operation_timeout_ms: 2000
    session_timeout_ms: 2000
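
To make the defaults-plus-overrides model concrete, the sketch below expands the example above into one entry per session. It illustrates the documented behaviour and is not the tool's code; `expand_sessions` and the dictionary layout are hypothetical.

```python
SETTING_KEYS = (
    "secure", "operation_timeout_ms", "session_timeout_ms",
    "connection_timeout_ms", "use_compression", "use_xid_64", "sessions",
)


def expand_sessions(defaults, endpoints):
    """Return one settings dict per session, applying per-endpoint overrides to the defaults."""
    result = []
    for endpoint in endpoints:
        settings = {k: v for k, v in defaults.items() if k in SETTING_KEYS}
        settings.update(endpoint)  # per-endpoint values win over defaults
        count = settings.pop("sessions", 1)  # documented default: 1 session per endpoint
        result.extend(dict(settings) for _ in range(count))
    return result


defaults = {"secure": True, "operation_timeout_ms": 3000}
endpoints = [
    {"host": "localhost:9181"},
    {"host": "localhost:9182", "sessions": 2,
     "operation_timeout_ms": 2000, "session_timeout_ms": 2000},
]
sessions = expand_sessions(defaults, endpoints)
```

Under these assumptions the example yields three sessions: one to `localhost:9181` with the default 3000 ms operation timeout, and two to `localhost:9182` with the overridden 2000 ms timeouts.
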
setup defines nodes created before benchmarking.
Important behavior:
- `repeat` requires a random name; otherwise an exception is raised.

Structure:
node:
  name: StringGetter
  data: StringGetter # optional
  repeat: integer # optional, requires random name
  node: ... # nested children
Example:
setup:
  node:
    name: "node1"
    node:
      repeat: 4
      name:
        random_string:
          size: 20
      data: "payload"
      node:
        name:
          random_string:
            size: 10
        repeat: 2
generator controls synthetic workload generation.
generator:
  # fixed seed for reproducibility; if omitted, a random seed is generated and printed
  seed: 12345
Requests are defined in generator.requests.
- Supported request types: `create`, `set`, `get`, `list`, `multi`.
- Each request generator accepts an optional `weight` (default 1, must be >= 1).
- Nested `multi` is not allowed.

`create`

create:
  path: "/bench/creates" # PathGetter
  name_length: 10 # IntegerGetter, default: 5
  data: "payload" # StringGetter, default: empty
  remove_factor: 0.5 # in [0.0, 1.0], default: 0
When remove_factor is enabled, some operations become random removes of previously created nodes.
If unique name generation keeps colliding, the generator raises an exception after bounded retries.

`set`

set:
  path: PathGetter
  data: StringGetter

`get`

get:
  path: PathGetter

`list`

list:
  path: PathGetter

`multi`

multi:
  size: IntegerGetter # optional
  # nested request generators (`create`/`set`/`get`/`list`) with optional `weight`

Behavior:
- If `size` is set, that many subrequests are sampled.
- If `size` is omitted, one subrequest from each nested generator is included.

generator:
  seed: 42
  requests:
    create:
      path: "/test_create"
      name_length:
        min_value: 10
        max_value: 20
      remove_factor: 0.5
    multi:
      weight: 20
      size: 10
      get:
        path:
          children_of: "/test_get1"
      get:
        weight: 2
        path:
          children_of: "/test_get2"

In this example, multi is selected about 20 times more often than create.
Inside multi, gets from /test_get2 are selected about 2 times more often than gets from /test_get1.
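
The ratios quoted above follow directly from the weights. As a sanity check, the sketch below computes the selection probabilities, assuming that a generator is picked with probability proportional to its weight (the text implies this, but the exact sampling code is not shown here). `selection_probabilities` is a hypothetical helper.

```python
from fractions import Fraction


def selection_probabilities(weights):
    """Map each generator name to weight / sum(weights)."""
    total = sum(weights.values())
    return {name: Fraction(w, total) for name, w in weights.items()}


# Top level: `create` uses the default weight 1, `multi` has weight 20.
top = selection_probabilities({"create": 1, "multi": 20})
# Inside `multi`: the two `get` generators have weights 1 and 2.
inner = selection_probabilities({"get /test_get1": 1, "get /test_get2": 2})

print(top["multi"] / top["create"])                       # 20
print(inner["get /test_get2"] / inner["get /test_get1"])  # 2
```

In absolute terms, `multi` accounts for 20/21 of top-level selections and `create` for 1/21.
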
Replay mode reads requests from --input-request-log.
Behavior details:
Supported operation kinds in logs include Create, Set, Remove, Check, CheckNotExists, Sync, Get, List, Exists, Multi, and MultiRead.
Expected log columns:
- `hostname`
- `request_event_time`
- `thread_id`
- `session_id`
- `xid`
- `has_watch`
- `op_num`
- `path`
- `data`
- `is_ephemeral`
- `is_sequential`
- `response_event_time`
- `error`
- `requests_size`
- `version`

If `--setup-nodes-snapshot-path` is provided during replay, the tool can infer required initial nodes from expected outcomes and write an updated snapshot.

Periodic stderr reports (controlled by report_delay) include:
- Latency percentiles (`0, 10, ..., 90, 95, 99, 99.9, 99.99`).

Configure JSON output with:
output:
  file: "output.json"
  # or:
  file:
    path: "output.json"
    with_timestamp: true
  stdout: true
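
Once a run has written its JSON file, the results are easy to post-process. The sketch below flattens a `percentiles` array (a list of single-key objects, per the field description in this README) into a plain mapping. The sample document is made up for illustration, the exact formatting of the percent keys is an assumption, and `flatten_percentiles` is a hypothetical helper.

```python
import json


def flatten_percentiles(result):
    """Turn [{"50": 1.2}, {"99": 8.0}] into {50.0: 1.2, 99.0: 8.0}."""
    flat = {}
    for entry in result.get("percentiles", []):
        for percent, latency_ms in entry.items():
            flat[float(percent)] = latency_ms
    return flat


# Made-up sample in the documented shape; real files come from `output.file`.
sample = json.loads("""
{
  "read_results": {
    "total_requests": 1000,
    "requests_per_second": 250.0,
    "bytes_per_second": 51200.0,
    "percentiles": [{"50": 1.2}, {"99": 8.0}, {"99.9": 15.5}]
  }
}
""")

for section in ("read_results", "write_results"):
    if section in sample:  # each section is present only if such requests ran
        p = flatten_percentiles(sample[section])
        print(section, "p99 =", p[99.0], "ms")
```
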
JSON fields:
- `timestamp` (epoch milliseconds).
- `read_results` (present only if read requests exist).
- `write_results` (present only if write requests exist).
- `per_op_results` (present only if per-op stats exist).

Each result object contains:

- `total_requests`
- `requests_per_second`
- `bytes_per_second`
- `percentiles` (array of `{ "<percent>": <latency_ms> }` objects)

Common configuration exceptions:
- `No config file or hosts defined`: provide `--hosts` or `connections`.
- `Both --config and --input_request_log cannot be empty`: provide at least one mode input.
- `queue_depth must be >= 1` / `pipeline_depth must be >= 1`: set both to positive values.
- `Invalid path for request generator`: all paths must start with `/`.
- `PathGetter has no paths after initialization`: the `children_of` parent has no children and no explicit `path` entries were supplied.
- `Generator weight must be >= 1`: use positive weights only.
- `remove_factor must be in [0.0, 1.0]`: keep the probability in range.
- `Nested multi requests are not allowed`: only one `multi` level is supported.
- `Repeating node creation ..., but name is not randomly generated`: use a random name when `repeat` is set.
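
Several of these errors can be caught before launching a run. The following pre-flight check is a sketch only: it mirrors some of the messages listed above but is not derived from the tool's source, and `check_config` plus the plain-dict config layout are hypothetical.

```python
def check_config(config, hosts=None, request_log=None, has_config_file=True):
    """Return a list of problems found in a parsed keeper-bench style config dict."""
    problems = []
    if not has_config_file and not request_log:
        problems.append("provide --config or --input-request-log")
    if not hosts and "connections" not in config:
        problems.append("provide --hosts or connections")
    for key in ("queue_depth", "pipeline_depth"):
        if config.get(key, 1) < 1:
            problems.append(f"{key} must be >= 1")
    requests = config.get("generator", {}).get("requests", {})
    for name, request in requests.items():
        if request.get("weight", 1) < 1:
            problems.append(f"{name}: weight must be >= 1")
        if not 0.0 <= request.get("remove_factor", 0.0) <= 1.0:
            problems.append(f"{name}: remove_factor must be in [0.0, 1.0]")
        if name == "multi" and "multi" in request:
            problems.append("nested multi requests are not allowed")
    return problems


bad = {
    "queue_depth": 0,
    "generator": {"requests": {"create": {"path": "/a", "weight": 0, "remove_factor": 1.5}}},
}
print(check_config(bad))
```

An empty list means none of the modelled checks failed; it does not guarantee that the tool itself will accept the config.
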