Keeper Bench

programs/keeper-bench/README.md

26.4.1.1-new · 12.1 KB

keeper-bench benchmarks ClickHouse Keeper (or any ZooKeeper-compatible service) in two modes:

  • Generate requests from a workload config (generator section).
  • Replay requests from a recorded request log (--input-request-log).

Quick start

Replace placeholders in the commands below with paths in your environment.

bash
# Generated workload from config
clickhouse keeper-bench \
    --config <config_file>

# Replay workload from a request log
clickhouse keeper-bench \
    -h localhost:9181 \
    --input-request-log <request_log_file>

An example config is available at programs/keeper-bench/example.yaml.

Command-line options

| Flag | Short | Description |
|---|---|---|
| --help | | Print help and exit |
| --config | | YAML/XML config file |
| --input-request-log | | Replay requests from a request log file |
| --setup-nodes-snapshot-path | | Directory containing Keeper snapshots used to build initial node state for replay |
| --concurrency | -c | Number of parallel worker threads |
| --report-delay | -d | Delay between periodic reports in seconds (0 disables periodic reports) |
| --iterations | -i | Total number of requests to execute (0 means unlimited) |
| --time-limit | -t | Stop producing new requests after this many seconds |
| --hosts | -h | Host list, e.g. -h host1:9181 host2:9181 |
| --continue_on_errors | | Continue running after request exceptions |

Rules:

  • At least one of --config or --input-request-log must be provided.
  • Command-line values override config values for overlapping fields.
  • Hosts can come from --hosts or from connections in config.
  • If config does not define generator, execution uses replay mode, so --input-request-log must be provided.

Modes

Generated mode (generator)

Use when you want synthetic workload generation.

Required:

  • generator.requests in config.
  • At least one host from --hosts or connections.

Optional:

  • setup for creating the initial Keeper tree.
  • output for JSON output.
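
The required and optional pieces above can be combined into a small generated-mode config. The following is a minimal sketch, not a reference config: the node name bench, the sizes, and the assumption that a top-level setup node named "bench" is created at /bench are all illustrative. Because setup recursively removes its configured root paths before creating them, pick a root that is safe to delete.

```yaml
# Minimal generated-mode config (illustrative values only).
connections:
    host: "localhost:9181"

# Optional: create /bench with 10 randomly named children before the run.
# Assumption: a top-level node named "bench" ends up at path /bench.
setup:
    node:
        name: "bench"
        node:
            repeat: 10
            name:
                random_string:
                    size: 12
            data: "payload"

# Required for generated mode: at least one request generator.
generator:
    requests:
        get:
            path:
                children_of: "/bench"
```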

Replay mode (--input-request-log)

Use when you want to replay previously recorded Keeper traffic.

Required:

  • --input-request-log.
  • At least one host from --hosts or connections.

Optional:

  • --setup-nodes-snapshot-path to build or update the initial snapshot state, which is inferred from expected replay outcomes.
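
A config file can still supply endpoints and general settings for a replay run. The sketch below assumes that a config without a generator section is passed together with --input-request-log, as the rules above imply. The values are illustrative, and this README does not state whether every top-level key takes effect in replay mode.

```yaml
# Replay-mode config sketch: no generator section, so requests
# must come from --input-request-log on the command line.
concurrency: 8
report_delay: 2
continue_on_error: true

connections:
    host: "localhost:9181"
```

It would then be used as clickhouse keeper-bench --config <config_file> --input-request-log <request_log_file>.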

Configuration file

Config can be YAML or XML.

Top-level keys

| Key | Type | Default | Notes |
|---|---|---|---|
| concurrency | integer | 1 | Worker threads |
| iterations | integer | 0 | 0 means unlimited |
| report_delay | float | 1.0 | Seconds between periodic reports; 0 disables |
| timelimit | float | 0 | Stop producing new requests after this many seconds; 0 disables |
| continue_on_error | bool | false | Continue after request exceptions |
| queue_depth | integer | 1 | Producer queue depth per thread (>= 1) |
| pipeline_depth | integer | 1 | In-flight async requests per worker (>= 1) |
| warmup_seconds | float | 0 | Measurement warmup window |
| enable_tracing | bool | false | Attach OpenTelemetry trace context |
| connections | object | required if --hosts absent | Keeper endpoints and connection settings |
| setup | object | optional | Data tree created before run |
| generator | object | optional | Request generator (enables generated mode) |
| output | object | optional | JSON output controls |

Special types

These reusable types are used in setup and generator sections.

IntegerGetter

A constant integer, or a random value drawn uniformly from a range on each use.

yaml
# constant
key: 42

# random from [10, 20]
key:
    min_value: 10
    max_value: 20

StringGetter

A constant string, or a random string whose length is an IntegerGetter.

yaml
# constant
key: "hello"

# random string with length drawn from [10, 20]
key:
    random_string:
        size:
            min_value: 10
            max_value: 20

PathGetter

One or more ZooKeeper paths. Paths can be explicit or expanded from children of a parent.

yaml
# explicit paths
path:
    - "/path1"
    - "/path2"

# children of a parent node
path:
    children_of: "/path3"

Both forms can be used together; the resulting paths are merged into one candidate set.

Notes:

  • Paths must start with /.
  • children_of is resolved at startup; if it has no children and no explicit paths are provided, an exception is raised.
  • Duplicate path keys in one section are supported when the config is parsed by the ClickHouse config loader (Poco-style key indexing).

General settings

yaml
# number of worker threads (default: 1)
concurrency: 20

# total requests to execute; 0 = unlimited (default: 0)
iterations: 10000

# periodic report interval in seconds; 0 = disable (default: 1.0)
report_delay: 4

# stop producing new requests after this many seconds; 0 = no limit (default: 0)
timelimit: 300

# continue on request exceptions (default: false)
continue_on_error: true

# producer queue capacity multiplier per worker; must be >= 1 (default: 1)
queue_depth: 4

# max in-flight requests per worker; must be >= 1 (default: 1)
pipeline_depth: 8

# ignore stats during first N seconds, then reset counters (default: 0)
warmup_seconds: 5

# attach OpenTelemetry tracing context to requests (default: false)
enable_tracing: false

Connections

Connections are defined under top-level connections.

  • Values directly under connections are defaults.
  • connections.host and connections.connection entries define concrete endpoints.
  • Each endpoint can open multiple sessions via sessions.

Per-connection settings

yaml
secure: boolean                  # use TLS (default: false)
operation_timeout_ms: integer    # operation timeout (default: Keeper client default)
session_timeout_ms: integer      # session timeout (default: Keeper client default)
connection_timeout_ms: integer   # connect timeout (default: Keeper client default)
use_compression: boolean         # protocol compression (default: false)
use_xid_64: boolean              # use 64-bit xid (default: false)
sessions: integer                # sessions per endpoint (default: 1)

Example

yaml
connections:
    # defaults for all endpoints
    secure: true
    operation_timeout_ms: 3000

    # one session
    host: "localhost:9181"

    # two sessions with per-endpoint overrides
    connection:
        host: "localhost:9182"
        sessions: 2
        operation_timeout_ms: 2000
        session_timeout_ms: 2000

Setup

setup defines nodes created before benchmarking.

Important behavior:

  • Before setup creation, the tool recursively removes each configured root node path.
  • On normal shutdown, it attempts to clean these root paths again.
  • repeat requires a randomly generated name; otherwise an exception is raised.

Structure:

yaml
node:
    name: StringGetter
    data: StringGetter           # optional
    repeat: integer              # optional, requires random name
    node: ...                    # nested children

Example:

yaml
setup:
    node:
        name: "node1"
        node:
            repeat: 4
            name:
                random_string:
                    size: 20
            data: "payload"

    node:
        name:
            random_string:
                size: 10
        repeat: 2

Generator

generator controls synthetic workload generation.

yaml
generator:
    # fixed seed for reproducibility; if omitted, random seed is generated and printed
    seed: 12345

Requests are defined in generator.requests.

  • Supported request types: create, set, get, list, multi.
  • Each request entry can define weight (default 1, must be >= 1).
  • Nested multi is not allowed.

create

yaml
create:
    path: "/bench/creates"           # PathGetter
    name_length: 10                    # IntegerGetter, default: 5
    data: "payload"                   # StringGetter, default: empty
    remove_factor: 0.5                 # in [0.0, 1.0], default: 0

When remove_factor is enabled, some operations become random removes of previously created nodes. If unique name generation keeps colliding, the generator raises an exception after bounded retries.

set

yaml
set:
    path: PathGetter
    data: StringGetter
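
With concrete values filled in, a set entry might look like the sketch below. The parent path /bench/data and the size range are assumptions chosen for illustration; the keys themselves come from the PathGetter and StringGetter forms described earlier.

```yaml
set:
    # PathGetter: target any existing child of a parent node (hypothetical path)
    path:
        children_of: "/bench/data"
    # StringGetter: random payload with length drawn from [16, 64]
    data:
        random_string:
            size:
                min_value: 16
                max_value: 64
```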

get

yaml
get:
    path: PathGetter

list

yaml
list:
    path: PathGetter
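
As an illustration of a read-only mix, get and list can be combined under generator.requests with different weights. This is a sketch under assumed values: /bench/data is a hypothetical parent that must already exist and have children, and the weight of 3 is arbitrary.

```yaml
generator:
    requests:
        # read the data of random children; chosen about 3x as often as list
        get:
            weight: 3
            path:
                children_of: "/bench/data"
        # list the children of the parent itself (weight defaults to 1)
        list:
            path: "/bench/data"
```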

multi

yaml
multi:
    size: IntegerGetter              # optional
    # nested request generators (`create`/`set`/`get`/`list`) with optional `weight`

Behavior:

  • If size is set, that many subrequests are sampled.
  • If size is omitted, one subrequest from each nested generator is included.
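
The case where size is omitted can be sketched as follows. Paths and data are hypothetical; per the behavior described above, each generated multi would then hold exactly one create and one set.

```yaml
multi:
    # no `size`: one subrequest is taken from each nested generator
    create:
        path: "/bench/multi"
    set:
        path:
            children_of: "/bench/data"
        data: "updated"
```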

Example

yaml
generator:
    seed: 42
    requests:
        create:
            path: "/test_create"
            name_length:
                min_value: 10
                max_value: 20
            remove_factor: 0.5

        multi:
            weight: 20
            size: 10
            get:
                path:
                    children_of: "/test_get1"
            get:
                weight: 2
                path:
                    children_of: "/test_get2"

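In this example, multi is selected about 20 times more often than create: with weights 20 and 1, roughly 20 of every 21 top-level requests are multi. Inside multi, gets from /test_get2 are selected about twice as often as gets from /test_get1, so about two thirds of the sampled subrequests target children of /test_get2.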


Replay request log

Replay mode reads requests from --input-request-log.

Behavior details:

  • Input format and schema are auto-detected.
  • Compressed files are supported through ClickHouse format/compression detection.
  • Replay preserves per-session request ordering via executor queues.

Supported operation kinds in logs include Create, Set, Remove, Check, CheckNotExists, Sync, Get, List, Exists, Multi, and MultiRead.

Expected log columns:

  • hostname
  • request_event_time
  • thread_id
  • session_id
  • xid
  • has_watch
  • op_num
  • path
  • data
  • is_ephemeral
  • is_sequential
  • response_event_time
  • error
  • requests_size
  • version

If --setup-nodes-snapshot-path is provided during replay, the tool can infer required initial nodes from expected outcomes and write an updated snapshot.


Output

Stderr progress reports

Periodic stderr reports (controlled by report_delay) include:

  • Total read/write request counts.
  • Read/write RPS and throughput.
  • Read/write latency percentiles (0, 10, ..., 90, 95, 99, 99.9, 99.99).
  • Per-operation breakdown (requests, RPS, p50, p99).

JSON output

Configure JSON output with:

yaml
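output:
    # simple form: write JSON results to this path
    file: "output.json"
    # alternative form with options (use instead of the simple form):
    # file:
    #     path: "output.json"
    #     with_timestamp: true
    # also print JSON results to stdout
    stdout: true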

JSON fields:

  • timestamp (epoch milliseconds).
  • read_results (present only if read requests exist).
  • write_results (present only if write requests exist).
  • per_op_results (present only if per-op stats exist).

Each result object contains:

  • total_requests
  • requests_per_second
  • bytes_per_second
  • percentiles (array of { "<percent>": <latency_ms> } objects)

Troubleshooting

Common configuration exceptions:

  • No config file or hosts defined: provide --hosts or connections.
  • Both --config and --input_request_log cannot be empty: provide at least one mode input.
  • queue_depth must be >= 1 / pipeline_depth must be >= 1: set both to positive values.
  • Invalid path for request generator: all paths must start with /.
  • PathGetter has no paths after initialization: children_of parent has no children and no explicit path entries were supplied.
  • Generator weight must be >= 1: use positive weights only.
  • remove_factor must be in [0.0, 1.0]: keep probability in range.
  • Nested multi requests are not allowed: only one multi level is supported.
  • Repeating node creation ..., but name is not randomly generated: use random name when repeat is set.