TSBS guide for QuestDB

This guide supplements the documentation for the Time Series Benchmark Suite (TSBS), which is used to benchmark several time series databases. It explains how TSBS data is generated and describes the additional flags available in the data importer (tsbs_load_questdb). Read the TSBS README before this guide.

Data format

Data generated by tsbs_generate_data is in InfluxDB line protocol format where each reading is composed of the following:

  • the table name followed by a comma
  • several comma-separated items of tags in the format <label>=<value> followed by a space
  • several comma-separated items of fields in the format <label>=<value> followed by a space
  • a timestamp for the record, in nanoseconds since the Unix epoch
  • a newline character \n

An example reading from the iot use case looks like the following:

```text
diagnostics,name=truck_3985,fleet=West,driver=Seth,model=H-2,device_version=v1.5 load_capacity=1500,fuel_capacity=150,nominal_fuel_consumption=12,fuel_state=0.8,current_load=482,status=4i 1451609990000000000
```
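To make the layout of a reading concrete, the following minimal sketch splits one line into its table name, tags, fields, and timestamp. The function name `parse_ilp_line` is hypothetical and not part of TSBS, the sample line is a shortened version of the example reading, and the sketch assumes the line contains no escaped spaces or commas.

```bash
# Hypothetical helper (not part of TSBS): split one reading into its parts.
# Assumes "<table>,<tags> <fields> <timestamp>" with no escaped spaces or commas.
# The body runs in a subshell so the helper variables do not leak.
parse_ilp_line() (
  line=$1
  head=${line%% *}      # "<table>,<tags>": everything before the first space
  rest=${line#* }       # "<fields> <timestamp>": everything after it
  printf 'table=%s\n' "${head%%,*}"       # text before the first comma
  printf 'tags=%s\n' "${head#*,}"         # text after the first comma
  printf 'fields=%s\n' "${rest%% *}"      # text before the remaining space
  printf 'timestamp=%s\n' "${rest#* }"    # text after the remaining space
)

parse_ilp_line 'diagnostics,name=truck_3985,fleet=West fuel_state=0.8,status=4i 1451609990000000000'
# table=diagnostics
# tags=name=truck_3985,fleet=West
# fields=fuel_state=0.8,status=4i
# timestamp=1451609990000000000
```

The sketch relies only on shell parameter expansion, so it is meant for inspecting a few lines by hand rather than for processing a full data file.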

How to run the test

First, build and install the benchmark suite. A temporary directory can be used to hold the Go binaries.

```bash
mkdir -p ~/tmp/go/src/github.com/timescale/
cd ~/tmp/go/src/github.com/timescale/
```

Clone the TSBS repository, then build, test, and install the Go binaries:

```bash
git clone git@github.com:timescale/tsbs.git
cd tsbs

GOPATH=~/tmp/go go build -v ./...
GOPATH=~/tmp/go go test -v github.com/timescale/tsbs/cmd/tsbs_load_questdb
GOPATH=~/tmp/go go install -v ./...
```

Generating data

Data is generated in the influx format. To generate a small dataset (one hour of readings) for quick benchmarks:

```bash
~/tmp/go/bin/tsbs_generate_data \
--use-case="cpu-only" --seed=123 --scale=4000 \
--timestamp-start="2016-01-01T00:00:00Z" --timestamp-end="2016-01-01T01:00:00Z" \
--log-interval="10s" --format="influx" > /tmp/data
```

To generate a full dataset (24 hours of readings):

```bash
~/tmp/go/bin/tsbs_generate_data \
--use-case="cpu-only" --seed=123 --scale=4000 \
--timestamp-start="2016-01-01T00:00:00Z" --timestamp-end="2016-01-02T00:00:00Z" \
--log-interval="10s" --format="influx" > /tmp/data
```
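As a rough sanity check on a generated file, the expected number of readings can be estimated from the generator flags. The sketch below assumes that the cpu-only use case emits one reading per host per log interval and that the end timestamp is exclusive; neither assumption is stated in this guide, and the function name `expected_rows` is hypothetical.

```bash
# Hypothetical helper: estimate the number of readings in a cpu-only dataset.
# Assumes one reading per host per log interval (not confirmed by this guide).
# $1 = --scale (number of hosts)
# $2 = length of the time range in seconds
# $3 = --log-interval in seconds
expected_rows() {
  echo $(( $1 * ($2 / $3) ))
}

expected_rows 4000 3600 10     # one-hour dataset: prints 1440000
expected_rows 4000 86400 10    # 24-hour dataset:  prints 34560000

# Compare with the number of lines actually written:
# wc -l < /tmp/data
```

If the line count differs from the estimate, the assumptions above may not hold for the TSBS version in use.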

Running the benchmark tool

Generated data can be loaded directly using the tool:

```bash
~/tmp/go/bin/tsbs_load_questdb --file /tmp/data --workers 4
```

Alternatively, shell scripts are provided which can be used to generate and load data:

```bash
# the scripts directory lives in the root of the cloned tsbs repository
cd ~/tmp/go/src/github.com/timescale/tsbs

# generates data file /tmp/bulk_data/influx-data.gz
PATH=${PATH}:~/tmp/go/bin FORMATS=influx TS_END=2016-01-01T02:00:00Z bash ./scripts/generate_data.sh
# load data into QuestDB
PATH=${PATH}:~/tmp/go/bin NUM_WORKERS=1 ./scripts/load/load_questdb.sh
```

Query benchmarks for single-groupby-5-8-1

```bash
cd ~/tmp/go/src/github.com/timescale/

~/tmp/go/bin/tsbs_generate_queries \
--use-case="cpu-only" --seed=123 --scale=4000 \
--timestamp-start="2016-01-01T00:00:00Z" --timestamp-end="2016-01-02T00:00:01Z" \
--queries=1000 --query-type="single-groupby-5-8-1" \
--format="questdb" > /tmp/queries_questdb

~/tmp/go/bin/tsbs_run_queries_questdb --file /tmp/queries_questdb --print-interval 500
```

Benchmarking InfluxDB and QuestDB on FreeBSD

The following commands build and install the TSBS tools. This step can be skipped if installation has already been performed according to the instructions above.

```bash
mkdir -p ~/tmp/go/src/github.com/timescale/
cd ~/tmp/go/src/github.com/timescale/

git clone git@github.com:timescale/tsbs.git
cd tsbs

GOPATH=~/tmp/go go build -v ./...
GOPATH=~/tmp/go go test -v github.com/timescale/tsbs/cmd/tsbs_load_questdb
GOPATH=~/tmp/go go install -v ./...
```

Install InfluxDB:

```bash
sudo portinstall influxdb
```

The TimescaleDB vs InfluxDB benchmark blog post generates data with the cpu-only use case:

```bash
# Generate data
~/tmp/go/bin/tsbs_generate_data --use-case="cpu-only" --seed=123 --scale=4000 \
  --timestamp-start="2016-01-01T00:00:00Z" --timestamp-end="2016-01-02T00:00:00Z" \
  --log-interval="10s" --format="influx" > /tmp/data
# Start InfluxDB bench
cat /tmp/data | ~/tmp/go/bin/tsbs_load_influx
```

To run against QuestDB:

```bash
# Show help
~/tmp/go/bin/tsbs_load_questdb -help
# Start QuestDB bench
~/tmp/go/bin/tsbs_load_questdb --file /tmp/data --workers 4
```

Alternatively, data may be generated and loaded using the shell scripts:

```bash
# the scripts directory lives in the root of the cloned tsbs repository
cd ~/tmp/go/src/github.com/timescale/tsbs
# generates a data file '/tmp/bulk_data/influx-data.gz'
PATH=${PATH}:~/tmp/go/bin FORMATS=influx TS_END=2016-01-01T02:00:00Z bash ./scripts/generate_data.sh
PATH=${PATH}:~/tmp/go/bin NUM_WORKERS=1 ./scripts/load/load_questdb.sh
```

Query benchmarks

This query benchmark assumes that a cpu-only data set covering the period 2016-01-01T00:00:00Z -> 2016-01-02T00:00:01Z has been loaded. Queries are generated using the questdb format.

single-groupby-5-8-1:

The queries for single-groupby-5-8-1 are generated and run with the following commands:

```bash
cd ~/tmp/go/src/github.com/timescale/

# Run 'single-groupby-5-8-1'
~/tmp/go/bin/tsbs_generate_queries --use-case="cpu-only" --seed=123 --scale=4000 \
  --timestamp-start="2016-01-01T00:00:00Z" --timestamp-end="2016-01-02T00:00:01Z" \
  --queries=1000 --query-type="single-groupby-5-8-1" --format="questdb" > /tmp/queries_questdb

~/tmp/go/bin/tsbs_run_queries_questdb --file /tmp/queries_questdb --print-interval 500
```
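The query type name encodes the shape of the query. As described in the TSBS README, single-groupby-5-8-1 aggregates 5 metrics for 8 hosts over 1 hour. The sketch below decodes a name of that form; the function name `describe_query_type` is hypothetical, and the sketch assumes the pattern single-groupby-<metrics>-<hosts>-<hours>.

```bash
# Hypothetical helper: decode "single-groupby-<metrics>-<hosts>-<hours>".
# The body runs in a subshell so the IFS change stays local to the function.
describe_query_type() (
  IFS=-
  set -- $1    # split on "-": single groupby <metrics> <hosts> <hours>
  printf 'metrics=%s hosts=%s hours=%s\n' "$3" "$4" "$5"
)

describe_query_type single-groupby-5-8-1
# metrics=5 hosts=8 hours=1
```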

Examples of the generated queries can be found in the CPU-only example queries document. Alternatively, query benchmarks may be run using the shell scripts, from the root of the tsbs repository:

```bash
PATH=${PATH}:~/tmp/go/bin FORMATS=questdb TS_END=2016-01-02T00:00:00Z bash ./scripts/generate_queries.sh
PATH=${PATH}:~/tmp/go/bin ./scripts/run_queries/run_queries_questdb.sh
```

tsbs_load_questdb flags

--batch-size (type: uint, default: 10000)

Number of items to batch together in a single insert.

--db-name (type: string, default: benchmark)

Name of database.

--do-abort-on-exist (type: boolean, default: false)

Whether to abort if a database with the given name already exists.

--do-create-db (type: boolean, default: true)

Whether to create the database. Disable on all but one client if running on a multi-client setup.

--do-load (type: boolean, default: true)

Whether to write data. Set this flag to false to check input read speed.

--file (type: string, default: none)

File name to read data from.

--ilp-bind-to (type: string, default: 127.0.0.1:9009)

Address of the QuestDB InfluxDB line protocol TCP endpoint, in the format <ip>:<port>.

--limit (type: uint, default: 0)

Number of items to insert where 0 is all items.

--reporting-period (type: duration, default: 10s)

Interval at which write statistics are reported.

--seed (type: int, default: 0)

PRNG seed. The default value of 0 uses the current timestamp as a seed.

--url (type: string, default: http://localhost:9000/)

QuestDB REST endpoint.

--workers (type: uint, default: 1)

Number of parallel clients inserting.

-help

Prints available flags with defaults:

```bash
~/tmp/go/bin/tsbs_load_questdb -help
```
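To show how the flags above combine, the following sketch prints a load command without running it. The function name `load_cmd` and the `TSBS_BIN` variable are hypothetical. The flag values are the defaults listed in this guide, and the boolean flag is written in the `--flag=value` form.

```bash
# Directory holding the TSBS binaries; TSBS_BIN is a hypothetical override.
TSBS_BIN=${TSBS_BIN:-$HOME/tmp/go/bin}

# Hypothetical helper: print (do not run) a tsbs_load_questdb command line.
# $1 = data file, $2 = workers (default 1), $3 = value for --do-load (default true)
load_cmd() {
  echo "$TSBS_BIN/tsbs_load_questdb" \
    --file "$1" \
    --workers "${2:-1}" \
    --batch-size 10000 \
    --ilp-bind-to 127.0.0.1:9009 \
    "--do-load=${3:-true}"
}

load_cmd /tmp/data 4          # load with four parallel clients
load_cmd /tmp/data 1 false    # read-speed check: read input without writing
```

Printing the command first makes it easy to review the flags; piping the output to `sh` would then run it against a QuestDB instance.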