SpacetimeDB Benchmark Suite

A benchmark suite comparing SpacetimeDB against traditional web application stacks for transactional workloads.

Quick Demo

Install the dependencies, then run the demo to see SpacetimeDB's performance advantage:

```bash
pnpm install
pnpm run demo
```

The demo compares SpacetimeDB and Convex by default, since both are easy for anyone to set up and run locally without additional infrastructure. Other systems (Postgres, CockroachDB, SQLite, etc.) are also supported but require more setup. The demo checks that required services are running (prompts you to start them if not), seeds databases, and displays animated results.

Options: --systems a,b,c | --seconds N | --concurrency N | --alpha N | --skip-prep | --no-animation

Note: demo always runs the built-in test-1 scenario. Use bench if you need to specify a test name directly.

Note: demo selects targets with --systems; bench filters test connectors with --connectors.

Results Summary

For all tests, we ran N clients where N is 2x the number of CPUs on the database machine used for the test. Exact client counts are shown in each row. The workload is a transfer transaction (read-modify-write transaction between two accounts).

The SpacetimeDB rows were obtained using a single-node SpacetimeDB Standalone instance, so the published numbers are reproducible with the public, downloadable server.

Each row reports mean TPS and sample standard deviation of per-second throughput within a single 300-second run. alpha=1.5 corresponds to ~80% contention. When standard deviation approaches or exceeds mean TPS, throughput is unstable across the run.

Data description: reported summary metrics are computed from steady-state windows after a 30-second warmup (tSec >= 30), using the recorded per-second timeSeries data.
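
As a rough illustration of the summary described above, the sketch below drops the warmup samples and computes mean TPS, the sample standard deviation, and their ratio. The `Sample` shape (`tSec`, `tps`) and the `summarize` helper are assumptions made for this example; the real run JSON and aggregation script may be organized differently.

```typescript
// Hypothetical shape of one per-second sample; the real timeSeries entries may differ.
interface Sample {
  tSec: number; // seconds since the start of the run
  tps: number; // successful transactions completed during that second
}

interface Summary {
  meanTps: number;
  stddevTps: number; // sample standard deviation (n - 1 denominator)
  cv: number; // stddev / mean; values near or above 1 suggest unstable throughput
}

function summarize(timeSeries: Sample[], warmupSec: number = 30): Summary {
  // Keep only steady-state samples, mirroring the tSec >= 30 rule.
  const steady = timeSeries
    .filter((s) => s.tSec >= warmupSec)
    .map((s) => s.tps);
  if (steady.length < 2) {
    throw new Error("need at least two steady-state samples");
  }
  const meanTps = steady.reduce((a, b) => a + b, 0) / steady.length;
  const sumSq = steady.reduce((a, x) => a + (x - meanTps) * (x - meanTps), 0);
  const stddevTps = Math.sqrt(sumSq / (steady.length - 1));
  const cv = meanTps === 0 ? Infinity : stddevTps / meanTps;
  return { meanTps, stddevTps, cv };
}
```

The `cv` field is only a convenience for spotting rows where the standard deviation approaches or exceeds the mean.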

Alpha = 0

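| System | clients | pipelining | max_pool | TPS | TPS Stddev | p50 latency (ms) | p99 latency (ms) |
|---|---|---|---|---|---|---|---|
| SpacetimeDB | 64 | 40 | N/A | 279,024 | 4,763 | 8 | 12 |
| Node.js + SQLite | 64 | off | N/A | 3,121 | 80 | 19 | 40 |
| Node.js + Supabase | 64 | off | 64 | 7,362 | 1,179 | 6 | 18 |
| Bun + Postgres | 64 | off | 64 | 10,729 | 146 | 5 | 11 |
| Node.js + Postgres | 64 | off | 64 | 9,904 | 223 | 6 | 11 |
| Node.js + PlanetScale (SN) | 64 | off | 64 | 4,535 | 117 | 14 | 20 |
| Node.js + PlanetScale (HA) | 384 | off | 384 | 4,275 | 135 | 89 | 110 |
| Convex | 64 | off | N/A | 1,140 | 118 | 53 | 62 |
| Node.js + CockroachDB (5 node) | 320 | off | 320 | 4,253 | 561 | 71 | 120 |
| HAProxy - Node.js + CockroachDB (5 node) | 320 | off | 320 | 5,481 | 566 | 57 | 95 |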

Alpha = 1.5

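| System | clients | pipelining | max_pool | TPS | TPS Stddev | p50 latency (ms) | p99 latency (ms) |
|---|---|---|---|---|---|---|---|
| SpacetimeDB | 64 | 40 | N/A | 303,919 | 4,712 | 7 | 11 |
| Node.js + SQLite | 64 | off | N/A | 3,188 | 73 | 18 | 39 |
| Node.js + Supabase | 64 | off | 64 | 2,534 | 57 | 21 | 97 |
| Bun + Postgres | 64 | off | 64 | 2,772 | 61 | 7 | 13 |
| Node.js + Postgres | 64 | off | 64 | 961 | 25 | 10 | 16 |
| Node.js + PlanetScale (SN) | 64 | off | 64 | 235 | 12 | 20 | 2,504 |
| Node.js + PlanetScale (HA) | 384 | off | 384 | 248 | 134 | 16 | 10,121 |
| Convex | 64 | off | N/A | 126 | 5 | 220 | 1,081 |
| Node.js + CockroachDB (5 node) | 320 | off | 320 | 0.03 | 0.18 | 698 | 9,695 |
| HAProxy - Node.js + CockroachDB (5 node) | 64 | off | 64 | 6.87 | 9.12 | 5,943 | 9,880 |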

Note: the HAProxy + CockroachDB alpha=1.5 row uses 64 clients (instead of 320) because 320-way concurrency overwhelmed CockroachDB and did not produce stable sample data for this profile.

Alpha = 0 (All-Connectors Pipelining Check)

The headline comparison allows pipelining only for SpacetimeDB. This separate check enables pipelining for every connector to show how the other systems behave when clients submit up to 40 requests without waiting for each response.

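| System | clients | pipelining | max_pool | TPS | TPS Stddev | p50 latency (ms) | p99 latency (ms) |
|---|---|---|---|---|---|---|---|
| Node.js + SQLite | 64 | 40 | N/A | 2,977 | 84 | 722 | 747 |
| Node.js + Supabase | 64 | 40 | 64 | 8,874 | 308 | 284 | 303 |
| Bun + Postgres | 64 | 40 | 64 | 10,184 | 120 | 250.1 | 260.5 |
| Node.js + Postgres | 64 | 40 | 64 | 9,165 | 145 | 276 | 290 |
| Node.js + PlanetScale (SN) | 64 | 40 | 64 | 4,325 | 85 | 590 | 604 |
| Node.js + PlanetScale (HA) | 384 | 40 | 384 | 3,355 | 327 | 4,354 | 4,438 |
| Convex | 64 | 40 | N/A | 1,154 | 134 | 2,119 | 2,150 |
| Node.js + CockroachDB (5 node) | 320 | 40 | 320 | 4,250 | 766 | 3,030 | 3,161 |
| HAProxy - Node.js + CockroachDB (5 node) | 320 | 40 | 320 | 5,992 | 1,765 | 2,431 | 2,562 |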

Key Finding: In these runs, SpacetimeDB is the only system sustaining hundreds of thousands of TPS in both alpha profiles. At alpha=0, the strongest non-SpacetimeDB results are in the ~10k TPS range, while at alpha=1.5 several systems show severe contention sensitivity with large tail-latency growth and throughput collapse.

Methodology

All systems were tested with out-of-the-box database and platform settings, with one exception: the local Postgres instance (and Bun, which uses the same Postgres instance) is configured with default_transaction_isolation = 'serializable'. For Postgres-like RPC servers, the app-side Drizzle connection pool is configured as shown in the result tables, and the benchmark connects directly to Postgres.

The managed Postgres services (Supabase, PlanetScale) run at their default isolation level of READ COMMITTED.

For every system, throughput counts only successful operations that the benchmark client observes completing inside the configured test window.
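
One way to picture this counting rule is the sketch below. The `OpResult` record and both helper functions are hypothetical, and treating the window as half-open (start inclusive, end exclusive) is an assumption; the benchmark's own bookkeeping may differ.

```typescript
// Hypothetical record of one operation as observed by the benchmark client.
interface OpResult {
  ok: boolean; // true if the transfer succeeded
  completedAtMs: number; // client-side completion timestamp in milliseconds
}

// Count operations that succeeded and completed inside [startMs, startMs + seconds * 1000).
function countedOps(results: OpResult[], startMs: number, seconds: number): number {
  const endMs = startMs + seconds * 1000;
  return results.filter(
    (r) => r.ok && r.completedAtMs >= startMs && r.completedAtMs < endMs
  ).length;
}

// Mean throughput over the window, in transactions per second.
function throughput(results: OpResult[], startMs: number, seconds: number): number {
  return countedOps(results, startMs, seconds) / seconds;
}
```

Failed operations and operations that finish after the window closes contribute nothing, so a system cannot gain throughput from fast failures or late completions.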

Published Benchmark Defaults

The reported tables in this README use the following profile defaults unless a row explicitly shows a different value:

  • clients: N clients where N is 2x the number of CPUs on the database machine used for the test
  • pipelining: off for non-pipelined runs
  • MAX_POOL: 64 for pg-based RPC servers (postgres_rpc, cockroach_rpc, supabase_rpc, planetscale_pg_rpc)
  • Main comparison runs use MAX_INFLIGHT_PER_WORKER=40 for SpacetimeDB only
  • All-connectors pipelining-check runs use BENCH_PIPELINED=1 and MAX_INFLIGHT_PER_WORKER=40
  • When BENCH_PIPELINED=1, set MAX_INFLIGHT_PER_WORKER explicitly in the environment

For rows that scale client count above 64 (for example, some HA topologies), max_pool is scaled to match the row values shown in the table.

Test Architecture

All benchmarks follow an apples-to-apples comparison using the same architecture pattern:

Client → Web Server (HTTP) → ORM (Drizzle) → Database

Or for integrated platforms (SpacetimeDB, Convex):

Client → Integrated Platform (compute + storage colocated)

This ensures we're measuring real-world application performance, not raw database throughput.

Machine Topology

The reported numbers use a single benchmark host wherever possible. This means client, server, and database were all run on the same machine.

We did this mainly because it is the most favorable benchmarking setup for the competitor platforms, since it minimizes server-to-database latency, and also because it lets others reproduce the results easily.

For completeness, we also tested separated-machine topologies, where the benchmark client, server, and database processes were not colocated on one machine. In each case, separating the machines either left the other systems' throughput unchanged or reduced it because of the additional network hop. We published the most favorable numbers for our competitors.

The platforms that cannot use this exact topology are PlanetScale and CockroachDB. PlanetScale operates a managed cloud database and does not have a self-hosted variant of the service, so the benchmark client and RPC server are colocated on a benchmark host in the same region and availability zone as the database host. CockroachDB is a distributed database running across multiple nodes, so the benchmark client and RPC server cannot be colocated with the database on a single node.

The Transaction

Each transaction performs a fund transfer between two accounts:

  1. Read both source and destination account balances
  2. Verify sufficient funds in source account
  3. Debit source account
  4. Credit destination account
  5. Commit transaction with row-level locking

This is a classic read-modify-write workload that tests transactional integrity under concurrent access.
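
The five steps can be sketched against an in-memory array standing in for the accounts table. This is an illustration only: the `transfer` function, the `TransferResult` type, and the array-backed storage are assumptions for this example and are not the code of any connector in the suite. Because the sketch runs single-threaded, the function body plays the role of the transaction and no explicit locking appears.

```typescript
type TransferResult = "ok" | "insufficient_funds" | "unknown_account";

// Hypothetical in-memory stand-in for the accounts table: index = account id.
function transfer(
  balances: number[],
  from: number,
  to: number,
  amount: number
): TransferResult {
  // 1. Read both source and destination account balances.
  const fromBalance = balances[from];
  const toBalance = balances[to];
  if (fromBalance === undefined || toBalance === undefined) {
    return "unknown_account";
  }
  // 2. Verify sufficient funds in the source account.
  if (fromBalance < amount) {
    return "insufficient_funds";
  }
  // A self-transfer leaves the balance unchanged.
  if (from === to) {
    return "ok";
  }
  // 3. Debit the source account.
  balances[from] = fromBalance - amount;
  // 4. Credit the destination account.
  balances[to] = toBalance + amount;
  // 5. A real connector would commit here; in this sketch the writes are already applied.
  return "ok";
}
```

In a database-backed connector, steps 1 to 4 would run inside one transaction so that concurrent transfers touching the same accounts cannot interleave between the read and the write.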

Test Command

The numbers in the tables above were collected with pnpm run bench:

```bash
pnpm install
pnpm run prep                                                          # seed all backing databases once
pnpm run bench --alpha 0,1.5 --connectors <connectors> --seconds 300   # one JSON per (connector, alpha)
```

--alpha and --connectors both accept comma-separated values. The bench writes one JSON per (connector, alpha, run) tuple into runs/.
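
To make the expansion concrete, the sketch below enumerates the (connector, alpha, run) tuples produced by two comma-separated lists and a run count. The `RunSpec` type, the helper names, and the loop order are assumptions for illustration; the bench script's actual parsing and ordering are not shown in this README.

```typescript
// Hypothetical description of one benchmark run.
interface RunSpec {
  connector: string;
  alpha: number;
  run: number; // 1-based repeat index
}

// Split a comma-separated flag value, ignoring empty entries.
function parseCsv(value: string): string[] {
  return value
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Enumerate every (connector, alpha, run) combination.
function expandRuns(connectorsCsv: string, alphaCsv: string, runs: number): RunSpec[] {
  const specs: RunSpec[] = [];
  for (const connector of parseCsv(connectorsCsv)) {
    for (const alphaText of parseCsv(alphaCsv)) {
      const alpha = Number(alphaText);
      if (isNaN(alpha)) {
        throw new Error("alpha must be numeric: " + alphaText);
      }
      for (let run = 1; run <= runs; run++) {
        specs.push({ connector, alpha, run });
      }
    }
  }
  return specs;
}
```

With two connectors, two alpha values, and two runs, this yields eight tuples, which corresponds to eight JSON files under the one-file-per-tuple rule.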

When aggregating these JSONs into summary tables, use a 30-second warmup cutoff (--warmup-sec 30) to match the published numbers.

Useful flags:

  • --alpha <csv>: Zipf alpha. This benchmark reports 0 (uniform / ~0% contention) and 1.5 (Zipf / ~80% contention).
  • --connectors <csv>: which connectors to run. Defaults to every test in src/tests/test-1/.
  • --seconds <num>: duration of each run.
  • --concurrency <num>: number of concurrent clients (default: 64).
  • --runs <num>: repeat each (connector, alpha) combination this many times (default: 1). Each repeat writes its own JSON.
  • --prep-between-alphas: run pnpm run prep before each (connector, alpha) combination to reset DB state.
  • --stdb-compression <none|gzip>: SpacetimeDB client compression mode (default: none).

Hardware Configuration

Server Machine (all systems except PlanetScale):

  • PhoenixNAP s3.c3.medium bare metal instance: Intel i9-14900K, 24 cores (32 threads), 128 GB DDR5 memory, Ubuntu 24.04

Bench client for PlanetScale:

  • AWS m7i.8xlarge in us-east-2, colocated with the PlanetScale cluster. Clusters tested: PS-2560 single-node EBS, M-15360 Metal HA (1 primary + 2 replicas). Both Postgres 18.3.

Account Seeding

  • 100,000 accounts seeded before each benchmark
  • Initial balance: 1,000,000,000 per account
  • Zipf distribution controls which accounts are selected for transfers
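
As a rough sketch of how a Zipf parameter can steer account selection, the code below weights the account at rank k by 1 / k^alpha and picks an account by binary search over the cumulative weights. This is a textbook construction offered as an assumption; the function names are hypothetical and the benchmark's own sampler may be implemented differently.

```typescript
// Build the normalized cumulative distribution for ranks 1..n with weight 1 / rank^alpha.
// alpha = 0 gives a uniform distribution; larger alpha concentrates draws on low ranks.
function zipfCdf(n: number, alpha: number): number[] {
  const cumulative: number[] = [];
  let total = 0;
  for (let rank = 1; rank <= n; rank++) {
    total += 1 / Math.pow(rank, alpha);
    cumulative.push(total);
  }
  return cumulative.map((c) => c / total);
}

// Map a uniform draw u in [0, 1) to an account index in [0, n) by binary search.
function sampleAccount(cdf: number[], u: number): number {
  let lo = 0;
  let hi = cdf.length - 1;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (u < cdf[mid]) {
      hi = mid;
    } else {
      lo = mid + 1;
    }
  }
  return lo;
}
```

A caller would build the table once, for example `zipfCdf(100000, 1.5)`, and then call `sampleAccount(cdf, Math.random())` for each side of a transfer. Under this assumed weighting, the first account alone receives roughly 38% of draws at alpha=1.5 with 100,000 accounts, which is why transfers collide so often in that profile.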

Technical Notes

Why SpacetimeDB Outperforms Traditional Stacks

The primary bottleneck in traditional web application architectures is the round-trip latency between the application server and database:

Traditional: Client → Server → Database → Server → Client
                        ↑___________↑
                     Network round-trip per query

SpacetimeDB eliminates this by colocating compute and storage:

SpacetimeDB: Client → SpacetimeDB (compute + storage) → Client

This architectural difference means SpacetimeDB can execute transactions in microseconds rather than milliseconds, resulting in order-of-magnitude performance improvements.
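
A back-of-the-envelope bound shows why round trips matter most when many transactions touch the same row. It assumes that conflicting transactions run one at a time and that each holds the row for a fixed number of sequential round trips. The function name and the input values are illustrative assumptions, not measurements from this benchmark.

```typescript
// Upper bound on transactions per second through a single contended row,
// assuming strictly serialized access and a fixed cost per round trip.
function maxTpsOnHotRow(roundTripsPerTxn: number, roundTripMs: number): number {
  const holdTimeMs = roundTripsPerTxn * roundTripMs;
  return 1000 / holdTimeMs;
}

// Illustrative inputs only: four sequential steps per transaction.
const overNetwork = maxTpsOnHotRow(4, 1); // 1 ms per round trip
const inProcess = maxTpsOnHotRow(4, 0.01); // 10 microseconds per step
console.log(Math.round(overNetwork), Math.round(inProcess));
```

Under these assumed inputs the bound moves from about 250 to about 25,000 transactions per second on the hot row, a hundredfold change that comes entirely from the per-step latency.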

Client Pipelining

The benchmark supports pipelining for all clients - sending multiple requests without waiting for responses. The headline comparison uses this for SpacetimeDB only; the all-connectors pipelining check enables it across systems.
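
Pipelining raises the number of requests in flight, and Little's law (mean latency = requests in flight / throughput) relates that to the latency a client observes. The sketch below applies it under the assumption that every client keeps all of its in-flight slots full for the whole run; the function name is hypothetical.

```typescript
// Little's law: mean latency = requests in flight / throughput.
// Assumes each client keeps inflightPerClient requests outstanding at all times.
function expectedLatencyMs(
  clients: number,
  inflightPerClient: number,
  tps: number
): number {
  const inFlight = clients * inflightPerClient;
  return (inFlight / tps) * 1000;
}

// Bun + Postgres in the all-connectors pipelining check: 64 clients, 40 in flight, 10,184 TPS.
const estimateMs = expectedLatencyMs(64, 40, 10184);
console.log(estimateMs.toFixed(1));
```

The estimate is about 251.4 ms, close to the 250.1 ms p50 reported for that row. This suggests that the higher latencies in the pipelining check mostly reflect requests queueing behind one another, since throughput for that connector stays near its non-pipelined level.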

Confirmed Reads (withConfirmedReads)

SpacetimeDB supports withConfirmedReads mode which ensures transactions are durably committed before acknowledging to the client. The benchmark results shown use withConfirmedReads = ON for fair comparison with databases that provide similar durability guarantees.

Cloud vs Local Results

PlanetScale results (roughly 235 to 248 TPS at alpha=1.5, on both the single-node and HA clusters) demonstrate the significant impact of cloud database latency. When the database is accessed over the network, even within the same cloud region, round-trip latency dominates performance. This is why SpacetimeDB's colocated architecture provides such dramatic improvements.

Systems Tested

| System | Architecture |
|---|---|
| SpacetimeDB Standalone | Integrated platform; single-node downloadable server. |
| SQLite + Node HTTP + Drizzle | Node.js HTTP server → Drizzle ORM → SQLite |
| Bun + Drizzle + Postgres | Bun HTTP server → Drizzle ORM → PostgreSQL |
| Postgres + Node HTTP + Drizzle | Node.js HTTP server → Drizzle ORM → PostgreSQL |
| Supabase + Node HTTP + Drizzle | Node.js HTTP server → Drizzle ORM → Supabase (Postgres) |
| CockroachDB + Node HTTP + Drizzle | Node.js HTTP server → Drizzle ORM → CockroachDB |
| PlanetScale + Node HTTP + Drizzle | Node.js HTTP server → Drizzle ORM → PlanetScale (Cloud) |
| Convex | Integrated platform |

Running the Benchmarks

See DEVELOP.md for prerequisites, configuration, and full CLI reference.

Output

Benchmark results are written to ./runs/ as JSON files with TPS and latency statistics.

License

See repository root for license information.