Hyperliquid Adapter Benchmarks


Unless a section states otherwise, numbers were measured on 2026-05-18 on an AMD Ryzen Threadripper 9980X with rustc 1.95.0 and the bench-lto profile (release opts + lto = "fat" + codegen-units = 1, debug = full). The CPU governor is pinned to performance and ASLR is disabled via setarch -R for the run.

Refresh on substantive perf change or before release; bump the date. Absolute numbers vary by machine; only same-machine deltas are meaningful.
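
Since only same-machine deltas are meaningful, a refresh is easiest to read as a relative change per bench. The sketch below is one hypothetical way to express that. The function names, the 3% noise tolerance, and the 650 ns follow-up value are illustrative assumptions, not part of the bench tooling or of the measured results.

```python
def relative_delta(before_ns: float, after_ns: float) -> float:
    """Return the fractional change in median latency (negative means faster)."""
    if before_ns <= 0:
        raise ValueError("baseline median must be positive")
    return (after_ns - before_ns) / before_ns


def classify(before_ns: float, after_ns: float, noise: float = 0.03) -> str:
    """Label a change; `noise` is an assumed run-to-run tolerance."""
    delta = relative_delta(before_ns, after_ns)
    if delta > noise:
        return "regression"
    if delta < -noise:
        return "improvement"
    return "within noise"


# 618 ns is the inbound_pipeline/trades median from this document.
# 650 ns is a made-up follow-up measurement used only for illustration.
print(f"{relative_delta(618, 650):+.1%}", classify(618, 650))
```

Both medians must come from the same host, profile, and governor settings for the comparison to mean anything.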

How to reproduce

```bash
sudo cpupower frequency-set -g performance
setarch -R cargo bench -p nautilus-hyperliquid --profile bench-lto \
    --bench data --bench exec --bench micros --bench signing
sudo cpupower frequency-set -g powersave  # restore default
```

For policy and the general noise-reduction recipe see BENCHMARKING.md at the repo root.

Inbound pipeline (data.rs)

Raw WS frame bytes -> Nautilus domain type. Covers decode + parse + cache lookup + Nautilus type construction. No I/O, no async runtime, no channel.
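
To make the four measured stages concrete, here is a minimal sketch of the same shape of work in Python. It is not the adapter's code: the `Trade` type, the `INSTRUMENTS` cache, the instrument id, and the frame layout are all hypothetical stand-ins chosen for illustration.

```python
import json
from dataclasses import dataclass
from decimal import Decimal
from typing import List


@dataclass(frozen=True)
class Trade:
    """Hypothetical stand-in for a domain trade type."""
    instrument_id: str
    price: Decimal
    size: Decimal
    trade_id: str
    ts_ms: int


# Hypothetical symbol -> instrument id cache, primed before streaming starts.
INSTRUMENTS = {"BTC": "BTC-PERP.EXAMPLE"}


def handle_frame(raw: bytes) -> List[Trade]:
    msg = json.loads(raw)  # 1. decode: raw bytes -> generic JSON
    trades = []
    for item in msg["data"]:  # 2. parse: pull out the typed fields
        instrument_id = INSTRUMENTS[item["coin"]]  # 3. cache lookup
        trades.append(  # 4. domain type construction
            Trade(
                instrument_id=instrument_id,
                price=Decimal(item["px"]),
                size=Decimal(item["sz"]),
                trade_id=str(item["tid"]),
                ts_ms=int(item["time"]),
            )
        )
    return trades


# Assumed frame layout, for illustration only.
frame = (
    b'{"channel":"trades","data":[{"coin":"BTC","px":"64250.5",'
    b'"sz":"0.01","tid":1,"time":1700000000000}]}'
)
print(handle_frame(frame))
```

As in the benches, nothing here touches a socket, a runtime, or a channel; the input is a byte string already in memory.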

Rows ordered from the most fundamental market-data stream (book deltas) down through derived streams (mark/index/funding/bars), then the private user streams (fills, order updates) at the end.

| Bench | Median | Throughput |
|---|---|---|
| inbound_pipeline/book_deltas | 3.95 µs | 253 k/s |
| inbound_pipeline/book_depth10 | 3.98 µs | 251 k/s |
| inbound_pipeline/quotes | 557 ns | 1.79 M/s |
| inbound_pipeline/trades | 618 ns | 1.62 M/s |
| inbound_pipeline/mark_price | 886 ns | 1.13 M/s |
| inbound_pipeline/index_price | 896 ns | 1.12 M/s |
| inbound_pipeline/funding_rate | 897 ns | 1.11 M/s |
| inbound_pipeline/bars | 652 ns | 1.53 M/s |
| inbound_pipeline/order_event | 830 ns | 1.20 M/s |
| inbound_pipeline/order_fill | 1.11 µs | 899 k/s |
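
The Throughput column is consistent with the reciprocal of the median, i.e. single-threaded iterations per second. A quick cross-check using values from the table (the helper name is hypothetical):

```python
def throughput_per_sec(median_seconds: float) -> float:
    """Iterations per second implied by a per-iteration median latency."""
    if median_seconds <= 0:
        raise ValueError("median must be positive")
    return 1.0 / median_seconds


# (bench, median in seconds) copied from the inbound table.
rows = [
    ("book_deltas", 3.95e-6),
    ("quotes", 557e-9),
    ("order_fill", 1.11e-6),
]
for name, median in rows:
    print(f"{name}: {throughput_per_sec(median) / 1e3:.0f} k/s")
```

Recomputed values can differ from the table in the last digit, presumably because the medians shown are rounded to three significant figures.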

Execution pipeline (exec.rs)

Strategy command (OrderAny / cancel / modify) -> fully signed wire bytes ready to POST. Covers normalize + serialize (msgpack) + EIP-712 sign.

| Bench | Median | Throughput |
|---|---|---|
| exec_pipeline/submit_market | 42.2 µs | 23.7 k/s |
| exec_pipeline/submit_limit | 42.1 µs | 23.7 k/s |
| exec_pipeline/submit_stop_market | 42.5 µs | 23.5 k/s |
| exec_pipeline/cancel | 47.9 µs | 20.9 k/s |
| exec_pipeline/modify | 42.2 µs | 23.7 k/s |

Signing (signing.rs)

Measured 2026-05-22 on the same host and profile. Direct signer and serialization benches isolate L1 signing from order normalization and POST body construction.

| Bench | Median |
|---|---|
| sign_l1_action | 41.8 µs |
| sign_l1_action_with_vault | 42.2 µs |
| signer_construction | 31.3 µs |
| msgpack_serialize_action | 172 ns |
| json_serialize_action | 329 ns |

Dispatch (exec.rs)

Venue report (FillReport, OrderStatusReport) -> events emitted via ExecutionEventEmitter. Covers dedup + identity lookup + event construction.

Note: these numbers include per-iteration WsDispatchState construction + drop, which is a bench-only artifact. In production the state is long-lived, so the dispatch-only cost is much smaller (see atom/dispatch_fill_reused in the component breakdown below).

| Bench | Median | Throughput |
|---|---|---|
| dispatch/fill | 15.6 µs | 64.2 k/s |
| dispatch/status_accepted | 11.1 µs | 90.5 k/s |
| dispatch/status_canceled | 15.4 µs | 64.8 k/s |
| dispatch/status_modified | 12.3 µs | 81.1 k/s |

Component breakdown (micros.rs)

Diagnostic benches that decompose the pipeline numbers above. Use these to localise where time goes when a pipeline bench regresses.

| Bench | Median |
|---|---|
| decode_only/trade | 549 ns |
| decode_only/book | 3.25 µs |
| parse_only/trade | 59.0 ns |
| parse_only/book_deltas | 678 ns |
| atom/decimal_from_str | 7.04 ns |
| atom/price_from_decimal_dp | 6.38 ns |
| atom/price_combined | 12.1 ns |
| atom/trade_id_new | 17.9 ns |
| atom/uuid4_new | 58.7 ns |
| atom/state_construct_primed | 7.39 µs |
| atom/state_drop_primed | 1.48 µs |
| atom/event_filled_construct | 151 ns |
| atom/event_accepted_construct | 148 ns |
| atom/dispatch_fill_reused | 12.6 ns |

Notes

  • Inbound is JSON-decode dominated. decode_only accounts for roughly 80-90% of the inbound pipeline cost across every message kind. Decimal, Price, Quantity, UUID4, and TradeId construction are each sub-100 ns and barely register in the end-to-end pipeline number.
  • Exec is signature-bound. EIP-712 + keccak + secp256k1 dominates, and lto = "fat" collapses the per-variant differences so submit and modify converge at ~42 µs. Cancel sits at ~48 µs because the cancel action serialises a different msgpack shape. Optimisations that don't change the signing scheme won't move these numbers.
  • Dispatch in production is faster than the bench suggests. The canonical bench rebuilds state per iteration; the steady-state cost on a reused state is ~13 ns on a dedup hit, and a first-time fill on a fresh state is ~7 µs (dispatch/fill minus the sum of state_construct_primed and state_drop_primed).
  • simd-json was piloted and reverted. A simd-json feature flag plus decode helper was prototyped, run side-by-side against serde_json, and found to be 20-50% slower on Hyperliquid payload sizes. The mutable-buffer requirement forces a per-call to_vec(), payloads are too small to amortise SIMD setup, and owned-String deserialization negates the borrow advantage. Re-evaluate only if payloads grow materially or a zero-copy borrowed-string path lands.
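
The ratios and the subtraction quoted in these notes can be re-derived from the tables. The sketch below does so with medians copied from this document; the helper names are hypothetical. Subtracting medians taken from separate benches is only an approximation, and sign_l1_action was measured on a later date than the exec pipeline, so treat the outputs as rough.

```python
# Medians copied from the tables in this document, in nanoseconds.
MEDIANS_NS = {
    "inbound_pipeline/trades": 618,
    "inbound_pipeline/book_deltas": 3950,
    "decode_only/trade": 549,
    "decode_only/book": 3250,
    "exec_pipeline/submit_limit": 42_100,
    "sign_l1_action": 41_800,
    "dispatch/fill": 15_600,
    "atom/state_construct_primed": 7390,
    "atom/state_drop_primed": 1480,
}


def share(part: str, whole: str) -> float:
    """Fraction of `whole`'s median accounted for by `part`."""
    return MEDIANS_NS[part] / MEDIANS_NS[whole]


def dispatch_fill_without_state_ns() -> int:
    """dispatch/fill minus the per-iteration state construct and drop."""
    overhead = (
        MEDIANS_NS["atom/state_construct_primed"]
        + MEDIANS_NS["atom/state_drop_primed"]
    )
    return MEDIANS_NS["dispatch/fill"] - overhead


print(f"decode share, trades: {share('decode_only/trade', 'inbound_pipeline/trades'):.0%}")
print(f"decode share, book:   {share('decode_only/book', 'inbound_pipeline/book_deltas'):.0%}")
print(f"signing share, limit: {share('sign_l1_action', 'exec_pipeline/submit_limit'):.0%}")
print(f"first-time fill:      {dispatch_fill_without_state_ns() / 1000:.2f} µs")
```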