Determinism Coverage

crates/runtime/DETERMINISM_COVERAGE.md

This document tracks which sources of nondeterminism are under control in spacetimedb-runtime, which ones are only constrained by current architecture, and which ones still escape the simulator boundary.

It is meant to serve two purposes:

  1. Make the current determinism boundary explicit for runtime code, core crates, and DST harnesses.
  2. Provide a place to record and review assumptions when a PR changes that boundary.

Status Definitions

  • Controlled: The simulator or runtime owns this source of nondeterminism directly. Given the same seed and the same simulated inputs, behavior should replay the same way.

  • Constrained: This surface is not fully simulator-controlled, but the current architecture limits how it is used. Replay should remain stable if those constraints continue to hold.

  • Audited: This surface is not mechanically controlled. Current usage has been reviewed and is believed not to affect replay, but that guarantee depends on call patterns and can regress.

  • Known Leak: This source can currently escape simulator control and affect replay. It should be treated as explicit technical debt or a documented exception.

  • Out of Scope: This crate does not try to control this surface. If it matters for DST, it must be modeled by a higher-level abstraction or test harness.
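
The replay property behind the Controlled status (same seed, same inputs, same behavior) can be illustrated with a toy scheduler. This is a minimal sketch, not the code in runtime::sim::executor: `SimRng`, `run_schedule`, and the choice of a SplitMix64 generator are all hypothetical and exist only for illustration. The point is that every scheduling decision is drawn from the seeded generator and nothing else.

```rust
/// Tiny deterministic PRNG (SplitMix64). A stand-in for the simulator RNG;
/// the generator actually used by the runtime is not described here.
struct SimRng {
    state: u64,
}

impl SimRng {
    fn new(seed: u64) -> Self {
        SimRng { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Returns an index in `0..len`. Panics if `len == 0`.
    fn pick(&mut self, len: usize) -> usize {
        (self.next_u64() % len as u64) as usize
    }
}

/// Hypothetical toy scheduler: repeatedly picks one runnable task id using
/// only the seeded RNG, and returns the order in which tasks were run.
fn run_schedule(seed: u64, task_ids: &[u32]) -> Vec<u32> {
    let mut rng = SimRng::new(seed);
    let mut runnable: Vec<u32> = task_ids.to_vec();
    let mut trace = Vec::with_capacity(runnable.len());
    while !runnable.is_empty() {
        let idx = rng.pick(runnable.len());
        trace.push(runnable.remove(idx));
    }
    trace
}
```

Calling `run_schedule(42, &[1, 2, 3, 4, 5])` twice yields the same trace both times, because no host state (wall-clock time, OS entropy, thread scheduling) feeds into the selection. A surface stops being Controlled as soon as any decision on that path reads from a source outside the seeded generator.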

Control Matrix

| Surface | Status | Boundary | Current control or assumption | Failure mode if violated | Required direction |
| --- | --- | --- | --- | --- | --- |
| Executor scheduling | Controlled | runtime::sim::executor | Runnable selection is driven by seeded simulator RNG | Replay diverges across runs | - |
| Virtual time and timers | Controlled | runtime::sim::time | Simulated time advances only through explicit advance or next-timer jump | Timeouts and ordering become host-timing dependent | - |
| Runtime RNG and buggify | Controlled | runtime::sim::rng | Runtime RNG drives scheduler and probabilistic fault-injection decisions | RNG and fault decisions are not replayable | - |
| OS thread creation during simulation | Controlled | runtime::sim_std | Unix thread hook rejects std::thread::spawn while simulation is active | Host scheduler escapes simulator control | - |
| OS entropy | Known Leak | runtime::sim_std | Randomness requests warn and then delegate to the OS | Same seed can produce different traces | Add backtrace to warnings, remove call sites, eventually fail closed or fully model the source |
| HashMap randomized iteration | Audited | Runtime and caller code | Runtime does not force deterministic hash seeding; correctness must not depend on iteration order | Hidden ordering dependencies cause flaky replay | Prefer ordered maps or explicit sorting where observable order matters |
| tokio::sync primitives | Constrained | Core crates above runtime | These can be replay-compatible only when all participating tasks remain simulator-owned and progress stays on simulator-controlled async paths | Wake ordering or blocking semantics diverge once code depends on a real runtime or host-driven progress | Audit per primitive and push deep-core paths toward runtime-owned or single-threaded structures |
| parking_lot::{} and std::sync::{} | Constrained | Core crates, especially datastore | Safe only where access stays single-threaded or non-contended under DST | Host synchronization leaks nondeterministic acquisition order | Keep out of deep-core execution paths; prefer runtime-owned or single-threaded structures |
| File and network I/O | Out of Scope | Runtime crate | Runtime does not simulate filesystem or network behavior | Real I/O timing, ordering, and errors are not replayable | Model via domain-specific DST abstractions |
| Heap allocation and OOM | Known Leak | Broad, especially deep-core direction | Allocation happens through normal Rust paths; deterministic allocation failure is not modeled | Resource-exhaustion behavior is not reproducible | Move the simulation core and eventually deep-core paths toward no_std + alloc with explicit allocation boundaries |
| Snapshot / commitlog / datastore host effects | Out of Scope | Higher-level durability and storage layers | Runtime only provides scheduling, time, and fault-decision primitives | Storage semantics depend on real host behavior unless wrapped | Model durable behavior through domain-specific DST abstractions |
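
For the HashMap surface, the required direction is to prefer ordered maps or explicit sorting wherever iteration order is observable. The sketch below shows both options using only the standard library. `sorted_entries` and `ordered_keys` are hypothetical helper names, not functions from the runtime, and the `String` to `u32` map is an arbitrary example type.

```rust
use std::collections::{BTreeMap, HashMap};

/// Option 1: keep the HashMap, but sort by key before the order becomes
/// observable (for example, before logging, hashing, or scheduling work).
fn sorted_entries(map: &HashMap<String, u32>) -> Vec<(String, u32)> {
    let mut entries: Vec<(String, u32)> =
        map.iter().map(|(k, v)| (k.clone(), *v)).collect();
    entries.sort_by(|a, b| a.0.cmp(&b.0));
    entries
}

/// Option 2: store the data in a BTreeMap, whose iteration order is defined
/// by the key ordering rather than by a per-process hash seed.
fn ordered_keys(map: &BTreeMap<String, u32>) -> Vec<String> {
    map.keys().cloned().collect()
}
```

Iterating a default `std::collections::HashMap` directly gives an unspecified order that can change between processes, which is why the surface is only Audited. Either option above removes the dependency at the point where order matters, at the cost of a sort or of the logarithmic lookups of `BTreeMap`.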

Update Rule

A PR should update this document if it:

  • introduces a new source of nondeterminism,
  • changes the control status of an existing surface,
  • adds a new assumption about single-threading, iteration order, runtime ownership, or host behavior, or
  • removes a leak or upgrades a surface from Audited or Constrained to Controlled.