docs/concepts/architecture.md
This guide covers the architectural principles and structure of NautilusTrader.
:::note
Throughout the documentation, the term "Nautilus system boundary" refers to operations within the runtime of a single Nautilus node (also known as a "trader instance").
:::
The major architectural techniques and design patterns employed by NautilusTrader include ports and adapters, crash-only design, fail-fast error handling, and message-bus-driven communication, each covered in the sections below. These techniques help achieve certain architectural quality attributes. Architectural decisions are often a trade-off between competing priorities, and those quality attributes guide design and architectural decisions, roughly in order of weighting.
NautilusTrader is incrementally adopting a high-assurance mindset: critical code paths should carry executable invariants that verify behaviour matches the business requirements. Practically, this means we lean on mechanisms that make failures explicit (`Result` surfaces, `panic = abort`) and add targeted formal tools only where they pay for themselves. This approach preserves the platform's delivery cadence while giving high-stakes flows the additional scrutiny they need.
Further reading: High Assurance Rust.
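To make this concrete, here is a minimal sketch of an executable invariant; the names and the business rule are hypothetical, not actual NautilusTrader code:

```rust
// A minimal sketch of an executable invariant on a critical path:
// rather than silently producing a wrong value, the function surfaces
// a violated business rule through an explicit `Result`.
fn apply_fill(position_qty: u64, fill_qty: u64) -> Result<u64, String> {
    // Invariant: a fill can never exceed the open position quantity.
    if fill_qty > position_qty {
        return Err(format!(
            "invariant violated: fill_qty {fill_qty} > position_qty {position_qty}"
        ));
    }
    Ok(position_qty - fill_qty)
}

fn main() {
    assert_eq!(apply_fill(100, 40), Ok(60));
    assert!(apply_fill(10, 40).is_err()); // the fault is surfaced, not hidden
}
```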
NautilusTrader draws inspiration from crash-only design principles, particularly for handling unrecoverable faults. The core insight is that systems which can recover cleanly from crashes are more robust than those with separate (and rarely tested) graceful shutdown paths.
Key principles: treat restart as the primary recovery path, keep that path well-exercised, and avoid relying on rarely tested shutdown logic.
:::note
The system does provide graceful shutdown flows (stop, dispose) for normal operation. These
tear down clients, persist state, and flush writers. The crash-only philosophy applies specifically
to unrecoverable faults where attempting graceful cleanup could cause further damage.
:::
This design complements the fail-fast policy, where unrecoverable errors result in immediate process termination.
NautilusTrader prioritizes data integrity over availability for trading operations. The system employs a strict fail-fast policy for arithmetic operations and data handling to prevent silent data corruption that could lead to incorrect trading decisions.
The system will fail fast (panic or return an error) when encountering invalid inputs such as arithmetic overflow or non-finite values (`NaN`, infinities) in numerical data.
Rationale:
In trading systems, corrupt data is worse than no data. A single incorrect price, timestamp, or quantity can cascade through the system, producing erroneous orders, positions, and analytics. By crashing immediately on invalid data, NautilusTrader aims to surface faults at their source, before corruption can propagate into trading decisions.

Panics are used for conditions that indicate corrupted data or violated invariants, where continuing would be unsafe. `Result`s or `Option`s are used for failures the caller can reasonably anticipate and handle explicitly:
```rust
// CORRECT: Panics on overflow - prevents data corruption
let total_ns = timestamp1 + timestamp2; // Panics if the result would overflow u64::MAX

// CORRECT: Rejects NaN during deserialization
let price: Result<Price, _> = serde_json::from_str("NaN"); // Error: "must be finite"

// CORRECT: Explicit overflow handling when needed
let total_ns = timestamp1.checked_add(timestamp2)?; // Returns Option<UnixNanos>
```
This policy is implemented throughout the core types (`UnixNanos`, `Price`, `Quantity`, etc.)
and helps NautilusTrader maintain strong data correctness for production trading.
In production deployments, the system is typically configured with `panic = abort` in release builds,
ensuring that any panic results in a clean process termination that can be handled by process supervisors
or orchestration systems. This aligns with the crash-only design principle, where unrecoverable errors
lead to immediate restart rather than attempting to continue in a potentially corrupted state.
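The following sketch illustrates this combination; the profile stanza is standard Cargo configuration, while the sample code and its invariant are hypothetical:

```rust
// Release profile configured for abort-on-panic (in Cargo.toml):
//
//     [profile.release]
//     panic = "abort"
//
// With this setting, a panic terminates the process immediately instead
// of unwinding, so a supervisor (systemd, Kubernetes, etc.) observes the
// abnormal exit and restarts the node, which rebuilds state on startup.
fn main() {
    let ts_event: u64 = u64::MAX;
    // Checked arithmetic makes the overflow explicit in all builds; the
    // `expect` panics (and, under abort-on-panic, terminates the process).
    let ts_next = ts_event.checked_add(1).expect("timestamp overflow");
    println!("{ts_next}");
}
```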
The NautilusTrader codebase is both a framework for composing trading systems and a set of default system implementations that can operate in various environment contexts.
Several core components work together to form the trading system:
- `NautilusKernel`: The central orchestration component, responsible for assembling the system and coordinating the components below across environment contexts.
- `MessageBus`: The backbone of inter-component communication, passing data, commands, and events between components.
- `Cache`: High-performance in-memory storage that keeps market data and execution state available to every component.
- `DataEngine`: Processes and routes market data throughout the system.
- `ExecutionEngine`: Manages order lifecycle and execution.
- `RiskEngine`: Provides risk management, including pre-trade risk checks.
An environment context in NautilusTrader defines the type of data and trading venue you work with. Understanding these contexts matters for backtesting, development, and live trading.
Here are the available environments you can work with:
- **Backtest**: Historical data with simulated venues.
- **Sandbox**: Real-time data with simulated venues.
- **Live**: Real-time data with live venues (paper trading or real accounts).

The platform has been designed to share as much common code between backtest, sandbox, and live trading systems as possible.
This is formalized in the `system` subpackage, where you will find the `NautilusKernel` class,
providing a common core system 'kernel'.
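As a rough illustration of how the contexts differ, the sketch below pairs each environment with its data source and venue type; the enum and function are hypothetical stand-ins, not the platform's API:

```rust
// Hypothetical sketch of the three environment contexts, pairing each
// with its data source and venue type per the list above.
#[derive(Debug, Clone, Copy)]
enum Environment {
    Backtest,
    Sandbox,
    Live,
}

fn data_and_venue(env: Environment) -> (&'static str, &'static str) {
    match env {
        Environment::Backtest => ("historical data", "simulated venue"),
        Environment::Sandbox => ("real-time data", "simulated venue"),
        Environment::Live => ("real-time data", "live venue"),
    }
}

fn main() {
    for env in [Environment::Backtest, Environment::Sandbox, Environment::Live] {
        let (data, venue) = data_and_venue(env);
        println!("{env:?}: {data} + {venue}");
    }
}
```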
The ports and adapters architectural style enables modular components to be integrated into the core system, providing various hooks for user-defined or custom component implementations.
Understanding how data and execution flow through the system helps when working with the platform.
The following trace shows every step a `QuoteTick` takes from the network to your strategy. Trades and bars follow the same cache-then-publish path with different handler names. Order book deltas and depth snapshots take a different route (see the tip below the steps).
```mermaid
sequenceDiagram
    participant Adapter as DataClient adapter
    participant Channel as MPSC channel
    participant DE as DataEngine
    participant Cache as Cache
    participant MB as MessageBus
    participant Strategy as Strategy
    Adapter->>Channel: DataEvent::Data(Data::Quote(quote))
    Channel->>DE: process_data(Data::Quote)
    DE->>DE: handle_quote(quote)
    DE->>Cache: add_quote(quote)
    DE->>MB: publish_quote(topic, quote)
    MB->>Strategy: on_quote_tick(quote)
```
Step by step:

1. A `DataClient` adapter (e.g. Binance, Bybit) receives a WebSocket message, parses it, and constructs a `QuoteTick`.
2. The adapter sends `DataEvent::Data(Data::Quote(quote))` through an MPSC channel. In live mode this is an async unbounded channel; in backtests the engine feeds data directly.
3. The channel feeds `DataEngine::process_data`, which dispatches to `handle_quote`.
4. `handle_quote` writes the quote into the `Cache` via `cache.add_quote(quote)`, making it available to any component through `self.cache.quote_tick(instrument_id)`.
5. The engine publishes the quote on the `MessageBus` under a topic (e.g. `data.quotes.BINANCE.BTCUSDT-PERP`). The `MessageBus` finds all handlers subscribed to that topic.
6. The strategy's `on_quote_tick(quote)` handler runs on the single-threaded kernel. The quote is already in the cache before the handler executes, so `self.cache.quote_tick(instrument_id)` returns the same quote.

:::tip
For quotes, trades, and bars the cache-then-publish order means your strategy
handler can always read the latest value from the cache. Order book deltas and
depth snapshots are published directly; book state is maintained separately
through `BookUpdater` subscriptions.
:::
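The ordering guarantee described in the tip can be expressed as a small self-contained sketch; the types and method names below are simplified stand-ins, not the actual engine implementation:

```rust
// Sketch of the cache-then-publish invariant: the engine writes the
// quote to the cache *before* publishing it, so a handler reading the
// cache during its callback always sees at least the quote delivered.
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct QuoteTick {
    instrument_id: String,
    bid: f64,
    ask: f64,
}

struct Engine {
    cache: HashMap<String, QuoteTick>, // stand-in for the Cache
}

impl Engine {
    fn handle_quote(
        &mut self,
        quote: QuoteTick,
        on_quote_tick: &mut impl FnMut(&QuoteTick, &HashMap<String, QuoteTick>),
    ) {
        // 1. Cache first
        self.cache.insert(quote.instrument_id.clone(), quote.clone());
        // 2. Then publish to subscribed handlers
        on_quote_tick(&quote, &self.cache);
    }
}

fn main() {
    let mut engine = Engine { cache: HashMap::new() };
    let quote = QuoteTick {
        instrument_id: "BTCUSDT-PERP.BINANCE".into(),
        bid: 50_000.0,
        ask: 50_001.0,
    };
    engine.handle_quote(quote, &mut |q, cache| {
        // The cached quote equals the quote just delivered
        assert_eq!(cache.get(&q.instrument_id), Some(q));
        println!("on_quote_tick: {q:?}");
    });
}
```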
When a strategy submits an order, it flows through validation, routing, and back again as execution events:
```mermaid
sequenceDiagram
    participant Strategy as Strategy
    participant RE as RiskEngine
    participant EE as ExecutionEngine
    participant EC as ExecutionClient
    participant Venue as Venue
    Strategy->>RE: submit_order(command)
    RE->>RE: pre-trade risk checks
    RE->>EE: route command
    EE->>EC: submit_order
    EC->>Venue: place order (REST/WS)
    Venue-->>EC: OrderAccepted
    EC->>EE: OrderAccepted event
    EE->>Strategy: on_order_accepted(event)
    Venue-->>EC: OrderFilled
    EC->>EE: OrderFilled event
    EE->>Strategy: on_order_filled(event)
```
Step by step:

1. The strategy submits the order via `self.submit_order(order)`.
2. The `RiskEngine` performs pre-trade risk checks. If a check fails, it emits `OrderDenied` and the order never reaches the venue.
3. The `ExecutionEngine` routes the command to the `ExecutionClient` for the target venue, which places the order (REST/WS).
4. Venue events such as `OrderAccepted` and `OrderFilled` flow back through the `ExecutionEngine`, which updates order state in the `Cache` and delivers the event to the strategy's handler. Fill events also trigger position and portfolio updates (see the sketch below).
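The strategy-side view of this round trip can be sketched as follows; the event types and handler are simplified stand-ins for illustration, not the platform's actual types:

```rust
// Sketch of the event round trip a strategy observes: a command goes
// out (submit_order) and execution events come back in sequence.
#[derive(Debug)]
enum OrderEvent {
    Accepted { order_id: u64 },
    Filled { order_id: u64, qty: u64 },
}

struct Strategy;

impl Strategy {
    // Stand-in for the on_order_accepted / on_order_filled handlers
    fn on_order_event(&mut self, event: &OrderEvent) {
        match event {
            OrderEvent::Accepted { order_id } => {
                println!("on_order_accepted: order {order_id}");
            }
            OrderEvent::Filled { order_id, qty } => {
                // Fill events also trigger position and portfolio updates
                println!("on_order_filled: order {order_id} qty {qty}");
            }
        }
    }
}

fn main() {
    // Simulate the ExecutionEngine delivering events back to the strategy
    let mut strategy = Strategy;
    strategy.on_order_event(&OrderEvent::Accepted { order_id: 1 });
    strategy.on_order_event(&OrderEvent::Filled { order_id: 1, qty: 100 });
}
```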
All components follow a finite state machine pattern. The `ComponentState` enum defines both stable states and transitional states:
```mermaid
stateDiagram-v2
    [*] --> PRE_INITIALIZED
    PRE_INITIALIZED --> READY : register()
    READY --> STARTING : start()
    STARTING --> RUNNING
    RUNNING --> STOPPING : stop()
    STOPPING --> STOPPED
    STOPPED --> STARTING : start()
    STOPPED --> RESETTING : reset()
    RESETTING --> READY
    RUNNING --> RESUMING : resume()
    RESUMING --> RUNNING
    RUNNING --> DEGRADING : degrade()
    DEGRADING --> DEGRADED
    DEGRADED --> STOPPING : stop()
    DEGRADED --> FAULTING : fault()
    RUNNING --> FAULTING : fault()
    FAULTING --> FAULTED
    STOPPED --> DISPOSING : dispose()
    FAULTED --> DISPOSING : dispose()
    DISPOSING --> DISPOSED
    DISPOSED --> [*]
```
Stable states:

- `PRE_INITIALIZED`
- `READY`
- `RUNNING`
- `STOPPED`
- `DEGRADED`
- `FAULTED`
- `DISPOSED`

Transitional states:

- `STARTING`: entered on `start`.
- `STOPPING`: entered on `stop`.
- `RESETTING`: entered on `reset`.
- `RESUMING`: entered on `resume`.
- `DISPOSING`: entered on `dispose`.
- `DEGRADING`: entered on `degrade`.
- `FAULTING`: entered on `fault`.

Transitional states are brief intermediate states that occur during state transitions. Components should not remain in transitional states for extended periods.
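The transition rules can be validated with a simple match over `(state, trigger)` pairs. The sketch below covers a subset of the diagram and is illustrative only, not the actual implementation:

```rust
// A minimal FSM sketch: valid transitions produce the next state,
// anything else is rejected as an invalid transition.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ComponentState {
    PreInitialized,
    Ready,
    Starting,
    Running,
    Stopping,
    Stopped,
}

fn transition(state: ComponentState, trigger: &str) -> Result<ComponentState, String> {
    use ComponentState::*;
    match (state, trigger) {
        (PreInitialized, "register") => Ok(Ready),
        (Ready, "start") | (Stopped, "start") => Ok(Starting),
        (Starting, "completed") => Ok(Running),
        (Running, "stop") => Ok(Stopping),
        (Stopping, "completed") => Ok(Stopped),
        (s, t) => Err(format!("invalid transition: {s:?} on `{t}`")),
    }
}

fn main() {
    let mut state = ComponentState::PreInitialized;
    for trigger in ["register", "start", "completed", "stop", "completed"] {
        state = transition(state, trigger).expect("valid transition");
        println!("-> {state:?}");
    }
    // A stable state rejects triggers that are not defined for it
    assert!(transition(ComponentState::Stopped, "stop").is_err());
}
```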
At the Rust implementation level, the system distinguishes between two complementary traits:
```mermaid
classDiagram
    class Actor {
        <<trait>>
        +id() Ustr
        +handle(message)
    }
    class Component {
        <<trait>>
        +component_id() ComponentId
        +state() ComponentState
        +register()
        +start()
        +stop()
        +reset()
        +dispose()
    }
    class ActorRegistry {
        +insert(actor)
        +get(id) ActorRef
    }
    class ComponentRegistry {
        +insert(component)
        +get(id) ComponentRef
    }
    Actor <|.. Throttler : implements
    Actor <|.. Strategy : implements
    Component <|.. Strategy : implements
    Component <|.. DataEngine : implements
    Component <|.. ExecutionEngine : implements
    ActorRegistry --> Actor : manages
    ComponentRegistry --> Component : manages
    class Throttler {
        Actor only
    }
    class Strategy {
        Actor + Component
    }
    class DataEngine {
        Component only
    }
    class ExecutionEngine {
        Component only
    }
```
`Actor` trait - Message dispatch:

- Provides a `handle` method for receiving messages dispatched through the actor registry.

`Component` trait - Lifecycle management:

- Provides the lifecycle methods (`start`, `stop`, `reset`, `dispose`).
- Supports registration with the system (`register`).

:::note
All components can publish and subscribe to messages via the MessageBus directly - this is independent of the Actor trait. The Actor trait specifically enables the registry-based message dispatch pattern where messages are routed to a specific actor by ID.
:::
This separation allows:

- Lightweight actors that only need message handling (e.g. `Throttler`).
- Engine components that need lifecycle management but not actor dispatch (e.g. `DataEngine`, `ExecutionEngine`).

The traits are managed by separate registries to support their different access patterns: lifecycle methods are called sequentially, while message handlers may be invoked re-entrantly during callbacks.
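A compact sketch of the two-trait split (signatures simplified; the real traits are shown in the diagram above):

```rust
// Strategy-like types implement both traits: Actor for registry-based
// message dispatch, Component for lifecycle management. Engines would
// implement only Component; a Throttler only Actor.
trait Actor {
    fn id(&self) -> &str;
    fn handle(&mut self, message: &str); // message type simplified
}

trait Component {
    fn start(&mut self);
    fn stop(&mut self);
}

struct MyStrategy {
    id: String,
}

impl Actor for MyStrategy {
    fn id(&self) -> &str {
        &self.id
    }
    fn handle(&mut self, message: &str) {
        println!("{} handled: {message}", self.id);
    }
}

impl Component for MyStrategy {
    fn start(&mut self) {
        println!("{} started", self.id);
    }
    fn stop(&mut self) {
        println!("{} stopped", self.id);
    }
}

fn main() {
    let mut strategy = MyStrategy { id: "MyStrategy-001".into() };
    strategy.start(); // lifecycle via Component
    strategy.handle("QuoteTick"); // dispatch via Actor
    strategy.stop();
}
```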
For modularity and loose coupling, an efficient `MessageBus` passes messages (data, commands, and events) between components.
Within a node, the kernel consumes and dispatches messages on a single thread, encompassing the `MessageBus` and actor callback dispatch.

This single-threaded core provides deterministic event ordering and helps maintain backtest-live parity, though live inputs and latency can still cause behavioral differences. Components consume messages synchronously in a pattern similar to the actor model.
:::note
Of interest is the LMAX exchange architecture, which achieves award-winning performance running on a single thread. You can read about their disruptor pattern based architecture in this interesting article by Martin Fowler.
:::
Background services (for example, adapter network clients and persistence writers) use separate threads or async runtimes.
These services communicate results back to the kernel via the `MessageBus`. The bus itself is thread-local,
so each thread has its own instance, with cross-thread communication occurring through channels that
ultimately deliver events to the single-threaded core.
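The cross-thread pattern can be sketched with a standard-library channel; the event type below is a simplified stand-in:

```rust
// Sketch of cross-thread delivery: a background "adapter" thread sends
// events over a channel; the single-threaded core drains the channel
// and dispatches handlers synchronously, preserving event order.
use std::sync::mpsc;
use std::thread;

enum DataEvent {
    Quote(u64), // stand-in for Data::Quote(QuoteTick)
}

fn main() {
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        // Background thread parses a message and forwards the event
        tx.send(DataEvent::Quote(42)).unwrap();
    });

    // Single-threaded core: consume and dispatch in arrival order
    for event in rx {
        match event {
            DataEvent::Quote(q) => println!("dispatch on_quote_tick({q})"),
        }
    }
}
```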
The codebase organizes into layers of abstraction, grouped into logical subpackages of cohesive concepts. You can navigate to the documentation for each subpackage from the left nav menu.
- `core`: Constants, functions, and low-level components used throughout the framework.
- `common`: Common parts for assembling the framework's various components.
- `network`: Low-level base components for networking clients.
- `serialization`: Serialization base components and serializer implementations.
- `model`: Defines a rich trading domain model.
- `accounting`: Different account types and account management machinery.
- `adapters`: Integration adapters for the platform, including brokers and exchanges.
- `analysis`: Components relating to trading performance statistics and analysis.
- `cache`: Provides common caching infrastructure.
- `data`: The data stack and data tooling for the platform.
- `execution`: The execution stack for the platform.
- `indicators`: A set of efficient indicators and analyzers.
- `persistence`: Data storage, cataloging, and retrieval, mainly to support backtesting.
- `portfolio`: Portfolio management functionality.
- `risk`: Risk-specific components and tooling.
- `trading`: Trading domain specific components and tooling.
- `backtest`: Backtesting componentry, as well as a backtest engine and node implementations.
- `live`: Live engine and client implementations, as well as a node for live trading.
- `system`: The core system kernel, common between the backtest, sandbox, and live environment contexts.

The foundation of the codebase is the `crates` directory, containing a collection of Rust crates, including a C foreign function interface (FFI) generated by `cbindgen`.
The bulk of the production code resides in the `nautilus_trader` directory, which contains a collection of Python/Cython subpackages and modules.
Python bindings for the Rust core are provided by statically linking the Rust libraries to the C extension modules generated by Cython at compile time (effectively extending the CPython API).
```mermaid
flowchart TB
    subgraph trader["nautilus_trader<br/>Python / Cython"]
    end
    subgraph core["crates<br/>Rust"]
    end
    trader -->|"C API"| core
```
The `crates/` directory contains the Rust implementation, organized into focused crates with clear dependency boundaries.
Feature flags control optional functionality: for example, `streaming` enables persistence for catalog-based data streaming,
and `cloud` enables cloud storage backends (S3, Azure, GCP).
Dependency flow (arrows point to dependencies):
```mermaid
flowchart BT
    subgraph Foundation
        core
        model
        common
        system
        trading
    end
    subgraph Infrastructure
        serialization
        network
        cryptography
        persistence
    end
    subgraph Engines
        data
        execution
        portfolio
        risk
    end
    subgraph Runtime
        live
        backtest
    end
    adapters
    pyo3
    model --> core
    common --> core
    common --> model
    system --> common
    trading --> common
    serialization --> model
    network --> common
    network --> cryptography
    persistence --> serialization
    data --> common
    execution --> common
    portfolio --> common
    risk --> portfolio
    live --> system
    live --> trading
    backtest --> system
    backtest --> persistence
    adapters --> live
    adapters --> network
    pyo3 --> adapters
```
Crate categories:
| Category | Crates | Purpose |
|---|---|---|
| Foundation | core, model, common, system, trading | Primitives, domain model, kernel, actor & strategy base. |
| Engines | data, execution, portfolio, risk | Core trading engine components. |
| Infrastructure | serialization, network, cryptography, persistence | Encoding, networking, signing, storage. |
| Runtime | live, backtest | Environment‑specific node implementations. |
| External | adapters/* | Venue and data integrations. |
| Bindings | pyo3 | Python bindings. |
Feature flags:
| Feature | Crates | Effect |
|---|---|---|
| `streaming` | data, system, live | Enables persistence dependency for catalog streaming. |
| `cloud` | persistence | Enables cloud storage backends (S3, Azure, GCP, HTTP). |
| `python` | most crates | Enables PyO3 bindings (auto-enables `streaming`, `cloud`). |
| `defi` | common, model, data | Enables DeFi/blockchain data types. |
:::note
Both Rust and Cython are build dependencies. The binary wheels produced from a build do not require Rust or Cython to be installed at runtime.
:::
The platform design prioritizes software correctness and safety.
The Rust codebase under `crates/` relies on the `rustc` compiler's guarantees for safe code.
Any `unsafe` blocks are explicit opt-outs where we must uphold the required invariants ourselves
(see the Rust section of the Developer Guide); overall memory and type safety
depend on those invariants holding.
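The convention for those opt-outs can be illustrated with a small, hypothetical example of the `SAFETY` comment pattern:

```rust
// Sketch of an explicit opt-out: the `unsafe` block documents the
// invariant the surrounding safe code is responsible for upholding.
fn first_byte(bytes: &[u8]) -> u8 {
    assert!(!bytes.is_empty(), "invariant: slice must be non-empty");
    // SAFETY: the assert above guarantees index 0 is in bounds.
    unsafe { *bytes.get_unchecked(0) }
}

fn main() {
    assert_eq!(first_byte(b"nautilus"), b'n');
}
```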
Cython provides type safety at the C level, at both compile time and runtime:
:::info
If you pass an argument with an invalid type to a Cython-implemented module with typed parameters,
then you will receive a `TypeError` at runtime.
:::
If a function or method's parameter is not explicitly typed to accept `None`, passing `None` as an
argument will result in a `ValueError` at runtime.
:::warning
The above exceptions are not explicitly documented to prevent excessive bloating of the docstrings.
:::
The documentation aims to cover all possible exceptions that NautilusTrader code can raise, and the conditions that trigger them.
:::warning
There may be other undocumented exceptions which can be raised by Python's standard library, or from third party library dependencies.
:::
:::warning[One node per process]
Running multiple `TradingNode` or `BacktestNode` instances concurrently in the same process is not supported due to global singleton state:

- The `_FORCE_STOP` global flag is shared across all engines in the process.
- `OnceLock` instances are process-wide.

Sequential execution of multiple nodes (one after another, with proper disposal between runs) is fully supported and used in the test suite.

For production deployments, add multiple strategies to a single `TradingNode` within a process. For parallel execution or workload isolation, run each node in its own separate process.

:::