docs/mintlify/reference/architecture/distributed.mdx
Distributed Chroma is designed for large-scale production workloads. Its components run as independent services so the system can scale horizontally while keeping a consistent API for clients.
Regardless of deployment mode, Chroma is composed of five core components. Each plays a distinct role in the system and operates over the shared Chroma data model.
The gateway is the entrypoint for client traffic.
The log is Chroma's write-ahead log.
The query executor is responsible for all read operations.
The compactor periodically builds and maintains indexes.
The system database is Chroma's internal catalog.
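The division of labor among the five components can be sketched as a toy, in-process model. This is an illustration only, not Chroma's implementation: every class and method name below is hypothetical, the "index" is a plain dictionary, and having the query executor overlay the uncompacted tail of the log is an assumption made so that reads stay fresh in the toy.

```python
from dataclasses import dataclass, field


@dataclass
class SystemDatabase:
    """Toy catalog: only tracks which collections exist."""
    collections: set = field(default_factory=set)


@dataclass
class Log:
    """Toy write-ahead log: an append-only list of records per collection."""
    records: dict = field(default_factory=dict)

    def append(self, collection, record_id, document):
        self.records.setdefault(collection, []).append((record_id, document))


@dataclass
class Index:
    """Toy index: a dict plus the log offset it has been built up to."""
    data: dict = field(default_factory=dict)
    offset: int = 0


class Compactor:
    """Folds not-yet-indexed log records into the index."""

    def __init__(self, log, indexes):
        self.log, self.indexes = log, indexes

    def compact(self, collection):
        index = self.indexes.setdefault(collection, Index())
        pending = self.log.records.get(collection, [])[index.offset:]
        for record_id, document in pending:
            index.data[record_id] = document
        index.offset += len(pending)
        return len(pending)  # number of records folded in this pass


class QueryExecutor:
    """Serves reads from the index, overlaying the uncompacted log tail
    (an assumption of this sketch)."""

    def __init__(self, log, indexes):
        self.log, self.indexes = log, indexes

    def get(self, collection, record_id):
        index = self.indexes.get(collection, Index())
        result = index.data.get(record_id)
        for rid, document in self.log.records.get(collection, [])[index.offset:]:
            if rid == record_id:
                result = document  # later log entries win
        return result


class Gateway:
    """Entrypoint: checks the catalog, sends writes to the log and
    reads to the query executor."""

    def __init__(self):
        self.sysdb = SystemDatabase()
        self.log = Log()
        self.indexes = {}
        self.compactor = Compactor(self.log, self.indexes)
        self.executor = QueryExecutor(self.log, self.indexes)

    def _check(self, collection):
        if collection not in self.sysdb.collections:
            raise KeyError(f"unknown collection: {collection}")

    def create_collection(self, name):
        self.sysdb.collections.add(name)

    def add(self, collection, record_id, document):
        self._check(collection)
        self.log.append(collection, record_id, document)

    def get(self, collection, record_id):
        self._check(collection)
        return self.executor.get(collection, record_id)


gateway = Gateway()
gateway.create_collection("docs")
gateway.add("docs", "a", "hello")
before = gateway.get("docs", "a")            # answered from the log tail
compacted = gateway.compactor.compact("docs")
after = gateway.get("docs", "a")             # answered from the index
```

In the toy, a write only touches the log, so it stays cheap, while index building happens separately in the compactor. That separation is the property the component list describes; the details of how the real services communicate are not modeled here.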
In distributed mode, Chroma's components are deployed independently.
This design separates compute from storage and lets Chroma scale collections and traffic without tying the whole system to a single machine.
Distributed Chroma is built on object storage to provide durable, low-cost storage at large scale. Object storage can deliver very high throughput, but it also introduces a higher baseline latency than local disk.
To reduce that latency penalty, Chroma relies heavily on SSD caching. The first query against a collection fetches a subset of the required data from object storage, which can add cold-start latency. As the SSD cache warms, later queries can be served from the local cache instead of repeatedly hitting object storage.
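The cold-start and warm-cache behavior can be illustrated with a minimal read-through cache. This is a sketch under stated assumptions, not Chroma's caching code: `ObjectStore` and `ReadThroughCache` are hypothetical names, the cache lives in memory rather than on SSD, and least-recently-used eviction is assumed because the eviction policy is not described here.

```python
from collections import OrderedDict


class ObjectStore:
    """Stand-in for object storage; counts how often it is hit."""

    def __init__(self, blobs):
        self.blobs = dict(blobs)
        self.fetches = 0

    def get(self, key):
        self.fetches += 1  # each call models one slow remote read
        return self.blobs[key]


class ReadThroughCache:
    """Serves hits locally and falls back to the store on a miss.
    LRU eviction is an assumption of this sketch."""

    def __init__(self, store, capacity):
        self.store = store
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        value = self.store.get(key)        # cold read from object storage
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return value


store = ObjectStore({"block-a": b"A", "block-b": b"B", "block-c": b"C"})
cache = ReadThroughCache(store, capacity=2)

cache.get("block-a")             # cold: fetched from the store
cache.get("block-a")             # warm: served from the cache
fetches_after_warm = store.fetches

cache.get("block-b")             # cold
cache.get("block-c")             # cold, evicts block-a
cache.get("block-a")             # cold again after eviction
fetches_total = store.fetches
```

Only misses reach the store, so repeated reads of the same data stop paying the object-storage latency once the cache is warm. When the working set exceeds the cache capacity, evicted data pays the cold-read cost again.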