ARCHITECTURE.md

Last updated: 2026-04-13 · Revision: 1 (draft)

This document describes the high-level architecture of RustFS. If you want to familiarize yourself with the code base, you are in the right place!

See also CONTRIBUTING.md for development workflow.

Bird's Eye View

RustFS is a high-performance, S3-compatible distributed object storage system written in Rust. It uses erasure coding for data durability, supports multi-tenancy through IAM/STS, and provides a web-based admin console.

A running RustFS node exposes:

  • S3 API (port 9000) — the primary data path for object CRUD
  • Admin API (port 9000, /minio/ prefix) — cluster management, IAM, metrics
  • Console (port 9001) — web UI backed by the Admin API
  • Inter-node RPC (gRPC/tonic) — cluster communication for distributed mode

The core data flow for a PUT request looks like:

HTTP request
  → server (TLS, auth, routing, compression)
    → app/object_usecase (validation, policy, lifecycle)
      → storage/ecfs (erasure coding, encryption, checksums)
        → ecstore (disk pool selection, data distribution)
          → rio (reader pipeline: encrypt → compress → hash → write)
            → io-core (zero-copy I/O, buffer pool, direct I/O)
              → local disk / remote disk via RPC
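
The reader pipeline in the middle of this chain is built from composable wrappers. As a conceptual sketch only (synchronous std::io and a toy hasher; the real rio readers are async and use proper digests), a pass-through reader that hashes whatever flows through it looks like this:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::io::{Cursor, Read};

/// Pass-through reader that feeds every byte it returns into a hasher.
struct HashingReader<R: Read> {
    inner: R,
    hasher: DefaultHasher,
}

impl<R: Read> HashingReader<R> {
    fn new(inner: R) -> Self {
        Self { inner, hasher: DefaultHasher::new() }
    }

    fn finish(self) -> u64 {
        self.hasher.finish()
    }
}

impl<R: Read> Read for HashingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        let n = self.inner.read(buf)?;
        self.hasher.write(&buf[..n]);
        Ok(n)
    }
}

fn main() -> std::io::Result<()> {
    // Wrappers stack: encrypt(compress(hash(source))) in the real pipeline,
    // so data is transformed on the fly without intermediate buffering.
    let source = Cursor::new(b"object payload".to_vec());
    let mut reader = HashingReader::new(source);

    let mut sink = Vec::new();
    reader.read_to_end(&mut sink)?;
    println!("bytes: {}, digest: {:x}", sink.len(), reader.finish());
    Ok(())
}
```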

Code Map

The repository is a Cargo workspace with a flat crates/ layout:

rustfs/                      # Workspace root (virtual manifest)
├── rustfs/                  # Main binary + library crate (75K lines)
│   └── src/
│       ├── main.rs          # Entry point, startup sequence
│       ├── lib.rs           # Module tree root
│       ├── server/          # HTTP server, TLS, routing, middleware
│       ├── admin/           # Admin API handlers and console
│       ├── app/             # Use-case layer (object, bucket, multipart)
│       ├── storage/         # Storage engine interface and implementation
│       ├── auth.rs          # S3 request authentication
│       ├── config/          # CLI args, config parsing, workload profiles
│       └── ...
├── crates/                  # 39 library crates
│   ├── ecstore/             # Erasure-coded storage engine (⚠️ 87K lines)
│   ├── rio/                 # Reader I/O pipeline (encrypt, compress, hash)
│   ├── io-core/             # Zero-copy I/O, scheduling, buffer pool
│   ├── io-metrics/          # I/O metrics collection
│   ├── common/              # Shared runtime state, globals, data usage types
│   ├── config/              # Configuration types and parsing
│   ├── utils/               # Pure utility functions
│   ├── ...                  # (see "Crate Reference" below)
│   └── e2e_test/            # End-to-end integration tests
└── docs/                    # Design documents and analysis

Main Crate Layers (rustfs/src/)

The main crate is organized in layers, top to bottom:

| Layer | Directory | Responsibility |
| --- | --- | --- |
| Server | server/ | HTTP listener, TLS, CORS, compression, middleware, graceful shutdown |
| Admin | admin/ | Admin API routing, 30+ handler modules, web console |
| App | app/ | Use-case orchestration: object_usecase, bucket_usecase, multipart_usecase |
| Storage | storage/ | S3 API translation, erasure-coded FS, SSE encryption, RPC, concurrency |
| Auth | auth.rs | S3 signature verification, credential validation |
| Config | config/ | CLI parsing, config struct, workload profiles |

A request flows downward through the layers. No layer should reach upward (e.g., storage must not import from admin).

Crate Reference

Crates are organized in a dependency DAG with 9 depth levels (0 = leaf, 8 = top):

Depth 0 — LEAF (no internal deps):
  appauth, checksums, config, credentials, crypto, io-metrics,
  madmin, s3-common, workers, zip

Depth 1:
  io-core (→ io-metrics)
  policy (→ config, credentials, crypto)
  utils (→ config)                        ⚠️ inverted: utils should be leaf

Depth 2:
  concurrency, filemeta, keystone, kms, lock, obs,
  signer, targets, trusted-proxies

Depth 3:
  common (→ filemeta, madmin)             ⚠️ inverted: common should be leaf

Depth 4:
  object-capacity, protos, rio

Depth 5 — CORE:
  ecstore (16 internal deps, 11 dependents — the architectural heart)

Depth 6:
  audit, heal, iam, metrics, notify, s3select-api, scanner

Depth 7:
  object-io, protocols, s3select-query

Depth 8 — TOP:
  rustfs (35 internal deps — the binary, depends on almost everything)

By Domain

Core Infrastructure:

| Crate | Lines | Purpose |
| --- | --- | --- |
| config | 3.3K | Configuration types and environment parsing |
| utils | 8.7K | Pure utilities (paths, compression, network, retry) |
| common | 4.4K | Shared runtime state, globals, data usage types, metrics |
| madmin | 5.5K | Admin API request/response types |

I/O Pipeline:

| Crate | Lines | Purpose |
| --- | --- | --- |
| io-core | 6.5K | Zero-copy I/O, buffer pool, direct I/O, scheduling, backpressure |
| io-metrics | 4.5K | I/O operation metrics and counters |
| rio | 6.9K | Composable reader chain (encrypt → compress → hash → limit) |
| object-io | 2.4K | High-level object read/write using rio + ecstore |
| concurrency | 1.8K | Concurrency control wrappers over io-core |

Storage Engine:

| Crate | Lines | Purpose |
| --- | --- | --- |
| ecstore | 87K | ⚠️ Erasure-coded storage: disks, pools, buckets, replication, lifecycle |
| filemeta | 10K | File/object metadata types and versioning |
| checksums | 732 | Checksum computation |
| lock | 7.1K | Distributed lock manager |
| heal | 5.9K | Data healing / bitrot repair |
| scanner | 5.4K | Background data usage scanner |
| object-capacity | 2.5K | Capacity tracking and management |

Security & Auth:

| Crate | Lines | Purpose |
| --- | --- | --- |
| crypto | 1.6K | Encryption primitives |
| credentials | 713 | Credential types (access key / secret key) |
| signer | 1.4K | S3 v4 request signing |
| iam | 9.0K | Identity and access management |
| policy | 8.8K | Policy engine (S3 bucket/IAM policies) |
| kms | 8.1K | Key management service integration |
| keystone | 1.9K | OpenStack Keystone auth |
| appauth | 143 | Application-level auth tokens |

Protocol & API:

| Crate | Lines | Purpose |
| --- | --- | --- |
| protos | 5.7K | Protobuf/gRPC definitions for inter-node RPC |
| protocols | 18K | FTP/FTPS, WebDAV, Swift API support |
| s3-common | 738 | Shared S3 types |
| s3select-api | 1.9K | S3 Select interface |
| s3select-query | 3.6K | S3 Select query engine |

Observability:

| Crate | Lines | Purpose |
| --- | --- | --- |
| metrics | 8.4K | Prometheus metric collectors |
| io-metrics | 4.5K | I/O-specific metrics |
| obs | 5.6K | OpenTelemetry tracing and telemetry |
| audit | 2.4K | Audit logging |

Events:

| Crate | Lines | Purpose |
| --- | --- | --- |
| notify | 5.5K | Event notification system |
| targets | 3.2K | Notification targets (Kafka, AMQP, webhook, etc.) |

Other:

| Crate | Lines | Purpose |
| --- | --- | --- |
| trusted-proxies | 4.0K | Trusted proxy / IP forwarding |
| zip | 986 | ZIP archive support for bulk downloads |
| workers | 136 | Simple worker abstraction |

Architecture Invariants

These are rules that the codebase should follow. Some are currently violated (marked with ⚠️). Documenting them here makes the violations explicit and trackable.

  1. Layers flow downward. Server → Admin/App → Storage → ecstore → rio/io-core. No upward imports.

  2. Leaf crates have zero internal dependencies. config, credentials, crypto, io-metrics, madmin, s3-common should depend only on external crates.

    • ⚠️ VIOLATED: utils depends on config, common depends on filemeta and madmin.
  3. Each type has exactly one definition. Types shared across crates must be defined in one crate and re-exported or imported by others (see the sketch after this list).

    • ⚠️ VIOLATED: ReplicationStats (4 copies), LastMinuteLatency (3 copies), BackpressureConfig (3 copies), DataUsageInfo (2 copies).
  4. ecstore does not know about HTTP or S3 protocol details. It operates on storage-level abstractions (objects, buckets, disks, pools).

  5. The rustfs binary crate is the only place that wires everything together. Individual crates should be testable in isolation.

  6. Error types use thiserror with descriptive names (e.g., StorageError, not bare Error).

    • ⚠️ VIOLATED: 6 crates use pub enum Error; 2 crates use snafu; heal uses anyhow in library code.
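
A sketch of invariant 3, with crates simulated as modules and illustrative fields (not the real DataUsageInfo layout): the type is defined once in its owning crate and re-exported or imported everywhere else.

```rust
// Owning crate, sketched as a module (in the real tree this is the common crate).
mod common {
    #[derive(Debug, Default)]
    pub struct DataUsageInfo {
        pub objects: u64, // illustrative fields only
        pub bytes: u64,
    }
}

// Dependent crate (e.g. the scanner): re-export instead of keeping a copy.
mod scanner {
    pub use crate::common::DataUsageInfo;

    pub fn scan() -> DataUsageInfo {
        DataUsageInfo { objects: 1, bytes: 4096 }
    }
}

fn main() {
    println!("{:?}", scanner::scan());
}
```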

Known Structural Issues

This section documents known problems in the current architecture. It exists so the team can track and address them deliberately.

Critical

  • common/scanner code duplication (~3K lines). scanner depends on common but maintains its own copies of DataUsageInfo, LastMinuteLatency, and related types instead of importing them.

  • ecstore is a monolith (87K lines, 163 files). It contains disk management, bucket management, erasure coding, replication, lifecycle, RPC, and configuration — all in one crate. It should be decomposed along its existing subdirectories.

High

  • Dependency inversions. utils → config and common → filemeta/madmin break the layering model. These need to be untangled.

  • Three-layer BackpressureConfig/DeadlockConfig duplication across io-core, concurrency, and rustfs/storage. Should be defined once with builder/composition.
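
A hedged sketch of the intended fix (field names are illustrative, not the actual io-core types): the config is defined once in the lowest layer that needs it, and higher layers compose or override it instead of redeclaring it.

```rust
/// Single definition, owned by the lowest layer that needs it (io-core here).
#[derive(Debug, Clone)]
pub struct BackpressureConfig {
    pub max_inflight_bytes: usize,
    pub high_watermark: f64,
}

impl Default for BackpressureConfig {
    fn default() -> Self {
        Self { max_inflight_bytes: 64 * 1024 * 1024, high_watermark: 0.8 }
    }
}

// A higher layer (concurrency, rustfs/storage) composes rather than redefines.
fn storage_layer_config() -> BackpressureConfig {
    BackpressureConfig { high_watermark: 0.9, ..BackpressureConfig::default() }
}

fn main() {
    println!("{:?}", storage_layer_config());
}
```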

Medium

  • Inconsistent error handling. Three strategies (thiserror/snafu/anyhow) and mixed naming (bare Error vs descriptive names).

  • Ambiguous common vs utils boundary. Both described as "utilities and data structures." Need clear ownership rules.

Cross-Cutting Concerns

Error Handling

The project convention is thiserror for typed errors with descriptive names. See AGENTS.md: "Prefer thiserror for library-facing error types."

```rust
// GOOD
#[derive(Debug, thiserror::Error)]
pub enum StorageError {
    #[error("disk not found: {0}")]
    DiskNotFound(String),
}

// AVOID
pub enum Error { ... }        // too generic
anyhow::Result<T>             // in library code (OK in tests/CLI)
```
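
Errors that cross crate boundaries compose through thiserror's #[from], so lower-level errors are wrapped rather than re-stringified. A minimal sketch with illustrative variants (not the actual rustfs error types):

```rust
use thiserror::Error;

// A lower-level crate's error (illustrative).
#[derive(Debug, Error)]
pub enum DiskError {
    #[error("I/O failure: {0}")]
    Io(#[from] std::io::Error),
}

// A higher-level crate wraps it instead of flattening it into a string.
#[derive(Debug, Error)]
pub enum StorageError {
    #[error("disk not found: {0}")]
    DiskNotFound(String),
    #[error("disk layer failed")]
    Disk(#[from] DiskError),
}

fn read_from_disk() -> Result<Vec<u8>, DiskError> {
    Ok(std::fs::read("/tmp/block")?) // io::Error -> DiskError via #[from]
}

fn read_block() -> Result<Vec<u8>, StorageError> {
    Ok(read_from_disk()?) // DiskError -> StorageError via #[from]
}

fn main() {
    match read_block() {
        Ok(bytes) => println!("read {} bytes", bytes.len()),
        Err(err) => eprintln!("error: {err}"),
    }
}
```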

Logging & Tracing

  • Use tracing crate (info!, warn!, error!, debug!, trace!)
  • Structured fields: tracing::info!(bucket = %name, "created bucket")
  • Spans for request-scoped context
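
A short sketch of these conventions (subscriber setup shown with tracing_subscriber for brevity; in the server it is wired up through rustfs-obs, and the span and field names here are illustrative):

```rust
use tracing::{info, info_span, warn};

fn create_bucket(name: &str) {
    // Structured fields: `%` records via Display, `?` via Debug.
    info!(bucket = %name, "created bucket");
}

fn main() {
    // Plain stdout subscriber for the example.
    tracing_subscriber::fmt::init();

    // A span carries request-scoped context into every event inside it.
    let span = info_span!("put_object", bucket = "demo", key = "a/b.txt");
    let _guard = span.enter();

    info!(size = 1024usize, "storing object");
    warn!(reason = "quota", "upload throttled");

    create_bucket("demo");
}
```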

Metrics

  • Prometheus-style metrics via rustfs-obs runtime and schema
  • I/O-specific counters via rustfs-io-metrics
  • Registration happens at crate level, collection/reporting in rustfs-obs
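
An illustration of the Prometheus-style pattern, shown with the prometheus and once_cell crates directly and an invented metric name; the actual registration schema lives in rustfs-obs / rustfs-metrics and differs in detail:

```rust
use once_cell::sync::Lazy;
use prometheus::{register_int_counter_vec, Encoder, IntCounterVec, TextEncoder};

// Registered once, labelled by S3 API operation (metric name is illustrative).
static S3_REQUESTS: Lazy<IntCounterVec> = Lazy::new(|| {
    register_int_counter_vec!(
        "rustfs_s3_requests_total",
        "Total S3 API requests",
        &["api"]
    )
    .expect("metric registers")
});

fn main() {
    S3_REQUESTS.with_label_values(&["PutObject"]).inc();

    // A scrape endpoint would render the default registry like this.
    let mut buf = Vec::new();
    TextEncoder::new()
        .encode(&prometheus::gather(), &mut buf)
        .expect("metrics encode");
    println!("{}", String::from_utf8(buf).unwrap());
}
```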

Testing

  • Unit tests: #[cfg(test)] mod tests in the same file
  • Integration tests: inside respective crates (not top-level tests/)
  • E2E tests: crates/e2e_test/ — tests against a running server
  • Run all: make test or cargo nextest run
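
A minimal sketch of the unit-test convention (the helper function is invented for illustration):

```rust
/// Joins a bucket and key into an object path (illustrative helper only).
pub fn object_path(bucket: &str, key: &str) -> String {
    format!("{}/{}", bucket.trim_end_matches('/'), key.trim_start_matches('/'))
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn joins_without_duplicate_slashes() {
        assert_eq!(object_path("photos/", "/2024/cat.png"), "photos/2024/cat.png");
    }
}
```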

Startup Sequence

The binary (main.rs) boots in this order:

  1. Environment variable compatibility (MINIO_* → RUSTFS_*; see the sketch after this list)
  2. Tokio runtime construction
  3. CLI argument parsing
  4. License, observability, TLS, trusted proxies initialization
  5. Config parsing, server address resolution
  6. Credentials, endpoints, local disks, lock client initialization
  7. Capacity management initialization
  8. HTTP server start (S3 API + optional console)
  9. ECStore initialization (erasure coding storage engine)
  10. Global config, background replication, KMS
  11. Optional: FTP/FTPS/WebDAV servers
  12. Event notifier, audit system, deadlock detector
  13. Bucket metadata, IAM, Keystone, OIDC
  14. Scanner and heal manager
  15. Metrics system, mark FullReady
  16. Wait for shutdown signal → graceful shutdown
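
Step 1, for example, amounts to translating legacy MinIO variable names into their RustFS equivalents. A hedged sketch of that mapping (not the actual main.rs code; the real precedence rules may differ):

```rust
use std::env;

/// Translate MINIO_*-prefixed variables into RUSTFS_* equivalents,
/// keeping any RUSTFS_* value the operator already set explicitly.
fn minio_compat_env() -> Vec<(String, String)> {
    env::vars()
        .filter(|(key, _)| key.starts_with("MINIO_"))
        .filter_map(|(key, value)| {
            let rustfs_key = key.replacen("MINIO_", "RUSTFS_", 1);
            match env::var_os(&rustfs_key) {
                Some(_) => None, // an explicit RUSTFS_ setting wins
                None => Some((rustfs_key, value)),
            }
        })
        .collect()
}

fn main() {
    for (key, value) in minio_compat_env() {
        println!("{key}={value}");
    }
}
```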

Dependency Diagram (Simplified)

                           ┌─────────┐
                           │  rustfs │  (binary + lib, 75K lines)
                           │  main   │
                           └────┬────┘
                                │
              ┌─────────────────┼─────────────────┐
              │                 │                 │
        ┌─────▼─────┐     ┌─────▼─────┐     ┌─────▼─────┐
        │  server   │     │   admin   │     │    app    │
        │  (HTTP)   │     │ (console) │     │(use-cases)│
        └─────┬─────┘     └─────┬─────┘     └─────┬─────┘
              │                 │                 │
              └─────────────────┼─────────────────┘
                                │
                         ┌──────▼──────┐
                         │   storage   │
                         │ (ecfs, SSE, │
                         │  RPC, ACL)  │
                         └──────┬──────┘
                                │
            ┌───────────────────┼───────────────────┐
            │                   │                   │
     ┌──────▼──────┐     ┌──────▼──────┐     ┌──────▼──────┐
     │   ecstore   │     │     rio     │     │   io-core   │
     │ (87K, core) │     │  (readers)  │     │ (zero-copy) │
     └──────┬──────┘     └─────────────┘     └─────────────┘
            │
   ┌──────┬─┴───┬──────┬───────┬─────┐
   │      │     │      │       │     │
 common utils config policy filemeta ...

How to Navigate

  • "Where does S3 PutObject go?" server/ routes → app/object_usecase validates → storage/ecfs encodes → ecstore distributes → rio encrypts/compresses → io-core writes

  • "Where are bucket policies enforced?" app/bucket_usecase calls into crates/policy/

  • "Where is replication configured?" admin/handlers/replication.rs and admin/handlers/site_replication.rs for API, ecstore/src/bucket/replication/ for engine

  • "Where do I add a new admin endpoint?" Add handler in admin/handlers/, register in admin/router.rs

  • "Where do I add a new metric?" Define descriptor/collector in crates/obs/src/metrics/, expose via /minio/v2/metrics


Inspired by matklad's ARCHITECTURE.md and rust-analyzer's architecture.md.