Startup Timeline Baseline

docs/architecture/startup-timeline.md

This document records the current binary startup order before runtime/lifecycle migration work. It is a behavior-preservation baseline only; it does not define new startup semantics.

Scope

  • Baseline commit: ae9d25879d72bc8977f08e61062c022e2142483b
  • Entry points covered: rustfs/src/main.rs::main, async_main, run, and handle_shutdown
  • Related migration task: G-007
  • Out of scope for this baseline: embedded startup, admin route-action matrix, and any runtime/lifecycle code movement

Startup Stages

| Step | Source | Current action | Side effects | Fatal boundary | Ready stage |
| --- | --- | --- | --- | --- | --- |
| BOOT-001 | rustfs/src/main.rs:84 | Apply external-prefix environment compatibility before the Tokio runtime is created. | Copies supported external env aliases into canonical RUSTFS_* process env keys and prints warnings or info to stderr. | Non-fatal; failure is logged to stderr and startup continues. | None |
| BOOT-002 | rustfs/src/main.rs:89 | Build the Tokio runtime. | Installs runtime configuration and any runtime telemetry guard created by the runtime builder. | Fatal through expect; process exits if the runtime cannot be built. | None |
| BOOT-003 | rustfs/src/main.rs:139 | Parse CLI command and dispatch non-server commands. | info and tls commands execute and return without server startup. | Command parse exits process with code 1; TLS command errors propagate. | None |
| BOOT-004 | rustfs/src/main.rs:167 | Initialize config snapshot and license state. | Publishes config snapshot for later readers and initializes runtime license state. | License init is non-fallible in this path. | None |
| BOOT-005 | rustfs/src/main.rs:173 | Initialize observability and store the global guard. | Initializes tracing/observability, stores the guard globally, and logs license/runtime telemetry status. | Fatal if observability init or guard publication fails. | None |
| BOOT-006 | rustfs/src/main.rs:208 | Log startup logo, initialize profiling, trusted proxies, rustls provider, and outbound TLS material. | Starts optional profiling tasks, trusted proxy config, default rustls provider, outbound TLS global state, TLS generation metric, and TLS metrics when enabled. | Profiling/proxy/provider setup is non-fatal; configured TLS material load is fatal on error. | None |
| RUN-001 | rustfs/src/main.rs:256 | Enter run and create GlobalReadiness. | Allocates the readiness tracker shared with HTTP readiness gates. | Non-fatal. | Initial readiness state is not ready |
| RUN-002 | rustfs/src/main.rs:261 | Parse and publish the configured region. | Updates ECStore global region when configured. | Fatal if the configured region is invalid. | None |
| RUN-003 | rustfs/src/main.rs:268 | Resolve server address and warn on default credentials. | Computes server port/address and emits production credential warning when defaults are used. | Address parse is fatal; default credentials warning is non-fatal. | None |
| RUN-004 | rustfs/src/main.rs:286 | Initialize global action credentials. | Publishes root/action credentials used by auth paths. | Fatal if global credentials cannot be initialized. | None |
| RUN-005 | rustfs/src/main.rs:298 | Publish server port and address. | Updates global RustFS port and global address. | Non-fatal in this path. | None |
| RUN-006 | rustfs/src/main.rs:302 | Build endpoint pools and enforce unsupported filesystem policy. | Derives pool/set/disk layout from configured volumes and validates unsupported filesystem policy. | Fatal on endpoint build or unsupported filesystem policy error. | None |
| RUN-007 | rustfs/src/main.rs:308 | Publish endpoints and erasure type. | Updates global endpoints and erasure type. | Non-fatal in this path. | None |
| RUN-008 | rustfs/src/main.rs:311 | Initialize local disks, prewarm local disk id map, and initialize lock clients. | Opens local disk state, primes disk id lookup, and creates global lock clients. | Local disk init is fatal; prewarm and lock-client setup are non-fatal in this path. | None |
| RUN-009 | rustfs/src/main.rs:350 | Initialize capacity management and service state manager. | Starts capacity management and moves service state to Starting. | Non-fatal in this path. | None |
| RUN-010 | rustfs/src/main.rs:356 | Start S3 HTTP listener and optional console listener before storage is ready. | Starts HTTP servers with readiness gates; console listener starts only when enabled and configured. | Fatal if a configured listener cannot start. | Requests remain gated until full readiness except probe/admin/console/rpc/tonic/table-catalog exempt paths |
| RUN-011 | rustfs/src/main.rs:372 | Create cancellation token and initialize ECStore. | Creates the runtime cancellation token and storage engine. | Fatal if ECStore::new fails. | None |
| RUN-012 | rustfs/src/main.rs:382 | Initialize ECStore config and global config system. | Initializes ECStore config, attempts server-config migration, then retries global config init up to 15 times. | Migration attempt is non-fatal in this path; global config init becomes fatal after retries. | Marks the GlobalReadiness StorageReady stage after global config init succeeds; later runtime readiness still rechecks storage, IAM, and lock quorum before FullReady |
| RUN-013 | rustfs/src/main.rs:397 | Start replication and KMS systems. | Starts background replication pool and initializes KMS. | Replication init is non-fatal in this path; KMS init is fatal on error. | StorageReady stage is already marked; dynamic runtime storage readiness is still checked before FullReady |
| RUN-014 | rustfs/src/main.rs:402 | Initialize optional protocol servers. | Starts FTP/FTPS/WebDAV/SFTP when feature-enabled and configured, collecting shutdown handles. | Feature-enabled protocol init is fatal on error; disabled protocols are non-fatal. | None |
| RUN-015 | rustfs/src/main.rs:482 | Initialize buffer profiling, event notifier, audit, and deadlock detector. | Starts buffer profile system, event notifier, audit system, and optional deadlock detector. | Audit startup failure is logged and non-fatal; the others are non-fatal in this path. | None |
| RUN-016 | rustfs/src/main.rs:503 | List buckets and run bucket/replication/IAM metadata migrations. | Reads bucket names, migrates bucket metadata, initializes replication resync, migrates IAM config, and initializes bucket metadata system. | Bucket list and replication resync are fatal on error; metadata migration calls are non-fatal in this path. | Storage remains ready; IAM not yet ready |
| RUN-017 | rustfs/src/main.rs:523 | Bootstrap IAM inline or defer recovery. | Initializes IAM when possible; otherwise starts the deferred IAM recovery path through startup_iam. | Fatal only when bootstrap_or_defer_iam_init returns an unrecoverable error. | Inline success marks IamReady; deferred mode publishes IamReady later from the recovery task |
| RUN-018 | rustfs/src/main.rs:535 | Initialize Keystone and OIDC auth integrations. | Loads Keystone env config and initializes OIDC providers. | Keystone config parse is fatal; Keystone runtime init failure is non-fatal; OIDC init failure is non-fatal. | None |
| RUN-019 | rustfs/src/main.rs:552 | Add bucket notification config and initialize notification system. | Adds bucket notification configuration and publishes the global notification system. | Notification config add is non-fatal in this path; global notification init is fatal on error. | None |
| RUN-020 | rustfs/src/main.rs:560 | Create AHM cancellation token and initialize heal manager when scanner or heal is enabled. | Creates AHM cancellation token and starts heal manager for heal/scanner workflows. | Heal manager init is fatal when enabled. | None |
| RUN-021 | rustfs/src/main.rs:584 | Print server info, init update check, allocator reclaim, metrics, memory observability, and auto-tuner. | Starts informational/update/memory/metrics background tasks when enabled. | Non-fatal in this path. | None |
| RUN-022 | rustfs/src/main.rs:599 | Log successful startup and publish full readiness for inline IAM. | Logs version/address, checks runtime readiness, marks FullReady, and sets service state to Ready when IAM was ready inline. | Fatal if runtime readiness is not reached within the startup wait. | Marks FullReady only for inline IAM here |
| RUN-023 | rustfs/src/main.rs:609 | Publish global init time and start data scanner when enabled. | Sets global init time and starts scanner after the successful-startup log. | Scanner start is non-fatal in this path. | Full readiness may already be published or may await deferred IAM recovery |
| RUN-024 | rustfs/src/main.rs:616 | Wait for shutdown signal. | Blocks the main task until a shutdown signal is received. | Non-fatal. | Runtime remains in its current readiness state |
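
The "Fatal boundary" column distinguishes steps whose failure is logged and skipped from steps whose failure aborts startup, and RUN-012 adds a bounded retry before its failure becomes fatal. The following is a minimal sketch of that pattern, not the code in rustfs/src/main.rs. The names `Boundary`, `StartupError`, `run_step`, and `retry_bounded` are hypothetical and exist only for illustration; only the step ids and the 15-attempt budget come from the table.

```rust
use std::fmt;

/// How a step treats failure, mirroring the "Fatal boundary" column (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Boundary {
    Fatal,
    NonFatal,
}

/// Error carried out of a fatal step (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
struct StartupError {
    step: &'static str,
    message: String,
}

impl fmt::Display for StartupError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.step, self.message)
    }
}

/// Runs one step. A non-fatal failure is logged and swallowed;
/// a fatal failure is returned so the caller can abort startup.
fn run_step<F>(step: &'static str, boundary: Boundary, action: F) -> Result<(), StartupError>
where
    F: FnOnce() -> Result<(), String>,
{
    match action() {
        Ok(()) => Ok(()),
        Err(message) if boundary == Boundary::NonFatal => {
            eprintln!("{step}: non-fatal failure, continuing: {message}");
            Ok(())
        }
        Err(message) => Err(StartupError { step, message }),
    }
}

/// Calls `action` up to `max_attempts` times. Returns the attempt number that
/// succeeded, or the last error once the budget is exhausted.
fn retry_bounded<F>(max_attempts: u32, mut action: F) -> Result<u32, String>
where
    F: FnMut(u32) -> Result<(), String>,
{
    let mut last_err = String::from("no attempts made");
    for attempt in 1..=max_attempts {
        match action(attempt) {
            Ok(()) => return Ok(attempt),
            Err(e) => last_err = e,
        }
    }
    Err(last_err)
}

fn main() {
    // Non-fatal boundary: the failure is logged and the sequence continues.
    let audit = run_step("RUN-015", Boundary::NonFatal, || Err("audit sink unavailable".into()));
    println!("RUN-015 -> {audit:?}");

    // Fatal only after the retry budget (15 attempts, per RUN-012) is exhausted.
    let config = run_step("RUN-012", Boundary::Fatal, || {
        retry_bounded(15, |attempt| {
            if attempt < 3 { Err("config not readable yet".to_string()) } else { Ok(()) }
        })
        .map(|_| ())
    });
    println!("RUN-012 -> {config:?}");

    // Fatal boundary: the error reaches the caller, which would abort startup.
    if let Err(err) = run_step("RUN-011", Boundary::Fatal, || Err("ECStore::new failed".into())) {
        println!("startup aborted: {err}");
    }
}
```

Keeping the boundary as data next to the step id makes it easier to check that a moved step still fails the same way it did at the baseline commit.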

Deferred IAM Readiness

| Step | Source | Current action | Side effects | Fatal boundary | Ready stage |
| --- | --- | --- | --- | --- | --- |
| IAM-001 | rustfs/src/startup_iam.rs:256 | Attempt init_iam_sys during bootstrap. | Initializes IAM against the ECStore object layer when possible. | Recoverable failures can enter deferred mode; unrecoverable errors propagate from bootstrap. | None |
| IAM-002 | rustfs/src/startup_iam.rs:73 | Spawn IAM recovery loop when bootstrap is deferred. | Retries IAM initialization with exponential backoff until shutdown or success. | Retry failures are logged; the service remains degraded. | None |
| IAM-003 | rustfs/src/startup_iam.rs:52 | Finalize IAM recovery after init succeeds. | Initializes AppContext if needed, marks IamReady, and calls runtime readiness publication. | Finalize failures are retried by the recovery loop. | Marks IamReady, then FullReady when runtime readiness succeeds |
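
IAM-002 describes a loop that retries with exponential backoff until it succeeds or shutdown is requested. The sketch below shows one way such a loop can be shaped using only the standard library. It is not the code in startup_iam.rs: the names `backoff_delay`, `recover_with_backoff`, and `RecoveryOutcome` are hypothetical, and the base delay and cap are assumed values, since this document does not state the real ones. The sleep function is injected so the example runs instantly.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}

/// Result of the recovery loop (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum RecoveryOutcome {
    Recovered { attempts: u32 },
    ShutDown,
}

/// Retries `init` until it succeeds or `shutdown` is set.
/// Failures are logged and followed by a growing delay.
fn recover_with_backoff<I, S>(
    shutdown: &AtomicBool,
    base: Duration,
    max: Duration,
    mut init: I,
    mut sleep: S,
) -> RecoveryOutcome
where
    I: FnMut() -> Result<(), String>,
    S: FnMut(Duration),
{
    let mut attempts = 0u32;
    loop {
        if shutdown.load(Ordering::SeqCst) {
            return RecoveryOutcome::ShutDown;
        }
        attempts = attempts.saturating_add(1);
        match init() {
            Ok(()) => return RecoveryOutcome::Recovered { attempts },
            Err(e) => {
                eprintln!("IAM init attempt {attempts} failed: {e}");
                sleep(backoff_delay(attempts - 1, base, max));
            }
        }
    }
}

fn main() {
    let shutdown = AtomicBool::new(false);
    let mut calls = 0;
    let mut sleeps = Vec::new();

    let outcome = recover_with_backoff(
        &shutdown,
        Duration::from_millis(100), // assumed base delay
        Duration::from_secs(5),     // assumed cap
        || {
            calls += 1;
            if calls < 3 { Err("object layer not ready".to_string()) } else { Ok(()) }
        },
        |d| sleeps.push(d), // record the delay instead of actually sleeping
    );

    println!("{outcome:?}, delays requested: {sleeps:?}");
}
```

In a real async runtime the injected `sleep` would be replaced by an awaitable timer raced against the cancellation token, so that shutdown interrupts a pending delay instead of waiting for the next loop iteration.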

Readiness Gate

| Step | Source | Current action | Side effects | Fatal boundary | Ready stage |
| --- | --- | --- | --- | --- | --- |
| READY-001 | rustfs/src/server/readiness.rs:130 | Treat exact probe paths and admin/console/rpc/tonic/table-catalog prefixes as readiness-gate bypass paths. | Bypass paths continue to the inner service while the global readiness gate is not ready. | Non-fatal. | Does not change readiness stages |
| READY-002 | rustfs/src/server/readiness.rs:171 | Reject non-probe requests while GlobalReadiness is not ready. | Returns 503 Service Unavailable, Retry-After: 5, Content-Type: text/plain; charset=utf-8, and Cache-Control: no-store. | Non-fatal. | Does not change readiness stages |
| READY-003 | rustfs/src/server/readiness.rs:202 | Wait for runtime storage, IAM, and lock quorum readiness before publishing ready state. | Marks FullReady and updates ServiceState to Ready only when a state manager is provided. | Returns an error on timeout; inline startup treats that as fatal, while deferred IAM recovery retries finalization. | Marks FullReady |
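
The gate behavior in READY-001 and READY-002 can be pictured as a small decision function over a staged readiness tracker. This is a sketch under assumptions, not the implementation in readiness.rs: the `Readiness` type, the bit layout, and every path literal below are hypothetical placeholders, because this document does not list the actual probe paths or prefixes. The status code and the three response headers are the ones named in READY-002.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// Stage bits (hypothetical layout) for the three stages named in this document.
const STORAGE_READY: u8 = 0b001;
const IAM_READY: u8 = 0b010;
const FULL_READY: u8 = 0b100;

/// Minimal stand-in for a shared readiness tracker.
#[derive(Default)]
struct Readiness(AtomicU8);

impl Readiness {
    fn mark(&self, stage: u8) {
        self.0.fetch_or(stage, Ordering::SeqCst);
    }

    /// The gate opens only once the FullReady stage has been marked.
    fn is_ready(&self) -> bool {
        (self.0.load(Ordering::SeqCst) & FULL_READY) != 0
    }
}

// Placeholder paths: the real probe paths and prefixes are not given here.
const PROBE_PATHS: &[&str] = &["/health", "/health/ready"];
const BYPASS_PREFIXES: &[&str] = &["/admin/", "/console/", "/rpc/", "/tonic/", "/table-catalog/"];

#[derive(Debug, PartialEq, Eq)]
enum GateDecision {
    Forward,
    Reject { status: u16, headers: [(&'static str, &'static str); 3] },
}

/// Exact probe paths and bypass prefixes always reach the inner service;
/// everything else is rejected until the tracker reports ready.
fn gate(readiness: &Readiness, path: &str) -> GateDecision {
    let bypass = PROBE_PATHS.iter().any(|p| *p == path)
        || BYPASS_PREFIXES.iter().any(|p| path.starts_with(*p));
    if bypass || readiness.is_ready() {
        GateDecision::Forward
    } else {
        GateDecision::Reject {
            status: 503,
            headers: [
                ("Retry-After", "5"),
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Cache-Control", "no-store"),
            ],
        }
    }
}

fn main() {
    let readiness = Readiness::default();
    println!("{:?}", gate(&readiness, "/health"));
    println!("{:?}", gate(&readiness, "/bucket/object"));

    // StorageReady and IamReady alone do not open the gate.
    readiness.mark(STORAGE_READY);
    readiness.mark(IAM_READY);
    println!("{:?}", gate(&readiness, "/bucket/object"));

    readiness.mark(FULL_READY);
    println!("{:?}", gate(&readiness, "/bucket/object"));
}
```

The sketch leaves out the READY-003 wait, which rechecks storage, IAM, and lock quorum before FullReady is marked.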

Shutdown Order

| Step | Source | Current action | Side effects | Fatal boundary | Ready stage |
| --- | --- | --- | --- | --- | --- |
| STOP-001 | rustfs/src/main.rs:647 | Cancel runtime token and move service state to Stopping. | Notifies cancellation-aware background tasks. | Non-fatal. | Service state moves to Stopping; readiness stages are not cleared here |
| STOP-002 | rustfs/src/main.rs:675 | Stop scanner/background services and AHM services according to enable flags. | Calls ECStore background shutdown and heal/scanner shutdown helpers. | Non-fatal in this path. | No readiness-stage change |
| STOP-003 | rustfs/src/main.rs:699 | Signal optional FTP/FTPS/WebDAV/SFTP protocol servers. | Collects protocol shutdown futures. | Non-fatal in this path. | No readiness-stage change |
| STOP-004 | rustfs/src/main.rs:735 | Stop event notifier, audit system, and profiling tasks. | Stops notifier and profiling tasks; audit stop failures are logged. | Non-fatal in this path. | No readiness-stage change |
| STOP-005 | rustfs/src/main.rs:763 | Stop S3 and console HTTP servers, wait for protocol shutdowns, then mark service state Stopped. | HTTP shutdown happens after notifier/audit/profiling shutdown in current order. | Join failures are logged by shutdown handles; this path does not return errors. | Service state moves to Stopped; readiness stages are not cleared here |
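
The shutdown rows share two properties: the steps run in a fixed order, and no step failure stops the sequence or is returned to the caller. A rough sketch of that shape follows. `ServiceState` here is a local stand-in and `shutdown_in_order` is a hypothetical helper, not the real `handle_shutdown`; only the step ids and state names are taken from this document. Readiness stages are not modelled, in line with the table's note that they are not cleared during shutdown.

```rust
/// Local stand-in for the service states named in this document.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ServiceState {
    Starting,
    Ready,
    Stopping,
    Stopped,
}

/// Step ids in the order listed in the shutdown table.
const STOP_ORDER: [&str; 5] = ["STOP-001", "STOP-002", "STOP-003", "STOP-004", "STOP-005"];

/// Runs every shutdown step in order. A failing step is logged and recorded,
/// but the sequence always continues and always ends in `Stopped`.
fn shutdown_in_order<F>(state: &mut ServiceState, mut run_step: F) -> Vec<&'static str>
where
    F: FnMut(&'static str) -> Result<(), String>,
{
    *state = ServiceState::Stopping;
    let mut failed = Vec::new();
    for &id in STOP_ORDER.iter() {
        if let Err(e) = run_step(id) {
            eprintln!("{id}: shutdown step failed, continuing: {e}");
            failed.push(id);
        }
    }
    *state = ServiceState::Stopped;
    failed
}

fn main() {
    let mut state = ServiceState::Ready;
    let mut seen = Vec::new();

    let failed = shutdown_in_order(&mut state, |id| {
        seen.push(id);
        // Simulate a logged audit stop failure in STOP-004.
        if id == "STOP-004" { Err("audit stop failed".to_string()) } else { Ok(()) }
    });

    println!("order: {seen:?}");
    println!("failed: {failed:?}, final state: {state:?}");
}
```

A test that asserts on the recorded order is one way to catch an accidental reordering, such as stopping the HTTP servers before the notifier and audit systems.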

Migration Rules

  • Runtime/lifecycle PRs must map each moved startup line back to one of the BOOT-*, RUN-*, IAM-*, READY-*, or STOP-* rows.
  • A pure-move PR must keep the fatal boundary and ready-stage column unchanged.
  • Any intentional change to this table is a separate behavior-change PR with focused negative tests.
  • Do not use this document to justify changing readiness, IAM recovery, HTTP listener timing, lock quorum, or shutdown order in a docs-only PR.
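
The pure-move rule can be checked mechanically if the tables are available as data. The sketch below compares a baseline table with a post-move table and reports every step whose fatal boundary or ready stage changed, or that disappeared, while ignoring the Source column. It assumes a hypothetical tool: no such checker is described in this document, the names `Row`, `Table`, `row`, and `pure_move_violations` are invented, and the moved source path in the example is made up.

```rust
use std::collections::BTreeMap;

/// The columns a pure-move PR may or may not change (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Row {
    source: String,         // may change in a pure move
    fatal_boundary: String, // must stay unchanged
    ready_stage: String,    // must stay unchanged
}

/// Step id (for example "RUN-011") mapped to its row.
type Table = BTreeMap<String, Row>;

fn row(source: &str, fatal_boundary: &str, ready_stage: &str) -> Row {
    Row {
        source: source.to_string(),
        fatal_boundary: fatal_boundary.to_string(),
        ready_stage: ready_stage.to_string(),
    }
}

/// Returns the ids of baseline steps that are missing from `after` or whose
/// fatal boundary or ready stage differs. Rows added in `after` are not checked.
fn pure_move_violations(before: &Table, after: &Table) -> Vec<String> {
    before
        .iter()
        .filter(|(id, old)| match after.get(*id) {
            Some(new) => {
                new.fatal_boundary != old.fatal_boundary || new.ready_stage != old.ready_stage
            }
            None => true,
        })
        .map(|(id, _)| id.clone())
        .collect()
}

fn main() {
    let mut before = Table::new();
    before.insert(
        "RUN-005".to_string(),
        row("rustfs/src/main.rs:298", "Non-fatal in this path.", "None"),
    );
    before.insert(
        "RUN-011".to_string(),
        row("rustfs/src/main.rs:372", "Fatal if ECStore::new fails.", "None"),
    );

    // A pure move: only the source location changes (the new path is made up).
    let mut after = before.clone();
    after.get_mut("RUN-011").unwrap().source = "rustfs/src/startup/storage.rs:10".to_string();
    println!("pure move: {:?}", pure_move_violations(&before, &after));

    // A behavior change hidden in a move: the fatal boundary differs.
    after.get_mut("RUN-005").unwrap().fatal_boundary = "Fatal on error.".to_string();
    println!("behavior change: {:?}", pure_move_violations(&before, &after));
}
```

A non-empty result would indicate that the change belongs in a separate behavior-change PR rather than a pure-move PR.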