crates/turborepo/ARCHITECTURE.md
This document serves as a sketch of the architecture of the `turbo run` command.
A run consists of the following steps:
- `crates/turborepo/src/main.rs` - Constructs `TurboQueryServer` (the concrete `QueryServer` implementation) and passes it to `turborepo_lib::main`
- `crates/turborepo-lib/src/commands/run.rs` - Entry point for the run command, sets up signal handling and UI
- `crates/turborepo-lib/src/run/mod.rs` - Core run implementation

## Shutdown and Signal Handling

Graceful shutdown and parent-death cleanup are separate responsibilities. Graceful shutdown happens while the Turbo process is still alive, so it should be handled internally by the run and process manager. Parent-death cleanup only applies when Turbo disappears before Rust cleanup code can run.
- `crates/turborepo-lib/src/commands/run.rs` creates a shared `SignalHandler` and does not return until all shutdown subscribers finish their cleanup work (see the sketch after this list).
- The signal handler distinguishes signal-driven shutdown (`ShutdownReason::Signal`) from close-driven shutdown (`ShutdownReason::Close`). Normal command completion uses the close path to drain subscribers without printing signal-specific shutdown UX.
- `crates/turborepo-lib/src/run/mod.rs` registers shutdown subscribers for task processes, cache writes, and the microfrontends proxy.
- On SIGINT/SIGTERM, Turbo enters graceful shutdown: it prints a shutdown message, forwards SIGINT to running tasks, and waits for their process groups to exit.
- When stdin is a terminal, the shutdown message tells the user to press Ctrl+C to force shut down. Without a terminal on stdin, Turbo instead prints the remaining time before the automatic force shutdown.
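A minimal sketch of the subscriber pattern described above, using tokio channels; the struct shapes and method names are illustrative, not the actual `SignalHandler` API in `turborepo-lib`:

```rust
// Sketch only: a shutdown coordinator that broadcasts a reason to
// subscribers and waits until every subscriber has finished cleanup
// (signaled by dropping its handle).
use tokio::sync::{broadcast, mpsc};

#[derive(Clone, Copy, Debug)]
enum ShutdownReason {
    Signal, // SIGINT/SIGTERM received
    Close,  // normal command completion
}

struct SignalHandler {
    reason_tx: broadcast::Sender<ShutdownReason>,
    done_tx: mpsc::Sender<()>,
    done_rx: mpsc::Receiver<()>,
}

struct Subscriber {
    reason_rx: broadcast::Receiver<ShutdownReason>,
    _done: mpsc::Sender<()>, // held until this subscriber's cleanup completes
}

impl SignalHandler {
    fn new() -> Self {
        let (reason_tx, _) = broadcast::channel(1);
        let (done_tx, done_rx) = mpsc::channel(1);
        Self { reason_tx, done_tx, done_rx }
    }

    fn subscribe(&self) -> Subscriber {
        Subscriber {
            reason_rx: self.reason_tx.subscribe(),
            _done: self.done_tx.clone(),
        }
    }

    /// Broadcast the shutdown reason, then wait for every subscriber to drop.
    async fn shutdown(mut self, reason: ShutdownReason) {
        let _ = self.reason_tx.send(reason);
        drop(self.done_tx); // stop counting the handler itself
        while self.done_rx.recv().await.is_some() {}
    }
}

impl Subscriber {
    /// Resolves when shutdown starts; the subscriber then performs cleanup
    /// and drops itself, which lets `SignalHandler::shutdown` return.
    async fn wait_for_shutdown(&mut self) -> Option<ShutdownReason> {
        self.reason_rx.recv().await.ok()
    }
}
```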
Parent-death cleanup is not part of normal graceful shutdown. An in-process map cannot help after SIGKILL, a crash, or OOM because the map dies with Turbo. Turbo should not start a per-task Unix watchdog for this case. If abnormal cleanup is required later, prefer a bounded run-level mechanism:

- A single mechanism owned by the `ProcessManager` and shared by all tasks in the run.
- Tasks register their identity (pid, pgid, and session identity) when spawned and unregister on normal exit or Turbo-managed shutdown.
- `prctl(PR_SET_PDEATHSIG)` can serve as a best-effort no-helper option (sketched below), but it only signals the direct child and cannot provide delayed escalation.
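As a concrete illustration of the `prctl(PR_SET_PDEATHSIG)` option and its limits, here is a minimal Linux-only sketch using the `libc` crate; this is not code from the Turbo codebase:

```rust
use std::os::unix::process::CommandExt;
use std::process::{Child, Command};

/// Spawn a child that asks the kernel to send it SIGTERM if its parent dies.
/// Note the limitation called out above: only this direct child is signaled,
/// and there is no delayed escalation to SIGKILL.
fn spawn_with_pdeathsig(program: &str) -> std::io::Result<Child> {
    let mut cmd = Command::new(program);
    unsafe {
        // pre_exec runs in the forked child just before exec.
        cmd.pre_exec(|| {
            // SAFETY: prctl with PR_SET_PDEATHSIG is safe to call here.
            let rc = unsafe {
                libc::prctl(libc::PR_SET_PDEATHSIG, libc::SIGTERM as libc::c_ulong)
            };
            if rc != 0 {
                return Err(std::io::Error::last_os_error());
            }
            Ok(())
        });
    }
    cmd.spawn()
}
```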
Regression coverage for shutdown changes should focus on observable lifecycle behavior: `turbo run` signal tests should assert that descendants are not leaked after force shutdown.

## Run Builder (`crates/turborepo-lib/src/run/builder.rs`)

Key responsibilities:
- Package filtering (`--filter`)
- Root task injection via `FilterMode` (from `turborepo-types`): when no filter or only exclude filters are active, root tasks defined in `turbo.json` are auto-included. Explicit include filters or `--affected` suppress root task injection. See `calculate_filtered_packages` and `FilterMode`.
- Producing a `Run` struct ready for execution

When the `affectedUsingTaskInputs` future flag is enabled and `--affected` is
active, the run builder applies a second filtering pass after engine
construction:
- Task input matching (`turborepo-types/src/task_input_matching.rs`): each task's `inputs` globs are compiled and checked against the changed files (sketched below). Shared with `turbo query { affectedTasks }`.
- Task change detection (`turborepo-lib/src/task_change_detector.rs`): determines directly affected tasks, handling global deps and per-task inputs.
- Engine pruning (`Engine::retain_affected_tasks`): returns a new engine containing directly affected tasks, their transitive dependents, and all transitive dependencies required for execution (upstream tasks needed as cache hits).

This differs from the default `--affected` behavior, which operates at the package level (all tasks in changed packages run).
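To make the input-matching step concrete, here is a small illustration using the `globset` crate as a stand-in; Turborepo's actual glob engine, negation handling, and APIs differ:

```rust
use globset::{Glob, GlobSetBuilder};

/// Hypothetical helper: returns true if any changed file matches one of the
/// task's `inputs` globs.
fn task_is_affected(inputs: &[&str], changed_files: &[&str]) -> Result<bool, globset::Error> {
    let mut builder = GlobSetBuilder::new();
    for glob in inputs {
        builder.add(Glob::new(glob)?);
    }
    let set = builder.build()?;
    Ok(changed_files.iter().any(|file| set.is_match(file)))
}

fn main() -> Result<(), globset::Error> {
    let inputs = ["src/**/*.ts", "package.json"];
    let changed = ["src/index.ts"];
    // The task is affected because a changed file matches its inputs.
    assert!(task_is_affected(&inputs, &changed)?);
    Ok(())
}
```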
## Package Graph (`crates/turborepo-repository/src/package_graph/`)

Represents the workspace structure and package dependencies. Package names are validated: every package must have a `name` field (see `PackageGraph::validate()`).

The package graph intentionally allows cyclic dependencies between packages; this aligns with how npm, pnpm, and yarn handle cyclic workspace deps. Cycle detection is deferred to the task graph layer (engine builder), since package-level cycles only matter when they produce task-level cycles via topological (`^`) dependencies.
## Task Graph (`crates/turborepo-lib/src/engine/`)

The task graph is a graph of all tasks that will be part of the run and related configuration. For purely historical reasons, it is referred to as the "engine" throughout the codebase.

The core task graph consists of:

- `crates/turborepo-lib/src/engine/builder.rs`: reads `turbo.json` and other configuration sources to determine task definitions and resolve task dependencies (topological `^build` and direct `build`)
- `crates/turborepo-lib/src/engine/execute.rs`: executes the built task graph

Task Graph Structure:
- Nodes are identified by `TaskId` (`package#task`) or root

`crates/turborepo-engine/src/lib.rs` provides subgraph operations on the engine:

- `retain_affected_tasks` keeps directly affected tasks, transitive dependents, and all transitive dependencies required for normal `--affected` execution
- `create_engine_for_subgraph` is used by watch mode. It keeps changed package tasks, transitive dependents, and only cacheable upstream dependencies that can restore outputs without forcing non-cacheable tasks to rerun

## Visitor (`crates/turborepo-lib/src/task_graph/visitor/`)

The task graph visitor handles task execution:
- `visit` (`crates/turborepo-lib/src/task_graph/visitor/mod.rs`): creates an `ExecContext` for each task
- `ExecContext` (`crates/turborepo-lib/src/task_graph/visitor/exec.rs`): holds the state required to execute a task, runs the task process via `turborepo_process`, and handles its stdout/stderr output

Execution Flow: see the `Visitor.visit()` diagram below.
## Caching (`crates/turborepo-lib/src/run/cache.rs` and `crates/turborepo-cache/`)

Multi-layered caching system:

- `RunCache`: High-level cache coordination
- `TaskCache`: Individual task cache management
- `AsyncCache`: Handles async cache operations. Supports both local filesystem and remote HTTP caches
- `SharedHttpClient`: Process-wide lazy/activatable `reqwest::Client` initialization shared by telemetry and remote-cache consumers

Network consumers do not construct an HTTP client speculatively at process startup. Instead, consumers that know they will hit the network activate the shared, lazily initialized `reqwest::Client`. This avoids paying client/TLS setup on invocations with no network use while still warming the client before the first network request in the common case.
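A minimal sketch of the lazy, process-wide client idea using `std::sync::OnceLock`; the function name and setup shown here are assumptions, not the actual `SharedHttpClient` API:

```rust
use std::sync::OnceLock;

// One client per process; building it lazily avoids TLS/connection-pool
// setup on runs that never touch the network.
static CLIENT: OnceLock<reqwest::Client> = OnceLock::new();

/// Hypothetical accessor: the first network consumer to call this pays the
/// construction cost; later callers reuse the same pooled client.
fn shared_client() -> &'static reqwest::Client {
    CLIENT.get_or_init(|| {
        reqwest::Client::builder()
            .user_agent("turbo")
            .build()
            .expect("failed to build HTTP client")
    })
}
```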
`turbo run` builds SCM state in two stages:

- An eager stage reads `.git/index` and records committed blob IDs plus modified/deleted tracked files for the whole repo.
- Untracked files are discovered lazily, scoped to the directory prefixes of the packages that will actually be hashed.

Those prefixes are relative to the repo index root, which is usually the Git root. This matters when the Turbo root is nested inside a larger Git repository: the root package should scope to the nested Turbo directory, not request an untracked walk of the entire parent repository.

This keeps the cheap tracked-index work overlapped with other startup work while avoiding a repo-wide untracked walk when only a subset of packages will be hashed.
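Illustrative only: the real `turborepo-scm` code reads the index directly rather than shelling out, but these git invocations show the shape of the two stages:

```rust
use std::process::Command;

/// Stage 1: tracked state for the whole repo (blob OIDs for every tracked file).
fn tracked_state(repo_root: &str) -> std::io::Result<String> {
    let output = Command::new("git")
        .args(["ls-files", "-s"]) // mode, blob OID, stage, and path
        .current_dir(repo_root)
        .output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}

/// Stage 2: untracked files, scoped to a single package prefix instead of
/// walking the whole repository.
fn untracked_under(repo_root: &str, prefix: &str) -> std::io::Result<String> {
    let output = Command::new("git")
        .args(["ls-files", "--others", "--exclude-standard", "--", prefix])
        .current_dir(repo_root)
        .output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```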
When running in a Git linked worktree (created via `git worktree add`), Turborepo automatically shares the local file system cache with the main worktree, so cache artifacts produced in one worktree can be reused in another.

How it works:

- `WorktreeInfo::detect()` in `turborepo-scm` determines if the current directory is a linked worktree using Git commands (`git rev-parse --show-toplevel` and `git rev-parse --git-common-dir`)
- `ConfigurationOptions::resolve_cache_dir()` returns the main worktree's `.turbo/cache` directory instead of the local one

Configuration:

- Setting an explicit `cacheDir` in `turbo.json` disables worktree cache sharing

Cache writes use an atomic write pattern (write-to-temp-then-rename) for concurrent safety (see the sketch below):

- Data is first written to a temporary file (`.{filename}.{pid}.{counter}.tmp`) and then renamed into place
- `CacheWriter` implements `Drop` to clean up temp files if `finish()` is not called (e.g., on error or panic)

This ensures concurrent readers never see partially written cache files.
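A sketch of the write-to-temp-then-rename pattern with `Drop`-based cleanup, using hypothetical names; the real `CacheWriter` also streams tar data and has a different API:

```rust
use std::fs;
use std::io::Write;
use std::path::{Path, PathBuf};
use std::process;

// Minimal sketch of an atomic file writer.
struct AtomicFile {
    temp_path: PathBuf,
    final_path: PathBuf,
    file: Option<fs::File>,
}

impl AtomicFile {
    fn create(final_path: &Path, counter: u64) -> std::io::Result<Self> {
        let name = final_path.file_name().unwrap().to_string_lossy();
        // Temp name mirrors the `.{filename}.{pid}.{counter}.tmp` convention.
        let temp_path =
            final_path.with_file_name(format!(".{name}.{}.{counter}.tmp", process::id()));
        let file = fs::File::create(&temp_path)?;
        Ok(Self { temp_path, final_path: final_path.to_path_buf(), file: Some(file) })
    }

    fn write_all(&mut self, bytes: &[u8]) -> std::io::Result<()> {
        self.file.as_mut().unwrap().write_all(bytes)
    }

    /// Atomically publish the file: readers see either the old file or the
    /// complete new one, never a partial write.
    fn finish(mut self) -> std::io::Result<()> {
        self.file.take(); // close before rename
        fs::rename(&self.temp_path, &self.final_path)
    }
}

impl Drop for AtomicFile {
    fn drop(&mut self) {
        // If finish() was never called (error or panic), remove the temp file.
        if self.file.is_some() {
            let _ = fs::remove_file(&self.temp_path);
        }
    }
}
```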
## Task Hashing (`crates/turborepo-lib/src/task_hash/`)

Creates a "content identifier" for a specific task based on the current state of its inputs:
- When a task defines `inputs`, glob matches still walk the filesystem, but clean tracked matches reuse blob OIDs from the repo index instead of re-hashing file contents.
- When `.gitattributes` marks files as `text` or `text=auto`, git normalizes CRLF line endings to LF in blob objects. The `crlf` module in `turborepo-scm` replicates this so turbo's file hashes match git's regardless of the code path (git or manual/no-git after `turbo prune`); a simplified sketch follows this list. `.gitattributes` is included in the global hash inputs and preserved by `turbo prune`. Known limitations: only root-level `.gitattributes` is loaded; `eol=` is not handled.
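A minimal sketch of the normalization idea; the real `crlf` module is driven by `.gitattributes` and handles more cases:

```rust
/// For files git treats as text, convert CRLF to LF before hashing so the
/// digest matches the blob git would store. Lone '\r' bytes are kept, which
/// mirrors git's behavior of only converting CRLF pairs.
fn normalize_crlf(contents: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(contents.len());
    let mut i = 0;
    while i < contents.len() {
        if contents[i] == b'\r' && contents.get(i + 1) == Some(&b'\n') {
            // Drop the '\r'; the following '\n' is pushed on the next pass.
            i += 1;
            continue;
        }
        out.push(contents[i]);
        i += 1;
    }
    out
}
```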
### `globalConfiguration` and `global.inputs`

When the globalConfiguration future flag is enabled, global.inputs (formerly
globalDependencies) files are not included in the global hash. Instead,
they are prepended as implicit input globs to every task's TaskInputs during
engine construction (see prepend_global_inputs in
crates/turborepo-engine/src/task_definition.rs).
This means:
- The global hash no longer includes global.inputs file hashes
- Tasks can negate individual global inputs in their own inputs list (e.g. `"inputs": ["$TURBO_DEFAULT$", "!$TURBO_ROOT$/tsconfig.json"]`)
- Tasks without an explicit inputs key get `default: true` set so package files are still hashed alongside the global inputs

Task hashing uses capnp to serialize in-memory structs for hashing.

## Run Summary (`crates/turborepo-lib/src/run/summary/`)

The summary module is responsible for any type of summary:

- `--summarize`
- `--dry=json`

Key pieces:

- `crates/turborepo-lib/src/run/summary/mod.rs`: run-level summary state, threaded through `Visitor::visit`
- `crates/turborepo-lib/src/run/summary/execution.rs`: per-task execution tracking that feeds `--dry=json`/`--summarize`

## Query Subsystem

The query subsystem powers `turbo query` (GraphQL introspection of the
package/task graph) and the Web UI mode (--ui=web).
Crate layout:
- `turborepo-query-api` - Trait definitions (`QueryServer`, `QueryRun`) and shared error/result types. `turborepo-lib` depends on this thin interface crate instead of the heavy implementation.
- `turborepo-query` - GraphQL implementation using `async-graphql`, `axum`, and `oxc`. Implements the resolvers and HTTP server.
- `turborepo/src/main.rs` - Wires the two halves together via `TurboQueryServer`, which implements `QueryServer` by delegating to `turborepo-query`.

Data flow: `main()` constructs `Arc<TurboQueryServer>` → passes to
turborepo_lib::main → threaded through shim → cli::run →
commands::run → RunBuilder → Run. The Run struct stores the
query_server and uses it in start_web_ui() and the turbo query
command handler.
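A schematic sketch of this thin-interface pattern; the trait methods and type contents are illustrative, not the real `QueryServer`/`TurboQueryServer` definitions:

```rust
use std::sync::Arc;

// turborepo-query-api (interface crate): trait only, no heavy dependencies.
pub trait QueryServer: Send + Sync {
    fn serve(&self, port: u16) -> Result<(), String>;
}

// turborepo-query (implementation crate): in the real code this is where
// async-graphql, axum, and oxc live. Stubbed here.
pub struct GraphqlServer;

impl GraphqlServer {
    pub fn run(&self, port: u16) -> Result<(), String> {
        // ... start the GraphQL HTTP server here ...
        let _ = port;
        Ok(())
    }
}

// crates/turborepo/src/main.rs: the concrete adapter that turborepo-lib
// receives as `Arc<dyn QueryServer>` without depending on the heavy crate.
pub struct TurboQueryServer {
    inner: GraphqlServer,
}

impl QueryServer for TurboQueryServer {
    fn serve(&self, port: u16) -> Result<(), String> {
        self.inner.run(port)
    }
}

fn main() {
    let server: Arc<dyn QueryServer> = Arc::new(TurboQueryServer { inner: GraphqlServer });
    // In the real binary this is passed into turborepo_lib::main(...).
    let _ = server;
}
```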
RunBuilder
├── Package Discovery → PackageGraph (validates package names)
├── Task Discovery → EngineBuilder
├── Task Graph Construction → Engine (built)
└── Task Graph Validation (cycles, missing deps) → Ready Engine
Process:
- Task dependencies are derived from `dependsOn` configurations

Engine.execute()
├── Walker (topological order)
├── Semaphore (concurrency control)
├── Engine -[Task to Run]→ Visitor
└── Engine ←[Task Result]- Visitor
Process:
- Walker traverses the graph in topological order
- Ready tasks are handed to the Visitor
- Visitor executes the task and reports back to the Engine

Visitor.visit()
├── Calculate Hash
├── Check Cache → Cache Hit? → Restore & Done
├── Execute Task → Create ExecContext and `exec_context.exec()`
├── Save to Cache
└── Track Results
Process:
TaskCache.restore_outputs()
├── Check caching disabled?
├── Local Cache → exists?
├── Remote Cache → exists?
├── Fetch & Extract
└── Return metadata
TaskCache.save_outputs()
├── Collect output files
├── Compress to tar
├── Save to Local Cache
└── Upload to Remote Cache (async)
## Incremental Cache (`crates/turborepo-run-cache/src/incremental.rs`)

Handles tool-managed incremental artifacts (e.g., .tsbuildinfo) that persist
across runs via remote cache, speeding up cache misses by restoring prior
incremental state before execution.
- Gated behind the `incrementalTasks` future flag
- Blocking work runs on `spawn_blocking` threads
- See SPEC.md for the full specification

On Cache Miss:
Visitor.visit()
├── Calculate Hash → Cache Miss
├── Fetch Incremental Artifacts (sequential per-partition, must complete before exec)
├── Execute Task
├── Save to Cache
├── Upload Incremental Artifacts (concurrent per-partition, parallel with cache save)
└── Track Results
RunTracker
├── Task Events → ExecutionTracker
├── State Aggregation → SummaryState
├── Summary Generation → RunSummary
└── Output (JSON/Console)
Process:
- ExecutionTracker aggregates state across all tasks
- The summary is written to `.turbo/runs/` and optionally printed

## Observability (`crates/turborepo-run-summary/src/observability/` and `crates/turborepo-otel/`)

The observability subsystem enables exporting run metrics to external backends via OpenTelemetry.
The system uses a two-layer design:
- `turborepo-otel`: Low-level OTLP exporter crate
- `turborepo-run-summary/observability`: Integration layer
  - Defines the `RunObserver` trait for pluggable backends
  - Converts `RunSummary` data into metrics payloads
  - Gated behind the `otel` feature flag

Key types:

- `observability::Handle`: Main entry point; wraps backend-specific implementations
- `RunObserver` trait: Abstraction allowing future backends (Prometheus, etc.)
- `OtelObserver`: OpenTelemetry implementation of `RunObserver`

Observability is configured via `experimentalObservability.otel` in `turbo.json`:
{
"futureFlags": {
"experimentalObservability": true
},
"experimentalObservability": {
"otel": {
"enabled": true,
"protocol": "http/protobuf",
"endpoint": "https://otel-collector.example.com:4318/v1/metrics",
"resource": {
"service.name": "turborepo"
},
"metrics": {
"runSummary": true,
"taskDetails": true,
"runAttributes": {
"id": false, // turbo.run.id — unbounded cardinality
"scmRevision": false // turbo.scm.revision — unbounded cardinality
},
"taskAttributes": {
"id": false, // turbo.task.id
"hashes": false // turbo.task.hash, turbo.task.external_inputs_hash — unbounded
}
}
}
}
}
Configuration can also be set via environment variables (TURBO_EXPERIMENTAL_OTEL_*) or CLI flags (--experimental-otel-*).
Metrics exported:

- `turbo.run.duration_ms` - Run duration histogram
- `turbo.run.tasks.attempted` - Tasks attempted counter
- `turbo.run.tasks.failed` - Tasks failed counter
- `turbo.run.tasks.cached` - Cache hit counter
- `turbo.task.duration_ms` - Per-task duration histogram (when `taskDetails` enabled)
- `turbo.task.cache.events` - Per-task cache events (when `taskDetails` enabled)

Attributes with unbounded cardinality (unique run IDs, Git SHAs, content hashes) are gated behind `runAttributes` and `taskAttributes` config flags, all defaulting to `false`. See the Metric Attributes and Cardinality section in `crates/turborepo-otel/src/lib.rs` for the full attribute inventory.
RunSummary.finish()
├── observability::Handle.record(&summary)
│ ├── Convert to RunMetricsPayload
│ └── Record via OpenTelemetry instruments
└── observability::Handle.shutdown()
└── Flush pending metrics to backend
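A hedged sketch of the pluggable-backend shape this data flow implies; the actual `RunObserver` trait and payload types in `turborepo-run-summary` differ:

```rust
// Illustrative only: trait methods and struct fields here are assumptions.
struct RunMetricsPayload {
    duration_ms: u64,
    tasks_attempted: u64,
    tasks_failed: u64,
    tasks_cached: u64,
}

trait RunObserver {
    /// Record one run's worth of metrics.
    fn record(&self, payload: &RunMetricsPayload);
    /// Flush anything buffered before the process exits.
    fn shutdown(&self);
}

/// A trivial backend that just prints; an OTLP-backed observer would convert
/// the payload into OpenTelemetry instruments instead.
struct StdoutObserver;

impl RunObserver for StdoutObserver {
    fn record(&self, payload: &RunMetricsPayload) {
        println!(
            "run finished: {}ms, attempted={}, failed={}, cached={}",
            payload.duration_ms, payload.tasks_attempted, payload.tasks_failed, payload.tasks_cached
        );
    }

    fn shutdown(&self) {}
}
```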
## Logging (`crates/turborepo-log/`)

Structured event system for messages intended for end users (warnings,
errors, informational output). Distinct from tracing, which remains
for developer diagnostics.
- `Logger` - Dispatches events to registered sinks. Set globally via `init()` (once, at startup) or used directly via `Logger::handle()` for testing.
- `LogHandle` - Source-scoped handle for emitting events. Created via `log()` (global) or `Logger::handle()` (specific logger). Resolves the global logger at `.emit()` time, not at handle or builder creation time, so handles and builders created before `init()` work once the global logger is set.
- `LogSink` - Trait for event destinations. Built-in sinks: `CollectorSink` (in-memory buffer for post-run summaries) and `FileSink` (newline-delimited JSON with optional size limiting).
- `LogEvent` - Structured event with level, source, message, typed fields, and timestamp.

### `turborepo-ui`

`turborepo-ui` handles terminal rendering (TUI, console formatting).
turborepo-log handles structured event capture and dispatch. A
terminal sink in turborepo-ui can implement LogSink to bridge
events into the rendering pipeline. turborepo-log intentionally has
no dependency on turborepo-ui — it sits at the bottom of the
dependency graph.
Subsystem / Task Executor
└── LogHandle.warn("msg").field("k", v).emit()
└── Logger.emit(&event)
├── CollectorSink → in-memory buffer → post-run summary
└── FileSink → JSONL file → external tooling
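A hedged sketch of the capture-and-dispatch shape shown in the data flow; field names and method signatures are assumptions, not the real `turborepo-log` API:

```rust
use std::sync::Mutex;

#[derive(Debug, Clone)]
struct LogEvent {
    level: &'static str,
    source: String,
    message: String,
    fields: Vec<(String, String)>,
}

// Trait for event destinations, analogous to LogSink.
trait LogSink: Send + Sync {
    fn write(&self, event: &LogEvent);
}

// In-memory buffer so events can be replayed in a post-run summary,
// analogous to CollectorSink.
#[derive(Default)]
struct CollectorSink {
    events: Mutex<Vec<LogEvent>>,
}

impl LogSink for CollectorSink {
    fn write(&self, event: &LogEvent) {
        self.events.lock().unwrap().push(event.clone());
    }
}

struct Logger {
    sinks: Vec<Box<dyn LogSink>>,
}

impl Logger {
    /// Dispatches a single event to every registered sink.
    fn emit(&self, event: &LogEvent) {
        for sink in &self.sinks {
            sink.write(event);
        }
    }
}
```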