eden/mononoke/docs/6.2-walker-and-validation.md
This document explains the walker tool, which traverses Mononoke's data graph to validate integrity, verify storage durability, and perform analysis operations.
The walker (jobs/walker/) is a graph traversal tool that represents Mononoke's data as a directed graph of nodes connected by edges. It traverses this graph to validate data consistency, verify storage durability across multiple blobstore backends, extract repository data for analysis, and measure compression characteristics.
The walker operates as both a background job for continuous validation and a command-line tool for on-demand operations. It can process entire repositories or work on sampled subsets to handle large-scale repositories within memory constraints.
The walker defines a graph schema that models Mononoke's data structures as nodes with relationships represented by edges. This abstraction spans both blobstore-stored data (Bonsai changesets, file contents, manifests) and SQL-stored data (VCS mappings, bookmark data).
Node Types: Nodes correspond to the data structures described above, including Bonsai changesets, file contents, manifests, bookmarks, and VCS mappings.
Edge Types: Edges represent relationships between nodes. For example, a changeset links to its parent changesets and to the file contents it references, and Mercurial filenodes link back to the changesets that introduced them.
The graph is discovered dynamically during traversal. The walker expands each node by loading its data and following outgoing edges to discover connected nodes. This dynamic approach handles Mononoke's large data sets without requiring the entire graph to be loaded into memory.
Cycles: The graph contains cycles due to backlinks from files to changesets (for example, Mercurial filenodes link back to changesets). Visit tracking prevents infinite loops during traversal.
Mutability: Most data represented in the graph is immutable. Exceptions include bookmarks (which are resolved once at the start of a walk) and certain mappings that may be incomplete during derivation. The walker handles these cases by allowing revisits until data reaches a terminal state.
Traversal Depth: A walk is "deep" if it covers the entire repository history by following parent relationships. A walk is "shallow" if it only examines data reachable from specific commits without traversing history.
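These properties can be illustrated with a minimal sketch, using hypothetical node types and an in-memory edge map (the real walker discovers edges by loading each node's data rather than consulting a prebuilt map):

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Hypothetical, simplified node identities; the real walker's Node enum
// covers many more types (manifests, file contents, mappings, ...).
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
enum Node {
    Bookmark(&'static str),
    Changeset(u32),
    HgFilenode(u32),
}

/// Expand a node into its outgoing edges.
fn outgoing(node: &Node, edges: &HashMap<Node, Vec<Node>>) -> Vec<Node> {
    edges.get(node).cloned().unwrap_or_default()
}

/// Breadth-first walk with visit tracking, so cycles (like filenode
/// backlinks to changesets) cannot cause infinite loops.
fn walk(root: Node, edges: &HashMap<Node, Vec<Node>>) -> Vec<Node> {
    let mut seen: HashSet<Node> = HashSet::new();
    let mut queue = VecDeque::from([root.clone()]);
    let mut order = Vec::new();
    seen.insert(root);
    while let Some(node) = queue.pop_front() {
        for child in outgoing(&node, edges) {
            if seen.insert(child.clone()) {
                queue.push_back(child);
            }
        }
        order.push(node);
    }
    order
}

fn main() {
    let edges = HashMap::from([
        (Node::Bookmark("main"), vec![Node::Changeset(2)]),
        (Node::Changeset(2), vec![Node::Changeset(1), Node::HgFilenode(7)]),
        (Node::HgFilenode(7), vec![Node::Changeset(2)]), // backlink cycle
    ]);
    let visited = walk(Node::Bookmark("main"), &edges);
    println!("visited {} nodes", visited.len());
}
```

The bookmark is resolved once at the root, and the backlink from `HgFilenode(7)` to `Changeset(2)` is ignored because the changeset is already in the visited set.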
The walker provides four subcommands, each serving a distinct operational purpose.
The scrub subcommand verifies storage durability in multiplexed blobstore configurations. When Mononoke writes to multiple blobstore backends for redundancy, scrub checks that each component blobstore contains data for every key.
Scrub integrates with the underlying ScrubBlobstore, which provides callbacks when issues are detected. The blobstore healer job (jobs/blobstore_healer/) performs similar healing operations continuously, processing the multiplexed blobstore write-ahead log. Scrub serves as a verification and on-demand repair tool.
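The core scrub check can be sketched with in-memory maps standing in for component blobstores (the real ScrubBlobstore works through Mononoke's blobstore traits and repair callbacks; the names below are illustrative only):

```rust
use std::collections::HashMap;

// Hypothetical in-memory stores standing in for the component
// blobstores of a multiplex configuration.
type Blobstore = HashMap<String, Vec<u8>>;

/// Scrub one key: every component must hold the blob. Missing copies
/// are repaired from any component that still has the data, in the
/// spirit of ScrubBlobstore's repair callback.
fn scrub_key(key: &str, components: &mut [Blobstore]) -> Result<usize, String> {
    let value = components
        .iter()
        .find_map(|c| c.get(key).cloned())
        .ok_or_else(|| format!("{key}: missing from all components"))?;
    let mut repaired = 0;
    for component in components.iter_mut() {
        if !component.contains_key(key) {
            component.insert(key.to_string(), value.clone());
            repaired += 1;
        }
    }
    Ok(repaired)
}

fn main() {
    let mut components = vec![
        Blobstore::from([("repo0.content.abc".to_string(), b"data".to_vec())]),
        Blobstore::new(), // this backend lost the blob
    ];
    let repaired = scrub_key("repo0.content.abc", &mut components).unwrap();
    println!("repaired {repaired} copies");
}
```

A key missing from every component is unrecoverable and reported as an error rather than repaired, which is the case scrub exists to surface.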
The validate subcommand checks data validity by traversing the graph and verifying references. This detects data corruption, missing data, or inconsistencies that might indicate bugs in write or derivation logic.
The validate operation logs issues to Scuba with route information, allowing operators to trace back from problematic nodes to their sources in the graph.
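One representative check, confirming that a blob's content matches the identity embedded in its key, can be sketched as follows (hypothetical helper names; std's DefaultHasher stands in for the real content hashing, and the route is just a list of node labels):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy content hash; the real system uses Blake2-based content ids.
fn content_id(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

/// Verify a blob against its expected identity; on mismatch, report
/// the route walked to reach it, as validate logs to Scuba.
fn validate_blob(expected: u64, data: &[u8], route: &[&str]) -> Result<(), String> {
    if content_id(data) == expected {
        Ok(())
    } else {
        Err(format!("hash mismatch; route: {}", route.join(" -> ")))
    }
}

fn main() {
    let data = b"file contents";
    let good = content_id(data);
    assert!(validate_blob(good, data, &[]).is_ok());
    let err = validate_blob(good, b"corrupted", &["Bookmark(main)", "Changeset(2)"]);
    println!("{}", err.unwrap_err());
}
```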
The corpus subcommand extracts a subset of repository data and writes it to disk in a structured format. This supports offline analysis, compression experiments, and investigation of repository characteristics.
Directory Layout:
The corpus is written to a directory structure that separates node types and preserves repository paths:
```
<output-dir>/
  <NodeType>/
    <repo-path>/
      .mononoke,/
        <hash-prefix>/
          <blob-key>
```
For example:

```
FileContent/root/src/main.rs/.mononoke,/49/f0/c1/repo123.content.blake2.49f0...
```
The repository path and blob key are percent-encoded. Because commas are among the encoded characters, an encoded path can never collide with the magic .mononoke, directory name, and shell operations can distinguish the blob layout from repository paths.
Node types without repository paths (such as Bonsai changesets) omit the path component:

```
BonsaiChangeset/.mononoke,/ab/cd/ef/repo123.changeset.abcdef...
```
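The layout can be sketched with a simplified encoder (hypothetical helpers; the real implementation percent-encodes more characters than shown here, and hash prefix depth may differ):

```rust
/// Percent-encode the characters that would collide with the corpus
/// layout: '%' itself and ',' (so an encoded path can never contain
/// the literal ".mononoke," separator). Simplified vs the real set.
fn encode(s: &str) -> String {
    s.chars()
        .map(|c| match c {
            '%' => "%25".to_string(),
            ',' => "%2C".to_string(),
            c => c.to_string(),
        })
        .collect()
}

/// Assemble a corpus output path for a blob that has a repo path.
fn corpus_path(node_type: &str, repo_path: &str, key: &str) -> String {
    // Take the trailing hex portion of the key for the prefix dirs.
    let hex = &key[key.rfind('.').map_or(0, |i| i + 1)..];
    let (p0, p1, p2) = (&hex[0..2], &hex[2..4], &hex[4..6]);
    format!(
        "{node_type}/{}/.mononoke,/{p0}/{p1}/{p2}/{}",
        encode(repo_path),
        encode(key)
    )
}

fn main() {
    let path = corpus_path(
        "FileContent",
        "root/src/main.rs",
        "repo123.content.blake2.49f0c1aa",
    );
    println!("{path}");
}
```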
Sampling:
The corpus subcommand supports sampling by node hash or repository path pattern. This allows extracting representative subsets of large repositories for analysis. Sampling can be based on hash ranges (--sample-rate and --sample-offset) or path regex patterns.
The compression-benefit subcommand measures potential storage savings from applying compression to individual blobs. It computes the size of blobs before and after zstd compression to evaluate compression ratios.
This analysis helps evaluate storage optimization strategies and understand the compression characteristics of repository data. The results inform decisions about packblob configuration and compression settings.
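The per-blob measurement amounts to comparing sizes before and after compression. A dependency-free sketch, with a trivial run-length encoder standing in for the zstd compression the real subcommand uses:

```rust
/// Trivial run-length encoder; a stand-in for zstd so that this
/// example needs no external crates.
fn rle(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut iter = data.iter().peekable();
    while let Some(&b) = iter.next() {
        let mut run: u8 = 1;
        while run < u8::MAX && iter.peek() == Some(&&b) {
            iter.next();
            run += 1;
        }
        out.push(run);
        out.push(b);
    }
    out
}

/// Compression benefit for one blob: bytes saved and the size ratio.
fn benefit(blob: &[u8]) -> (i64, f64) {
    let compressed = rle(blob).len();
    let saved = blob.len() as i64 - compressed as i64;
    (saved, compressed as f64 / blob.len() as f64)
}

fn main() {
    let blob = vec![0u8; 1000]; // highly compressible sample
    let (saved, ratio) = benefit(&blob);
    println!("saved {saved} bytes, ratio {ratio:.3}");
}
```

Aggregating these per-blob numbers over a sampled walk yields the repository-level compression estimate.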
The walker uses bounded traversal to process the graph efficiently. The bounded_traversal_stream function dynamically unfolds nodes in parallel, expanding edges and queueing child nodes for processing.
To prevent revisiting nodes (which would cause infinite loops in cyclic graphs), the walker maintains concurrent hash maps tracking visited nodes. The WalkStateCHashmap stores node identities and prevents re-expansion of nodes already seen during the current walk.
Visit tracking has memory implications. For large repositories, the set of visited nodes can consume significant memory. The walker is designed to keep node representations small (minimal identity information only) to reduce memory overhead.
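Concurrent visit tracking can be sketched with a mutex-guarded set shared across worker threads (the real WalkStateCHashmap uses a concurrent hash map rather than a single lock, and stores only minimal identity information):

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};
use std::thread;

/// Workers race over overlapping ranges of node ids; only the first
/// to record a node "expands" it, so each node is expanded once.
fn expand_all() -> usize {
    let seen = Arc::new(Mutex::new(HashSet::<u64>::new()));
    let mut handles = Vec::new();
    for worker in 0..4u64 {
        let seen = Arc::clone(&seen);
        handles.push(thread::spawn(move || {
            let mut expanded = 0usize;
            // Each worker's range overlaps its neighbor's by half.
            for node in worker * 500..worker * 500 + 1000 {
                if seen.lock().unwrap().insert(node) {
                    expanded += 1; // first visitor wins
                }
            }
            expanded
        }));
    }
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    // 4000 visit attempts collapse to 2500 unique expansions.
    println!("expanded {} unique nodes", expand_all());
}
```

The memory cost of this set is exactly why the walker keeps node identities small: the set must hold every visited node for the duration of the walk.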
The WalkVisitor trait defines callbacks invoked during graph traversal:
start_node(): Called before loading a node's data. This allows setup operations like configuring sampling for blobstore accesses.
visit(): Called after a node is loaded and expanded. This receives the node, its data, and its outgoing edges. Visitors can inspect the loaded data, record per-node results, and influence which outgoing edges the walk follows.
Different subcommands implement different visitors. For example, ScrubVisitor validates storage durability, ValidatingVisitor checks data consistency, and CorpusVisitor extracts blobs to disk.
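A pared-down sketch of the visitor shape (hypothetical signatures; the real trait carries context, routes, and typed nodes and edges rather than strings):

```rust
// Hypothetical, simplified version of the WalkVisitor callbacks.
trait WalkVisitor {
    /// Called before a node's data is loaded.
    fn start_node(&mut self, node: &str);
    /// Called with the loaded node and its outgoing edges; returns
    /// the edges the walk should actually follow.
    fn visit(&mut self, node: &str, children: Vec<String>) -> Vec<String>;
}

/// A toy visitor that counts callbacks and prunes edges to
/// Hg-derived nodes, the kind of filtering type-inclusion flags drive.
struct CountingVisitor {
    started: usize,
    visited: usize,
}

impl WalkVisitor for CountingVisitor {
    fn start_node(&mut self, _node: &str) {
        self.started += 1; // e.g. arm blobstore sampling here
    }
    fn visit(&mut self, _node: &str, children: Vec<String>) -> Vec<String> {
        self.visited += 1;
        children.into_iter().filter(|c| !c.starts_with("Hg")).collect()
    }
}

fn main() {
    let mut v = CountingVisitor { started: 0, visited: 0 };
    v.start_node("BonsaiChangeset");
    let next = v.visit(
        "BonsaiChangeset",
        vec!["FileContent".into(), "HgChangeset".into()],
    );
    println!("following {} edges", next.len());
}
```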
For large repositories, processing the entire graph may be impractical due to memory constraints or time limits. The walker supports sampling operations on subsets of data.
Sampling Strategies:
Node hash sampling: Select nodes based on their hash values. This provides stable sampling where the same nodes are selected across runs.
Repository path sampling: Select nodes based on repository paths using hash or regex patterns. Path sampling may be non-deterministic when multiple paths reference the same content.
Sampling Implementation:
The SamplingWalkVisitor combines high-level node sampling with low-level blobstore sampling. It sets a SamplingKey in the CoreContext, which the SamplingBlobstore uses to decide whether to sample individual blobstore operations.
The --sample-rate and --sample-offset flags divide the repository into slices. By incrementing the offset, the entire repository can be processed one slice at a time. This allows large-scale operations to be distributed across multiple walker invocations or run incrementally.
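Slicing can be sketched as hashing each node identity modulo the sample rate (hypothetical helper; DefaultHasher stands in for whatever hash the real flags apply, but the slicing arithmetic is the same idea):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Assign a node to a slice: with --sample-rate=N, running with
/// --sample-offset=k selects the nodes whose hash % N == k.
fn slice_of(node: &str, sample_rate: u64) -> u64 {
    let mut h = DefaultHasher::new();
    node.hash(&mut h);
    h.finish() % sample_rate
}

fn main() {
    let nodes: Vec<String> = (0..1000).map(|i| format!("changeset{i}")).collect();
    let rate = 10u64;
    // Walking offsets 0..rate covers every node exactly once.
    let covered: usize = (0..rate)
        .map(|offset| {
            nodes
                .iter()
                .filter(|n| slice_of(n.as_str(), rate) == offset)
                .count()
        })
        .sum();
    println!("covered {covered} of {} nodes", nodes.len());
}
```

Because the slice assignment depends only on the node's hash, the same nodes fall into the same slice on every run, which is what makes incremental, slice-at-a-time processing sound.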
Visit tracking state can be maintained across slices to prevent re-visits when processing a repository in batches.
Memory usage is a primary design constraint for the walker. Two factors drive memory consumption:
Visit Tracking: To prevent cycles, the walker must track all visited nodes. For repositories with hundreds of millions of nodes, this requires efficient concurrent hash maps and minimal node representations.
Pending Queue: As the graph expands, the queue of nodes waiting to be processed can grow to millions of entries. The size of the Node enum representation directly affects queue memory usage. Paths associated with nodes (for corpus extraction and routing) also contribute to memory overhead.
Optimization Strategies:
The walker employs several strategies to manage memory:
- Keeping node representations minimal (identity information only, without full data)
- Separating Node (identity) from NodeData (loaded content)

Further optimizations under consideration include interning paths and nodes to reduce duplication, returning iterators instead of collected vectors, and extending bounded traversal to adjust queue sizes based on node type characteristics.
The walker provides operational visibility through metrics and logging.
ODS Metrics: The walker publishes counters and timeseries metrics.
These metrics power dashboards and alerting for continuous validation operations.
Scuba Logging: The walker logs detailed information to Scuba tables.
Scuba logs support debugging and historical analysis of validation runs.
Progress Logging: When running locally or in integration tests, the walker provides progress output via glog, showing traversal statistics and real-time status.
The walker serves multiple operational purposes:
Continuous Validation: Run as a background job to continuously validate repository integrity. This detects data corruption, missing data, or derivation bugs.
Storage Verification: Use scrub mode to verify multiplexed blobstore consistency. Schedule regular scrub runs to detect and repair storage inconsistencies before they impact availability.
Data Analysis: Extract corpus subsets for offline analysis. Experiment with compression techniques, analyze repository structure, or test storage optimizations using real repository data.
Incident Response: Run on-demand validation or scrubbing during incidents to assess impact and verify repairs. The walker can validate specific changesets or time ranges rather than entire repositories.
Capacity Planning: Use compression-benefit analysis to understand storage characteristics and plan capacity. Measure the effectiveness of compression strategies before deploying them.
The walker integrates with Mononoke's storage architecture at multiple levels:
Blobstore Integration: The walker accesses blobs through the standard blobstore interface. This means it benefits from cacheblob caching, respects redaction policies via redactedblobstore, and can sample operations via samplingblob.
Scrub Integration: The scrub subcommand uses ScrubBlobstore, which provides access to individual component blobstores in a multiplex configuration. This allows verification and repair at the component level.
Healer Coordination: While the blobstore healer job continuously processes the write-ahead log, the walker's scrub mode can detect issues that don't appear in the WAL (such as corruption or data loss in a single backend). The two systems are complementary.
Metadata Database: The walker reads VCS mappings, bookmark data, and other metadata from SQL tables. This allows it to validate consistency between blobstore data and database state.
The walker runs using the standard cmdlib/mononoke_app/ framework. It accepts configuration for:
Repository Selection: Specify which repositories to process. Sharded execution can distribute repositories across multiple walker instances.
The walker also accepts traversal parameters, per-subcommand options, and execution-mode settings.
The walker interacts with several other Mononoke components:
Blobstore Healer (jobs/blobstore_healer/) - Continuously heals storage inconsistencies from the multiplexed blobstore WAL. The walker's scrub complements this by detecting issues outside the WAL.
Admin Tool (tools/admin/) - Provides commands for on-demand validation and data inspection. Some admin subcommands use walker-like traversal internally.
Bounded Traversal (common/bounded_traversal/) - The core traversal primitive used by the walker. This library provides efficient parallel tree/graph traversal with bounded concurrency.
Derived Data System (derived_data/) - The walker validates derived data consistency. It can verify that derived data is correctly computed from Bonsai changesets.
Detailed implementation documentation is available in the walker source:
jobs/walker/src/README.md - Comprehensive design documentation covering graph representation, memory management, sampling, and implementation details
jobs/walker/src/ - Source code with inline documentation
tests/integration/ - Integration tests demonstrating walker usage patterns

The walker is a graph traversal tool that validates Mononoke's data integrity, verifies storage durability, and supports data analysis operations. It models Mononoke's data structures as a dynamically-discovered graph and uses bounded traversal to process the graph efficiently.
The four subcommands—scrub, validate, corpus, and compression-benefit—serve distinct operational purposes from storage verification to data analysis. Sampling support allows the walker to handle large-scale repositories by processing subsets of data, making it applicable to repositories of any size.
The walker operates both as a continuous validation job and as an on-demand tool for incident response and analysis. Its integration with Mononoke's storage architecture and monitoring systems makes it an integral component of operational data integrity verification.
For storage architecture details, see Storage Architecture. For background job context, see Jobs and Background Workers.