Back to Sapling

Mononoke Documentation

eden/mononoke/docs/README.md

latest7.8 KB
Original Source

Mononoke Documentation

About Mononoke

Mononoke is a scalable source control server designed to handle massive repositories with millions of files, thousands of commits per day, and hundreds of thousands of branches. As the server-side component of Meta's Sapling source control ecosystem, Mononoke works together with EdenFS (a virtual filesystem) and the Sapling CLI to provide version control operations at large scale.

Unlike traditional version control servers that are tied to a specific VCS, Mononoke uses a canonical, VCS-agnostic data model called Bonsai as its single source of truth. This enables Mononoke to serve both Sapling and Git clients from the same backend, converting between formats as needed while maintaining consistency. The architecture separates write operations from read operations, computes expensive indexes asynchronously, and uses horizontal scaling to handle load.

For a complete introduction, see What is Mononoke?.

About This Documentation

This documentation provides high-level architectural and conceptual information about Mononoke for developers, operators, and AI agents. It covers core architectural patterns, major components (servers, jobs, tools), key features, VCS integration, and operational aspects.

Not covered here: Build/test workflows are in CLAUDE.md. API documentation is in rustdoc comments. Detailed operational procedures are in component-specific directories.

Getting Started

New to Mononoke? Start with Section 1 (Introduction).

Building or developing? See CLAUDE.md for build instructions and testing procedures.

Looking for specific topics? Use the table of contents below.

Documentation Contents

1. Introduction

Core concepts and orientation for understanding Mononoke

1.1 - What is Mononoke? Mononoke's role as a scalable source control server, its place in the Sapling ecosystem, design goals, and deployment scenarios.

1.2 - Key Concepts Essential concepts: Bonsai (canonical data model), content addressing (hash-based storage), repository facets (trait-based composition), derived data (asynchronous indexes), blobstore (immutable storage), metadata database, and VCS mappings.

1.3 - Architecture Overview System architecture (how services are composed) and code architecture (how applications are structured). Covers data flow, write vs. read paths, and how components fit together.

1.4 - Navigating the Codebase Guide to finding components in 70+ directories. Covers directory organization, BUCK files, and how to locate implementations and tests.

2. Architecture

Deep dive into Mononoke's data model, patterns, and storage design

2.1 - Bonsai Data Model Mononoke's core data model: what Bonsai is, how it represents commits and file changes, content addressing, and how it enables multi-VCS support.

2.2 - Repository Facets The facet pattern used throughout Mononoke. Covers major facet categories (identity, storage, commit graph, derived data, VCS mappings, bookmarks, operations) and how features compose facets.

2.3 - Derived Data The derived data framework: what derived data is, why it's computed off the write path, the derivation process, major types, and remote derivation.

2.4 - Storage Architecture How Mononoke stores and caches data. Covers blobstore architecture (backends, decorator pattern), metadata database, caching strategy, and packblob compression.

3. Components

Servers, background jobs, command-line tools, and shared libraries

3.1 - Servers and Services Main protocol servers (Mononoke/SLAPI, SCS, Git, LFS) and internal microservices (Land, Derived Data, Bookmark, Diff, Load Limiter).

3.2 - Jobs and Background Workers Background maintenance tasks: Walker (graph validation), Blobstore Healer (storage durability), Derivation Worker, Cross-Repo Sync, and Statistics Collector.

3.3 - Tools and Utilities Command-line tools: admin CLI (primary tool), import/export tools (blobimport, gitimport), verification tools (aliasverify), and maintenance utilities (packer, sqlblob_gc).

3.4 - Libraries and Frameworks Shared libraries: cmdlib/mononoke_app framework for new binaries, common utilities (async, SQL, logging), core types, and testing utilities.

4. Features

Key source control operations and workflows

4.1 - Pushrebase Server-side rebasing for maintaining linear history at scale. Covers the pushrebase process, conflict detection, hooks integration, and use in Sapling and Git workflows.

4.2 - Cross-Repo Sync Repository synchronization between large and small repos. Covers sync patterns, commit transformation, and sync job operation.

4.3 - Hooks Policy enforcement at push time. Covers hook types (bookmark, changeset, file), configuration, execution, and the hook manager facet.

4.4 - Redaction Content redaction for removing sensitive data. Covers how redaction works at the blobstore level and access control.

4.5 - Microwave Cache warming for improved performance. Covers what gets warmed (derived data, blobstore) and when warming happens.

5. VCS Integration

Support for Git, Mercurial, and Sapling clients

5.1 - Git Support Git protocol server, Bonsai ↔ Git conversion, Git-specific derived data, reference handling, LFS integration, and source of truth tracking.

5.2 - Mercurial and Sapling Support Historical context, Bonsai ↔ Mercurial conversion, wire protocol, Mercurial-specific derived data, EdenAPI protocol, and compatibility requirements.

6. Operations

Operational concerns: performance, validation, and observability

6.1 - Rate Limiting and Load Shedding Load management strategies, the Load Limiter service, QPS limits, and load shedding under pressure.

6.2 - Walker and Validation Graph traversal and validation tool. Covers scrubbing (validation and repair), corpus generation, and compression analysis.

6.3 - Monitoring and Observability Metrics (ODS/stats), logging (Scuba), tracing, dashboards, health checks, and performance indicators.

Appendix A. Future Improvements

A.1 - Better Engineering A living document tracking long-term engineering improvements and technical debt that could be addressed in the codebase.

Additional Resources

Development:

  • CLAUDE.md - Build commands, tests, build modes, development patterns

Code:

Related Projects:

  • EdenFS: fbcode/eden/fs/ - Virtual filesystem
  • Sapling CLI: fbcode/eden/scm/ - Command-line interface
  • Eden project docs: fbcode/eden/CLAUDE.md

External:

Contributing

Update these docs for architectural changes, new components, or component reorganization. Do not update for implementation details, bug fixes, or configuration changes—those belong in component directories, rustdoc, or CLAUDE.md.

Maintain a professional, neutral, factual tone. Keep content high-level, cross-reference liberally, and verify accuracy against the code.


For questions, contact the scm_server_infra oncall.