Repository Facets

eden/mononoke/docs/2.2-repository-facets.md

This document explains the facet pattern and how repositories are composed in Mononoke. Facets are trait-based components that provide specific repository capabilities. Understanding facets is essential for working with Mononoke, as they form the foundation of how code accesses repository functionality.

What Are Facets?

A facet is a trait-based component that provides a specific capability for a repository. Rather than exposing every operation through a single monolithic repository type, Mononoke breaks repository functionality into discrete facets that can be composed together. Each facet encapsulates a single responsibility and may contain state that forms part of the repository.

For example, RepoIdentity provides repository name and ID, RepoBlobstore provides access to immutable blob storage, and CommitGraph provides commit graph traversal operations. Functions declare their requirements explicitly by specifying which facets they need through trait bounds.

Facets are defined in the repo_attributes/ directory and are used throughout Mononoke's codebase as the standard way to access repository functionality. As explained in the Architecture Overview, facets form the composition layer between base components (storage, types, utilities) and higher-level features (pushrebase, cross-repo sync).

Defining Facets

Facets are defined using the #[facet::facet] macro, which generates the necessary infrastructure for the facet pattern. Here's a simple example from repo_attributes/repo_identity/src/lib.rs:

```rust
#[facet::facet]
#[derive(Clone, Hash, PartialEq, Eq)]
pub struct RepoIdentity {
    id: RepositoryId,
    name: String,
}

impl RepoIdentity {
    pub fn id(&self) -> RepositoryId {
        self.id
    }

    pub fn name(&self) -> &str {
        &self.name
    }
}
```

The #[facet::facet] macro generates:

  • Accessor traits (e.g., RepoIdentityRef, RepoIdentityArc) for accessing the facet from generic containers
  • Type aliases for Arc-wrapped facets (e.g., ArcRepoIdentity)
  • Integration with the facet container system

Facets can hold state (like configuration, cached data, or storage handles) or simply provide access to underlying systems. More complex facets like CommitGraph and RepoBlobstore wrap storage backends and provide domain-specific operations.
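As a rough mental model, the items generated for a struct facet can be written out by hand. The following is a simplified sketch, not the macro's actual expansion: `RepositoryId` is reduced to a newtype over `i32`, `TinyRepo` is a hypothetical container, and the method name `repo_identity_arc` is an assumption based on the accessor-trait naming pattern.

```rust
use std::sync::Arc;

// Simplified stand-in for Mononoke's RepositoryId (assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct RepositoryId(pub i32);

#[derive(Clone, Debug, Hash, PartialEq, Eq)]
pub struct RepoIdentity {
    id: RepositoryId,
    name: String,
}

impl RepoIdentity {
    pub fn new(id: RepositoryId, name: &str) -> Self {
        Self { id, name: name.to_string() }
    }

    pub fn id(&self) -> RepositoryId {
        self.id
    }

    pub fn name(&self) -> &str {
        &self.name
    }
}

// Hand-written approximation of the generated items:
// an Arc type alias and two accessor traits.
pub type ArcRepoIdentity = Arc<RepoIdentity>;

pub trait RepoIdentityRef {
    fn repo_identity(&self) -> &RepoIdentity;
}

pub trait RepoIdentityArc {
    fn repo_identity_arc(&self) -> ArcRepoIdentity;
}

// Hypothetical container that holds a single facet.
pub struct TinyRepo {
    repo_identity: ArcRepoIdentity,
}

impl TinyRepo {
    pub fn new(identity: RepoIdentity) -> Self {
        Self { repo_identity: Arc::new(identity) }
    }
}

impl RepoIdentityRef for TinyRepo {
    fn repo_identity(&self) -> &RepoIdentity {
        &self.repo_identity
    }
}

impl RepoIdentityArc for TinyRepo {
    fn repo_identity_arc(&self) -> ArcRepoIdentity {
        Arc::clone(&self.repo_identity)
    }
}
```

Code that only needs to read the facet uses the `Ref` accessor, while code that needs to keep the facet alive beyond the borrow uses the `Arc` accessor to obtain a shared handle.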

Using Facets in Code

Functions declare their facet requirements using trait bounds. This makes dependencies explicit and allows the compiler to verify that the necessary capabilities are available. Here's an example:

```rust
use anyhow::Result;
use commit_graph::CommitGraphRef;
use context::CoreContext;
use mononoke_types::ChangesetId;
use repo_blobstore::RepoBlobstoreRef;
use repo_identity::RepoIdentityRef;

async fn example_operation(
    ctx: &CoreContext,
    // Bounds name the generated accessor traits (`*Ref`), not the facet types
    repo: &(impl CommitGraphRef + RepoBlobstoreRef + RepoIdentityRef),
    cs_id: ChangesetId,
) -> Result<()> {
    // Access facets through the generated accessor methods
    let repo_name = repo.repo_identity().name();
    let blobstore = repo.repo_blobstore();
    let parents = repo.commit_graph()
        .changeset_parents(ctx, cs_id)
        .await?;

    // The function only has access to the three declared facets
    Ok(())
}
```

This pattern has several characteristics:

Explicit Dependencies - The function signature declares exactly which repository capabilities are required. A function needing only identity and blobstore access cannot accidentally depend on commit graph operations.

Composability - Different repository types can provide different combinations of facets. Test repositories can provide mock implementations of specific facets.

Compile-Time Verification - If a function attempts to access a facet not declared in its trait bounds, the code will not compile.
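These three properties can be illustrated without any Mononoke crates. The sketch below is a toy model under stated assumptions: the accessor traits are written by hand instead of being generated, changeset IDs are plain `u64` values, operations are synchronous, and `FullRepo` and `IdentityOnlyRepo` are hypothetical containers.

```rust
use std::collections::HashMap;

pub struct RepoIdentity {
    name: String,
}

impl RepoIdentity {
    pub fn name(&self) -> &str {
        &self.name
    }
}

// Toy commit graph: changeset id -> parent ids.
pub struct CommitGraph {
    parents: HashMap<u64, Vec<u64>>,
}

impl CommitGraph {
    pub fn changeset_parents(&self, cs_id: u64) -> Vec<u64> {
        self.parents.get(&cs_id).cloned().unwrap_or_default()
    }
}

// Hand-written stand-ins for the generated accessor traits.
pub trait RepoIdentityRef {
    fn repo_identity(&self) -> &RepoIdentity;
}

pub trait CommitGraphRef {
    fn commit_graph(&self) -> &CommitGraph;
}

// Two hypothetical containers offering different facet combinations.
pub struct FullRepo {
    identity: RepoIdentity,
    graph: CommitGraph,
}

pub struct IdentityOnlyRepo {
    identity: RepoIdentity,
}

impl FullRepo {
    pub fn new(name: &str, edges: Vec<(u64, Vec<u64>)>) -> Self {
        Self {
            identity: RepoIdentity { name: name.to_string() },
            graph: CommitGraph { parents: edges.into_iter().collect() },
        }
    }
}

impl IdentityOnlyRepo {
    pub fn new(name: &str) -> Self {
        Self { identity: RepoIdentity { name: name.to_string() } }
    }
}

impl RepoIdentityRef for FullRepo {
    fn repo_identity(&self) -> &RepoIdentity {
        &self.identity
    }
}

impl CommitGraphRef for FullRepo {
    fn commit_graph(&self) -> &CommitGraph {
        &self.graph
    }
}

impl RepoIdentityRef for IdentityOnlyRepo {
    fn repo_identity(&self) -> &RepoIdentity {
        &self.identity
    }
}

// Accepts any container that exposes identity.
pub fn describe(repo: &impl RepoIdentityRef) -> String {
    format!("repo:{}", repo.repo_identity().name())
}

// Accepts only containers that expose a commit graph.
pub fn parent_count(repo: &impl CommitGraphRef, cs_id: u64) -> usize {
    repo.commit_graph().changeset_parents(cs_id).len()
}
```

`describe` works with both containers, whereas passing an `IdentityOnlyRepo` to `parent_count` is rejected at compile time because that type does not implement `CommitGraphRef`.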

For functions requiring many facets, trait aliases can reduce verbosity:

```rust
// Trait aliases require nightly Rust with #![feature(trait_alias)] enabled.
pub trait MyOperationRepo = CommitGraphRef
    + RepoBlobstoreRef
    + RepoIdentityRef
    + RepoDerivedDataRef;

async fn complex_operation(
    ctx: &CoreContext,
    repo: &impl MyOperationRepo,
) -> Result<()> {
    // The function can use all four facets
    Ok(())
}
```
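The `trait Name = ...` syntax is the unstable `trait_alias` language feature and needs a nightly compiler. For readers experimenting on stable Rust, a supertrait with a blanket implementation gives a similar effect. This is a general Rust technique shown as a sketch, not necessarily what Mononoke does; the two accessor traits and `DemoRepo` below are simplified placeholders.

```rust
// Placeholder accessor traits (simplified for illustration).
pub trait RepoIdentityRef {
    fn repo_name(&self) -> String;
}

pub trait CommitGraphRef {
    fn head_count(&self) -> usize;
}

// Supertrait that bundles the requirements under one name.
pub trait MyOperationRepo: RepoIdentityRef + CommitGraphRef {}

// Blanket impl: every type providing both accessors is a MyOperationRepo.
impl<T: RepoIdentityRef + CommitGraphRef> MyOperationRepo for T {}

// Hypothetical container used to exercise the bound.
pub struct DemoRepo;

impl RepoIdentityRef for DemoRepo {
    fn repo_name(&self) -> String {
        "demo".to_string()
    }
}

impl CommitGraphRef for DemoRepo {
    fn head_count(&self) -> usize {
        2
    }
}

pub fn summarize(repo: &impl MyOperationRepo) -> String {
    format!("{} has {} heads", repo.repo_name(), repo.head_count())
}
```

Because of the blanket impl, `DemoRepo` never has to mention `MyOperationRepo` explicitly; satisfying the component traits is enough.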

Available Facets

Mononoke provides approximately 35 facets, organized into several functional categories. Each facet is implemented in its own directory under repo_attributes/ or related top-level directories.

Identity and Configuration

RepoIdentity (repo_attributes/repo_identity/)

  • Provides repository ID and name
  • Required by nearly all repository operations

RepoBookmarkAttrs (repo_attributes/repo_bookmark_attrs/)

  • Bookmark-specific configuration attributes
  • Defines policies for bookmark operations

Storage Access

RepoBlobstore (repo_attributes/repo_blobstore/)

  • Access to the repository's immutable blobstore
  • Wrapped with prefix and redaction layers specific to the repository

Filestore (repo_attributes/filestore/)

  • File content storage and retrieval
  • Handles file chunking for large files

MutableBlobstore (repo_attributes/mutable_blobstore/)

  • Mutable blob operations
  • Used for temporary or mutable repository data

Commit Graph and History

CommitGraph (repo_attributes/commit_graph/commit_graph/)

  • Efficient commit graph storage and traversal
  • Provides ancestry queries, parent/child relationships, graph traversal

Phases (repo_attributes/phases/)

  • Commit phase tracking (draft, public)
  • Important for Mercurial/Sapling semantics

Derived Data

RepoDerivedData (repo_attributes/repo_derived_data/)

  • Access to the derived data manager for this repository
  • Coordinates derivation and fetching of all derived data types

RepoDerivationQueues (repo_attributes/repo_derivation_queues/)

  • Queues for coordinating remote derivation
  • Manages work distribution across derivation workers

VCS Mappings

These facets map between Bonsai (Mononoke's internal format) and external VCS identities:

BonsaiHgMapping (repo_attributes/bonsai_hg_mapping/)

  • Bonsai ↔ Mercurial changeset ID mapping
  • Essential for Mercurial/Sapling client support

BonsaiGitMapping (repo_attributes/bonsai_git_mapping/)

  • Bonsai ↔ Git commit SHA mapping
  • Required for Git protocol support

BonsaiGlobalrevMapping (repo_attributes/bonsai_globalrev_mapping/)

  • Bonsai ↔ GlobalRev (sequential integer) mapping
  • Provides SVN-style sequential commit identifiers

BonsaiSvnrevMapping (repo_attributes/bonsai_svnrev_mapping/)

  • Bonsai ↔ SVN revision mapping
  • Used for repositories imported from Subversion

BonsaiBlobMapping (repo_attributes/bonsai_blob_mapping/)

  • Maps each Bonsai changeset ID to the blobs introduced by that changeset
  • Used for enumerating blobs when deleting changesets

BonsaiTagMapping (repo_attributes/bonsai_tag_mapping/)

  • Maps Git annotated tag objects
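The mapping facets in this category share a common shape: a two-way lookup between a Bonsai changeset ID and an external identifier. The sketch below models that shape for a GlobalRev-style mapping. It is an illustrative in-memory toy, not the facet's real API: `BonsaiId`, `GlobalRev`, `InMemoryGlobalrevMapping`, and its methods are hypothetical names, and real Bonsai IDs are hashes, not integers.

```rust
use std::collections::HashMap;

// Hypothetical simplified identifiers.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct BonsaiId(pub u64);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct GlobalRev(pub u64);

// Two hash maps keep lookups cheap in both directions.
#[derive(Default)]
pub struct InMemoryGlobalrevMapping {
    to_globalrev: HashMap<BonsaiId, GlobalRev>,
    to_bonsai: HashMap<GlobalRev, BonsaiId>,
}

impl InMemoryGlobalrevMapping {
    /// Records a pair. Returns false and changes nothing if either
    /// side is already mapped, so the mapping stays one-to-one.
    pub fn add(&mut self, bonsai: BonsaiId, rev: GlobalRev) -> bool {
        if self.to_globalrev.contains_key(&bonsai) || self.to_bonsai.contains_key(&rev) {
            return false;
        }
        self.to_globalrev.insert(bonsai, rev);
        self.to_bonsai.insert(rev, bonsai);
        true
    }

    pub fn get_globalrev(&self, bonsai: BonsaiId) -> Option<GlobalRev> {
        self.to_globalrev.get(&bonsai).copied()
    }

    pub fn get_bonsai(&self, rev: GlobalRev) -> Option<BonsaiId> {
        self.to_bonsai.get(&rev).copied()
    }
}
```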

Bookmarks and References

Bookmarks (repo_attributes/bookmarks/bookmarks/)

  • Read and write access to bookmarks (branch pointers)

BookmarkUpdateLog (within repo_attributes/bookmarks/)

  • History of all movements of non-scratch bookmarks
  • Used for synchronization and auditing

BookmarksCache (within repo_attributes/bookmarks/)

  • Cached bookmark information
  • Reduces database load for frequently accessed bookmarks

Git-Specific Facets

GitSymbolicRefs (repo_attributes/git_symbolic_refs/)

  • Git symbolic references (like HEAD → refs/heads/main)
  • Required for proper Git protocol support

GitRefContentMapping (repo_attributes/git_ref_content_mapping/)

  • Maps Git references to their content
  • Handles Git tag objects and other ref types

GitSourceOfTruth (repo_attributes/git_source_of_truth/)

  • Tracks which system is the authoritative source for Git data
  • Used during migration from MetaGit to Mononoke

File and Data Management

Filenodes (repo_attributes/filenodes/)

  • Legacy file history information for Mercurial
  • Maps files to changesets that modified them (Mercurial linknodes)

Newfilenodes (repo_attributes/newfilenodes/)

  • Updated filenode implementation

MutableCounters (repo_attributes/mutable_counters/)

  • Integer counters for tracking operation progress
  • Used by sync jobs and other background processes

MutableRenames (repo_attributes/mutable_renames/)

  • Allows dynamic modification of historical file rename information
  • Used to fix up errors in file history

Operational Facets

RepoPermissionChecker (repo_attributes/repo_permission_checker/)

  • Permission and ACL verification
  • Controls read and write access to repository data

HookManager (repo_attributes/hook_manager/)

  • Repository hook execution
  • Enforces pre-land bookmark policies and validations

RepoCrossRepo (repo_attributes/repo_cross_repo/)

  • Cross-repository sync operations
  • Enables repository-to-repository synchronization (megarepo)

RepoLock (repo_attributes/repo_lock/)

  • Repository-level locking

RepoSparseProfiles (repo_attributes/repo_sparse_profiles/)

  • Sparse checkout profile size tracking

RestrictedPaths (repo_attributes/restricted_paths/)

  • Path-level access restrictions

Metadata and Events

PushrebaseMutationMapping (repo_attributes/pushrebase_mutation_mapping/)

  • Tracks commit rewrites during pushrebase
  • Maps original commits to their rewritten versions

DeletionLog (repo_attributes/deletion_log/)

  • Part of draft commit deletion

RepoMetadataCheckpoint (repo_attributes/repo_metadata_checkpoint/)

  • Checkpoint for the repo metadata logger

RepoEventPublisher (repo_attributes/repo_event_publisher/)

  • Manages subscriptions to repository events (bookmark updates)

SqlQueryConfig (repo_attributes/sql_query_config/)

  • Configuration for SQL query behavior
  • Controls timeouts, retry policies, and other SQL parameters

Facet Construction and Composition

Repository Factory

The repo_factory/ directory contains the code responsible for constructing repositories from configuration. The factory:

  1. Reads repository configuration from the configuration system
  2. Constructs storage backends (blobstore, SQL databases)
  3. Creates each facet with its dependencies
  4. Assembles facets into a facet container

The resulting repository object implements all the accessor traits, allowing code to access any facet through the appropriate trait bound.
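The construction steps can be sketched in miniature. Everything in this example is a hypothetical simplification: `RepoConfig`, `MemBlobstore`, `Repo`, and `build_repo` are illustrative names, the configuration is passed in by the caller (standing in for step 1), and the real factory is asynchronous and builds many more facets.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical, heavily simplified repository configuration.
pub struct RepoConfig {
    pub id: i32,
    pub name: String,
    pub blob_prefix: String,
}

// Stand-in storage backend: an in-memory key/value store.
#[derive(Default)]
pub struct MemBlobstore {
    blobs: Mutex<HashMap<String, Vec<u8>>>,
}

pub struct RepoIdentity {
    id: i32,
    name: String,
}

impl RepoIdentity {
    pub fn id(&self) -> i32 {
        self.id
    }

    pub fn name(&self) -> &str {
        &self.name
    }
}

// Facet that wraps the backend with a per-repository key prefix.
pub struct RepoBlobstore {
    prefix: String,
    backend: Arc<MemBlobstore>,
}

impl RepoBlobstore {
    pub fn put(&self, key: &str, value: &[u8]) {
        let full_key = format!("{}{}", self.prefix, key);
        self.backend.blobs.lock().unwrap().insert(full_key, value.to_vec());
    }

    pub fn get(&self, key: &str) -> Option<Vec<u8>> {
        let full_key = format!("{}{}", self.prefix, key);
        let blobs = self.backend.blobs.lock().unwrap();
        blobs.get(&full_key).cloned()
    }
}

// Facet container with hand-written accessor traits.
pub struct Repo {
    repo_identity: Arc<RepoIdentity>,
    repo_blobstore: Arc<RepoBlobstore>,
}

pub trait RepoIdentityRef {
    fn repo_identity(&self) -> &RepoIdentity;
}

pub trait RepoBlobstoreRef {
    fn repo_blobstore(&self) -> &RepoBlobstore;
}

impl RepoIdentityRef for Repo {
    fn repo_identity(&self) -> &RepoIdentity {
        &self.repo_identity
    }
}

impl RepoBlobstoreRef for Repo {
    fn repo_blobstore(&self) -> &RepoBlobstore {
        &self.repo_blobstore
    }
}

pub fn build_repo(config: &RepoConfig) -> Repo {
    // Step 2: construct the storage backend.
    let backend = Arc::new(MemBlobstore::default());
    // Step 3: create each facet with its dependencies.
    let repo_identity = Arc::new(RepoIdentity { id: config.id, name: config.name.clone() });
    let repo_blobstore = Arc::new(RepoBlobstore { prefix: config.blob_prefix.clone(), backend });
    // Step 4: assemble the facets into the container.
    Repo { repo_identity, repo_blobstore }
}
```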

Test Repositories

Test code uses TestRepoFactory to create repositories with test-appropriate backends. For example:

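```rust
// `fb` is the FacebookInit handle supplied by the test harness.
// `build()` is generic over the repository container, so the binding needs a
// type annotation; `TestRepo` stands in for whichever container the test uses.
let repo: TestRepo = TestRepoFactory::new(fb)?
    .with_name("test_repo")
    .with_id(RepositoryId::new(1))
    .build()
    .await?;
```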

Test repositories typically use in-memory storage (memblob, SQLite) rather than production backends. Individual facets can be mocked or replaced for testing specific behaviors.
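Replacing a single facet for a test can be pictured as follows. This is a self-contained sketch with hypothetical names (`PermissionChecker`, `AllowListChecker`, `AlwaysAllow`, `TinyTestRepoFactory`, `try_push`); it imitates the builder style only loosely and does not reproduce the real `TestRepoFactory` API.

```rust
use std::sync::Arc;

// A facet defined as a trait, so tests can swap implementations.
pub trait PermissionChecker: Send + Sync {
    fn can_write(&self, user: &str) -> bool;
}

// Checker backed by an explicit allow list.
pub struct AllowListChecker {
    allowed: Vec<String>,
}

impl AllowListChecker {
    pub fn new(users: &[&str]) -> Self {
        Self { allowed: users.iter().map(|u| u.to_string()).collect() }
    }
}

impl PermissionChecker for AllowListChecker {
    fn can_write(&self, user: &str) -> bool {
        self.allowed.iter().any(|u| u == user)
    }
}

// Test double that permits everything.
pub struct AlwaysAllow;

impl PermissionChecker for AlwaysAllow {
    fn can_write(&self, _user: &str) -> bool {
        true
    }
}

pub struct TestRepo {
    name: String,
    permission_checker: Arc<dyn PermissionChecker>,
}

// Hypothetical builder with test-friendly defaults.
pub struct TinyTestRepoFactory {
    name: String,
    permission_checker: Arc<dyn PermissionChecker>,
}

impl TinyTestRepoFactory {
    pub fn new() -> Self {
        Self { name: "repo".to_string(), permission_checker: Arc::new(AlwaysAllow) }
    }

    pub fn with_name(mut self, name: &str) -> Self {
        self.name = name.to_string();
        self
    }

    // Overrides one facet while keeping the other defaults.
    pub fn with_permission_checker(mut self, checker: Arc<dyn PermissionChecker>) -> Self {
        self.permission_checker = checker;
        self
    }

    pub fn build(self) -> TestRepo {
        TestRepo { name: self.name, permission_checker: self.permission_checker }
    }
}

pub fn try_push(repo: &TestRepo, user: &str) -> Result<(), String> {
    if repo.permission_checker.can_write(user) {
        Ok(())
    } else {
        Err(format!("{} may not write to {}", user, repo.name))
    }
}
```

A test that does not care about permissions keeps the permissive default, while a test of the denial path injects a stricter checker without touching any other facet.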

Relationship to Features

As described in the Architecture Overview, features sit one layer above facets. Features are implemented as functions that combine multiple facets to provide high-level source control operations. Unlike facets, features do not hold state—they orchestrate operations across repository attributes.

For example, the pushrebase feature (in features/pushrebase/) combines facets like CommitGraph, Bookmarks, RepoBlobstore, HookManager, and RepoDerivedData to implement server-side rebasing. The feature function declares its facet requirements through trait bounds and uses those facets to perform the operation.

This separation allows features to be reused across different repository types and contexts. A pushrebase function can be called from the main server (via the API layer), from the land service (directly), or from the admin tool (for testing), all using the same implementation.
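To make the facet/feature split concrete, here is a deliberately small feature written as a stateless function over two facets. It is not pushrebase: `fast_forward_bookmark` is a hypothetical operation chosen for brevity, the facets are synchronous toys with `u64` changeset IDs, and the accessor traits are written by hand.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Mutex;

// Toy commit graph facet: changeset id -> parent ids.
pub struct CommitGraph {
    parents: HashMap<u64, Vec<u64>>,
}

impl CommitGraph {
    /// True if `ancestor` is reachable from `descendant`
    /// (a commit counts as its own ancestor).
    pub fn is_ancestor(&self, ancestor: u64, descendant: u64) -> bool {
        let mut stack = vec![descendant];
        let mut seen = HashSet::new();
        while let Some(cs) = stack.pop() {
            if cs == ancestor {
                return true;
            }
            if seen.insert(cs) {
                if let Some(ps) = self.parents.get(&cs) {
                    stack.extend(ps.iter().copied());
                }
            }
        }
        false
    }
}

// Toy bookmarks facet; the state lives in the facet, not in the feature.
pub struct Bookmarks {
    targets: Mutex<HashMap<String, u64>>,
}

impl Bookmarks {
    pub fn get(&self, name: &str) -> Option<u64> {
        self.targets.lock().unwrap().get(name).copied()
    }

    pub fn set(&self, name: &str, cs_id: u64) {
        self.targets.lock().unwrap().insert(name.to_string(), cs_id);
    }
}

pub trait CommitGraphRef {
    fn commit_graph(&self) -> &CommitGraph;
}

pub trait BookmarksRef {
    fn bookmarks(&self) -> &Bookmarks;
}

// The feature: a free function that orchestrates two facets.
pub fn fast_forward_bookmark(
    repo: &(impl CommitGraphRef + BookmarksRef),
    name: &str,
    new_target: u64,
) -> Result<(), String> {
    if let Some(old) = repo.bookmarks().get(name) {
        if !repo.commit_graph().is_ancestor(old, new_target) {
            return Err(format!("{} is not a descendant of {}", new_target, old));
        }
    }
    repo.bookmarks().set(name, new_target);
    Ok(())
}

// Hypothetical container used to call the feature.
pub struct DemoRepo {
    graph: CommitGraph,
    bookmarks: Bookmarks,
}

impl DemoRepo {
    pub fn new(edges: Vec<(u64, Vec<u64>)>) -> Self {
        Self {
            graph: CommitGraph { parents: edges.into_iter().collect() },
            bookmarks: Bookmarks { targets: Mutex::new(HashMap::new()) },
        }
    }
}

impl CommitGraphRef for DemoRepo {
    fn commit_graph(&self) -> &CommitGraph {
        &self.graph
    }
}

impl BookmarksRef for DemoRepo {
    fn bookmarks(&self) -> &Bookmarks {
        &self.bookmarks
    }
}
```

Any caller holding a container with these two accessors can invoke the same function, which is the reuse property described above.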

Facet Design Principles

Several patterns are consistent across facet implementations:

Single Responsibility - Each facet encapsulates one aspect of repository functionality. RepoIdentity provides identity, CommitGraph provides graph operations, and so on.

State Encapsulation - Facets can hold state (configuration, storage handles, caches) but expose this state through well-defined methods rather than public fields.

Storage Abstraction - Facets abstract over storage details. CommitGraph can use SQL storage, in-memory storage, or cached storage without changing its interface.

Async Operations - Operations that perform I/O are async, allowing concurrent access and efficient resource usage.

Arc-Wrapped Types - Facets are typically used through Arc (atomic reference counting) to enable shared ownership across async operations. The macro generates Arc* type aliases for convenience.
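The storage-abstraction and Arc-wrapping principles can be sketched together. In this toy model the facet's interface stays the same whether it sits on a plain in-memory backend or on a caching layer. The names `CommitGraphStorage`, `InMemoryStorage`, and `CachingStorage` are hypothetical, and the methods are synchronous for brevity even though the real operations are async.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

// Backend interface hidden behind the facet.
pub trait CommitGraphStorage: Send + Sync {
    fn fetch_parents(&self, cs_id: u64) -> Vec<u64>;
}

pub struct InMemoryStorage {
    edges: HashMap<u64, Vec<u64>>,
}

impl InMemoryStorage {
    pub fn new(edges: HashMap<u64, Vec<u64>>) -> Self {
        Self { edges }
    }
}

impl CommitGraphStorage for InMemoryStorage {
    fn fetch_parents(&self, cs_id: u64) -> Vec<u64> {
        self.edges.get(&cs_id).cloned().unwrap_or_default()
    }
}

// Caching layer that wraps any other storage and counts cache misses.
pub struct CachingStorage {
    inner: Arc<dyn CommitGraphStorage>,
    cache: Mutex<HashMap<u64, Vec<u64>>>,
    misses: AtomicUsize,
}

impl CachingStorage {
    pub fn new(inner: Arc<dyn CommitGraphStorage>) -> Self {
        Self { inner, cache: Mutex::new(HashMap::new()), misses: AtomicUsize::new(0) }
    }

    pub fn misses(&self) -> usize {
        self.misses.load(Ordering::Relaxed)
    }
}

impl CommitGraphStorage for CachingStorage {
    fn fetch_parents(&self, cs_id: u64) -> Vec<u64> {
        let mut cache = self.cache.lock().unwrap();
        if let Some(hit) = cache.get(&cs_id) {
            return hit.clone();
        }
        self.misses.fetch_add(1, Ordering::Relaxed);
        let parents = self.inner.fetch_parents(cs_id);
        cache.insert(cs_id, parents.clone());
        parents
    }
}

// The facet: callers see only this interface, never the backend.
pub struct CommitGraph {
    storage: Arc<dyn CommitGraphStorage>,
}

// Alias in the style of the generated Arc* names.
pub type ArcCommitGraph = Arc<CommitGraph>;

impl CommitGraph {
    pub fn new(storage: Arc<dyn CommitGraphStorage>) -> Self {
        Self { storage }
    }

    pub fn changeset_parents(&self, cs_id: u64) -> Vec<u64> {
        self.storage.fetch_parents(cs_id)
    }
}
```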

Finding Facet Implementations

All repository attribute facets are located in repo_attributes/. Each facet has its own subdirectory containing:

  • src/ - Rust source code
  • BUCK - Build definitions
  • Optional: SQL storage implementation in a subdirectory (e.g., sql_bookmarks/)

The Navigating the Codebase guide provides additional detail on finding and understanding facet implementations.

Patterns for Working with Facets

When writing code that uses repositories:

Declare Minimal Requirements - Only require the facets actually needed. A function that only reads repository identity should require impl RepoIdentityRef, not all possible facets.

Use Trait Aliases for Complex Requirements - If many functions need the same set of facets, define a trait alias to reduce duplication.

Avoid Storing Facets Directly - Use Arc wrappers when facets need to be stored. The generated Arc* type aliases make this straightforward.

Access Through Trait Methods - Use the generated accessor methods (e.g., repo.repo_identity()) rather than attempting to downcast or access internals.
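The advice about storing facets can be illustrated with a component that outlives the borrow of the repository. This is a sketch under assumptions: the accessor trait and alias are written by hand, `DemoRepo` and `BackgroundLogger` are hypothetical, and a plain thread stands in for an async task.

```rust
use std::sync::Arc;
use std::thread;

pub struct RepoIdentity {
    name: String,
}

impl RepoIdentity {
    pub fn name(&self) -> &str {
        &self.name
    }
}

// Hand-written stand-ins for the generated alias and accessor trait.
pub type ArcRepoIdentity = Arc<RepoIdentity>;

pub trait RepoIdentityArc {
    fn repo_identity_arc(&self) -> ArcRepoIdentity;
}

pub struct DemoRepo {
    repo_identity: ArcRepoIdentity,
}

impl DemoRepo {
    pub fn new(name: &str) -> Self {
        Self { repo_identity: Arc::new(RepoIdentity { name: name.to_string() }) }
    }
}

impl RepoIdentityArc for DemoRepo {
    fn repo_identity_arc(&self) -> ArcRepoIdentity {
        Arc::clone(&self.repo_identity)
    }
}

// Long-lived component: stores the Arc alias, not a borrowed reference.
pub struct BackgroundLogger {
    identity: ArcRepoIdentity,
}

impl BackgroundLogger {
    pub fn new(repo: &impl RepoIdentityArc) -> Self {
        Self { identity: repo.repo_identity_arc() }
    }

    // The owned Arc lets the logger move onto another thread.
    pub fn spawn(self) -> thread::JoinHandle<String> {
        thread::spawn(move || format!("logging for {}", self.identity.name()))
    }
}
```

Because the logger owns a shared handle, it keeps working even after the container it was created from has been dropped.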

When adding new facets:

Follow the Pattern - Use the #[facet::facet] macro and follow the existing facet structure.

Place in repo_attributes - New facets should be subdirectories of repo_attributes/.

Document the Interface - Add rustdoc comments explaining what the facet provides.

Register with Factory - Update repo_factory/ to construct the new facet when creating repositories.

Next Steps

This document covered the facet pattern and how repositories are composed from facets.

The facet pattern is used consistently throughout Mononoke. Understanding how to declare facet requirements and access facet functionality is fundamental to working with the codebase.