Tools and Utilities

eden/mononoke/docs/3.3-tools-and-utilities.md

This document describes the command-line tools and utilities available for operating and maintaining Mononoke. These tools provide operational capabilities for administrators, debugging interfaces for developers, and utilities for data operations.

Overview

Mononoke provides a collection of command-line tools for various operational tasks. These tools are distinct from servers (which handle client requests) and jobs (which run as long-lived background processes). Tools are invoked on-demand by operators or administrators to perform specific tasks such as importing repositories, verifying data integrity, or performing maintenance operations.

The primary administrative interface is the admin tool, which provides a comprehensive set of subcommands for most operational tasks. Additional specialized tools handle specific operations like repository import, data verification, and storage maintenance.

Tool vs Job vs Server

Understanding the distinction between these three types of applications is important for navigating the codebase and choosing the appropriate component for a task:

Tools (in tools/)

  • Command-line utilities invoked on-demand
  • Execute a specific task and exit
  • Examples: admin, blobimport, aliasverify
  • Used by operators and administrators for operational tasks
  • Can run in production or development environments

Jobs (in jobs/)

  • Long-running background processes
  • Continuously perform maintenance or monitoring tasks
  • Examples: walker, blobstore_healer, statistics_collector
  • Typically scheduled and monitored as services
  • Described in detail in Jobs and Background Workers

Servers (various top-level directories)

  • Handle client requests over network protocols
  • Run continuously as services
  • Examples: Mononoke server, Git server, SCS server
  • Described in detail in Servers and Services

This document focuses on tools: the command-line utilities used for operational and administrative tasks.

Administrative Tools

Admin CLI

Location: tools/admin/

Binary: fbcode//eden/mononoke/tools/admin:admin

The admin tool is the primary command-line interface for Mononoke operators. It provides a unified interface for inspecting and modifying repository state, managing derived data, performing data operations, and debugging issues.

The admin tool is organized into subcommands, each handling a specific aspect of repository administration. The tool includes over 140 command files implementing various operations.

Major subcommand categories:

Repository Operations

  • bookmarks - Bookmark (branch) management and inspection
  • changelog - Changeset log operations
  • commit - Commit inspection and operations
  • fetch - Fetch files and changesets

Derived Data

  • derived-data - Derived data management, derivation, and inspection
  • derivation-queue - Derivation queue operations
  • blame - Blame data inspection
  • case-conflict - Case conflict detection

Storage Operations

  • blobstore - Blobstore inspection and operations
  • raw-blobstore - Direct blobstore access
  • ephemeral-store - Ephemeral storage management
  • filestore - File storage operations

Commit Graph

  • commit-graph - Commit graph operations and inspection

Cross-Repository Operations

  • cross-repo - Cross-repository sync operations
  • cross-repo-config - Cross-repo configuration management
  • megarepo - Megarepo (monorepo) operations

Git-Specific Operations

  • git-bundle - Git bundle operations
  • git-symref - Git symbolic reference management
  • git-content-ref - Git content reference operations
  • git-objects - Git object inspection
  • git-cgdm-updater - Git commit graph delta manifest updates
  • git-cgdm-components - Git CGDM component inspection

Other Operations

  • mutable-counters - Mutable counter management
  • mutable-renames - Mutable rename tracking
  • phases - Commit phase management
  • redaction - Content redaction operations
  • locking - Repository locking
  • async-requests - Asynchronous request management
  • cas-store - Content-addressed storage operations
  • repo-info - Repository information
  • list-repos - List configured repositories
  • convert - Conversion utilities
  • slow-bookmark-mover - Controlled bookmark movement

The admin tool is the recommended interface for most operational tasks. Operators should use this tool rather than directly manipulating storage or database state.

Testtool

Location: tools/testtool/

Binary: fbcode//eden/mononoke/tools/testtool:testtool

The testtool provides utilities for testing and debugging Mononoke in non-production environments. It cannot be run against production repositories: it validates the environment and refuses to execute if production configuration is detected.

The testtool is used during development and testing to perform operations that would be inappropriate for production environments. It provides debugging capabilities and test utilities not available in the admin tool.

Import and Export Tools

Blobimport

Location: tools/blobimport/

Binary: fbcode//eden/mononoke/tools/blobimport:blobimport

Blobimport imports Mercurial repositories into Mononoke's blobstore format. The tool reads Mercurial revlog data and converts it to Bonsai changesets and associated data structures.

The import process:

  1. Reads Mercurial changelog and manifest data
  2. Converts changesets to Bonsai format
  3. Stores file contents in the blobstore
  4. Creates VCS mappings (Bonsai ↔ Mercurial)
  5. Optionally imports bookmarks
  6. Optionally derives specified data types

Blobimport supports various options for controlling the import process, including parent order fixes, bookmark policies, and derived data derivation.

Gitimport

Location: git/gitimport/

Binary: fbcode//eden/mononoke/git/gitimport:gitimport

Gitimport imports Git repositories into Mononoke. The tool reads Git objects and converts them to Bonsai changesets while preserving Git-specific metadata.

The import process:

  1. Reads Git objects from a Git repository
  2. Converts commits to Bonsai format
  3. Creates Git-specific derived data (Git trees, commits)
  4. Establishes VCS mappings (Bonsai ↔ Git)
  5. Handles Git references and symbolic refs
  6. Optionally processes Git LFS objects

Gitimport supports incremental imports, allowing repositories to be updated with new commits after the initial import.
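
One way to picture an incremental import is as a graph walk that stops at commits which already have a mapping. The sketch below is a toy model in Python, not gitimport's actual algorithm: the function name, the parent map, and the set of already imported commits are all assumptions made for illustration.

```python
def commits_to_import(
    parents: dict[str, list[str]],
    heads: list[str],
    already_imported: set[str],
) -> list[str]:
    """Return commits reachable from `heads` that still need importing.

    Parents always appear before their children, so each commit can be
    converted after the commits it depends on.
    """
    order: list[str] = []
    visited = set(already_imported)  # never walk past an imported commit
    stack = [(head, False) for head in reversed(heads)]
    while stack:
        commit, parents_done = stack.pop()
        if parents_done:
            order.append(commit)
            continue
        if commit in visited:
            continue
        visited.add(commit)
        stack.append((commit, True))  # emit once all parents are handled
        for parent in reversed(parents.get(commit, [])):
            if parent not in visited:
                stack.append((parent, False))
    return order
```

Running the function a second time with the previous result added to `already_imported` returns an empty list, which is the property that makes a repeated import cheap.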

Repo Import

Location: tools/repo_import/

Binary: fbcode//eden/mononoke/tools/repo_import:repo_import

The repo_import tool handles repository import with additional capabilities beyond basic blobimport or gitimport. It supports:

  • Cross-repository sync during import
  • Commit transformation and rewriting
  • Backsyncing operations
  • Large repository imports with specialized handling

This tool is used for complex import scenarios where simple blobimport or gitimport is insufficient.

Streaming Clone

Location: tools/streaming_clone/

Binary: fbcode//eden/mononoke/tools/streaming_clone:streaming_clone

The streaming_clone tool manages streaming clone chunks for Mercurial clients. Streaming clone allows clients to quickly obtain a repository's changelog by downloading pre-generated chunks rather than fetching individual changesets.

The tool provides subcommands to:

  • Create new streaming clone chunks from Mercurial revlog data
  • Update existing streaming clone data
  • Manage chunk storage in the blobstore

Streaming clone chunks are stored in Mononoke's mutable blobstore and served to compatible Mercurial clients.

Verification Tools

Aliasverify

Location: tools/aliasverify/

Binary: fbcode//eden/mononoke/tools/aliasverify:aliasverify

Aliasverify validates content-addressed aliases in the blobstore. File contents in Mononoke are identified by multiple hash algorithms (SHA-1, SHA-256, Blake2, Blake3, Git SHA-1). The tool verifies that:

  • Alias blobs exist for file contents
  • Aliases correctly point to the canonical content
  • Hash computations are accurate

The tool can operate in several modes:

  • Verify aliases for specific changesets
  • Process ranges of changesets
  • Run as a sharded service for large-scale verification

Aliasverify is used to detect and repair missing or corrupted alias data.
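
The three checks above can be sketched with Python's standard `hashlib`. This is an illustrative model only: the alias key layout, the use of a 32-byte Blake2b digest as the canonical content ID, and the in-memory `alias_store` dictionary are assumptions, and Blake3 is left out because it is not in the Python standard library. The Git SHA-1 follows Git's blob hashing rule (SHA-1 over a `blob <length>\0` header plus the content).

```python
import hashlib


def compute_aliases(content: bytes) -> dict[str, str]:
    """Compute the alias digests for a piece of file content."""
    git_header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return {
        "sha1": hashlib.sha1(content).hexdigest(),
        "sha256": hashlib.sha256(content).hexdigest(),
        "git_sha1": hashlib.sha1(git_header + content).hexdigest(),
    }


def canonical_id(content: bytes) -> str:
    # Assumption for this sketch: the canonical ID is a 32-byte Blake2b digest.
    return hashlib.blake2b(content, digest_size=32).hexdigest()


def alias_key(kind: str, digest: str) -> str:
    # Hypothetical key layout, chosen only to make the example readable.
    return f"alias.{kind}.{digest}"


def verify_aliases(content: bytes, alias_store: dict[str, str]) -> list[str]:
    """Return a list of problems: aliases that are missing or point elsewhere."""
    expected_target = canonical_id(content)
    problems = []
    for kind, digest in compute_aliases(content).items():
        key = alias_key(kind, digest)
        target = alias_store.get(key)
        if target is None:
            problems.append(f"missing {key}")
        elif target != expected_target:
            problems.append(f"wrong target for {key}")
    return problems
```

An empty result means every alias exists and resolves to the canonical content; each entry in a non-empty result names one alias that would need repair.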

Bonsai Verify

Location: tools/bonsai_verify/

Binary: fbcode//eden/mononoke/tools/bonsai_verify:bonsai_verify

Bonsai_verify validates the consistency between Bonsai changesets and derived Mercurial data. The tool:

  • Computes Mercurial manifests from Bonsai changesets
  • Compares computed manifests against stored manifests
  • Validates the Bonsai ↔ Mercurial conversion process
  • Checks manifest structure and file metadata

The tool can verify specific changesets, process ranges, or validate entire repository histories. It reports discrepancies between Bonsai and Mercurial representations, which indicate data corruption or conversion errors.
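
The comparison step can be illustrated with a small Python sketch that treats a manifest as a mapping from path to an opaque entry ID. That flat representation and the function name are assumptions for illustration; real manifests are tree-structured and carry more metadata.

```python
def diff_manifests(
    computed: dict[str, str],
    stored: dict[str, str],
) -> list[tuple[str, str]]:
    """Compare a recomputed manifest against the stored one.

    Returns (path, problem) pairs in path order; an empty list means
    the two manifests agree.
    """
    problems = []
    for path in sorted(computed.keys() | stored.keys()):
        if path not in stored:
            problems.append((path, "missing from stored manifest"))
        elif path not in computed:
            problems.append((path, "unexpected in stored manifest"))
        elif computed[path] != stored[path]:
            problems.append((path, "entry mismatch"))
    return problems
```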

Check Git Working Copy

Location: tools/check_git_wc/

Binary: fbcode//eden/mononoke/tools/check_git_wc:check_git_wc

This tool validates that a Bonsai changeset produces the same working copy as a Git commit. It:

  • Computes the working copy from a Bonsai changeset
  • Compares against a Git repository's working copy at a specific commit
  • Reports differences in file contents or tree structure
  • Optionally handles Git LFS pointer files

The tool is used to verify the correctness of Git ↔ Bonsai conversions and ensure that Mononoke produces working copies identical to native Git.

Maintenance Tools

Packer

Location: tools/packer/

Binary: fbcode//eden/mononoke/tools/packer:packer

The packer tool manages packblob operations. Packblob is a blobstore decorator that compresses multiple small blobs into larger packed blobs, reducing storage overhead and improving retrieval efficiency.

The tool accepts a set of blob keys and:

  • Fetches the blobs from storage
  • Compresses them using zstd with a specified compression level
  • Packs them into larger multi-blob containers
  • Uploads the packed blobs to the blobstore

The packer supports parallel packing operations and can run in dry-run mode to estimate compression benefits without modifying storage. See blobstore/packblob/README.md for details on the packblob format.
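
The benefit of packing similar blobs together, and the idea of a dry run that only measures it, can be sketched in Python. Two things here are deliberate stand-ins: `zlib` replaces zstd so the example needs only the standard library, and the length-prefixed framing is invented for this sketch rather than taken from the packblob format.

```python
import zlib


def pack_blobs(blobs: dict[str, bytes], level: int = 6) -> bytes:
    """Frame each (key, value) pair with 4-byte lengths, then compress the lot."""
    parts = []
    for key in sorted(blobs):
        key_bytes = key.encode("utf-8")
        value = blobs[key]
        parts.append(len(key_bytes).to_bytes(4, "big") + key_bytes)
        parts.append(len(value).to_bytes(4, "big") + value)
    return zlib.compress(b"".join(parts), level)


def unpack_blobs(packed: bytes) -> dict[str, bytes]:
    """Reverse pack_blobs: decompress and split the framed records."""
    raw = zlib.decompress(packed)
    blobs = {}
    offset = 0
    while offset < len(raw):
        key_len = int.from_bytes(raw[offset:offset + 4], "big")
        offset += 4
        key = raw[offset:offset + key_len].decode("utf-8")
        offset += key_len
        value_len = int.from_bytes(raw[offset:offset + 4], "big")
        offset += 4
        blobs[key] = raw[offset:offset + value_len]
        offset += value_len
    return blobs


def dry_run(blobs: dict[str, bytes], level: int = 6) -> tuple[int, int]:
    """Return (bytes if each blob is compressed alone, bytes if packed).

    Nothing is written anywhere; the caller only learns the size estimate.
    """
    separate = sum(len(zlib.compress(value, level)) for value in blobs.values())
    return separate, len(pack_blobs(blobs, level))
```

When the blobs share content, the packed size is typically much smaller than the sum of individually compressed sizes, because the compressor can reference data from earlier blobs in the same pack.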

SQLblob Garbage Collection

Location: tools/sqlblob_gc/

Binary: fbcode//eden/mononoke/tools/sqlblob_gc:sqlblob_gc

The sqlblob_gc tool performs garbage collection on SQLblob storage. SQLblob stores blob data in SQL databases (MySQL or SQLite), and over time, unreferenced blobs may accumulate.

The tool provides subcommands to:

  • Identify unreferenced blobs in SQLblob storage
  • Delete garbage blobs to reclaim space
  • Process multiple shards of a sharded SQLblob configuration
  • Run garbage collection with configurable concurrency

This tool is specifically for SQLblob storage and is not applicable to other blobstore types like Manifold or S3.
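
The identify-then-delete flow can be sketched against an in-memory SQLite database. The two-table schema below (`data` mapping blob keys to chunks, `chunk` holding payloads) is hypothetical and chosen for illustration; it should not be read as the actual SQLblob schema, and the real tool's options are not modeled.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical schema for this sketch only.
    conn.executescript(
        """
        CREATE TABLE data (blob_key TEXT PRIMARY KEY, chunk_id TEXT NOT NULL);
        CREATE TABLE chunk (chunk_id TEXT PRIMARY KEY, payload BLOB NOT NULL);
        """
    )


def find_unreferenced(conn: sqlite3.Connection) -> list[str]:
    """Return chunk IDs that no row in `data` points at."""
    rows = conn.execute(
        "SELECT chunk_id FROM chunk "
        "WHERE chunk_id NOT IN (SELECT chunk_id FROM data) "
        "ORDER BY chunk_id"
    ).fetchall()
    return [row[0] for row in rows]


def collect_garbage(conn: sqlite3.Connection, dry_run: bool = True) -> list[str]:
    """Report unreferenced chunks, deleting them unless dry_run is set."""
    garbage = find_unreferenced(conn)
    if not dry_run:
        conn.executemany(
            "DELETE FROM chunk WHERE chunk_id = ?",
            [(chunk_id,) for chunk_id in garbage],
        )
        conn.commit()
    return garbage
```

Defaulting `dry_run` to `True` is a design choice in this sketch: a destructive operation has to be requested explicitly.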

Backfill Mapping

Location: tools/backfill_mapping/

Binary: fbcode//eden/mononoke/tools/backfill_mapping:backfill_mapping

The backfill_mapping tool populates VCS mapping tables for commits that lack complete mappings. VCS mappings associate Bonsai changesets with their Git or SVN equivalents.

The tool reads a file containing Mercurial changeset IDs and:

  • Resolves them to Bonsai changesets
  • Computes the corresponding Git or SVN identifiers
  • Inserts the mappings into the appropriate mapping tables

This tool is used to repair incomplete mapping data or populate mappings after enabling Git or SVN support for an existing repository.
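
The read, resolve, and insert steps can be modeled in a few lines of Python. Everything named here is an assumption for illustration: the input is taken to be one 40-character hexadecimal changeset ID per line, and dictionaries stand in for the Mercurial mapping and the target mapping table.

```python
import re
from typing import Callable

HG_ID = re.compile(r"^[0-9a-f]{40}$")


def parse_hg_ids(text: str) -> list[str]:
    """Parse one changeset ID per line, skipping blank lines."""
    ids = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        if not HG_ID.match(line):
            raise ValueError(f"line {line_number}: not a 40-character hex ID")
        ids.append(line)
    return ids


def backfill(
    hg_ids: list[str],
    hg_to_bonsai: dict[str, str],
    compute_target: Callable[[str], str],
    mapping_table: dict[str, str],
) -> list[str]:
    """Fill in missing mappings; return the IDs that could not be resolved."""
    unresolved = []
    for hg_id in hg_ids:
        bonsai = hg_to_bonsai.get(hg_id)
        if bonsai is None:
            unresolved.append(hg_id)
            continue
        if bonsai not in mapping_table:  # leave existing mappings untouched
            mapping_table[bonsai] = compute_target(bonsai)
    return unresolved
```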

Data Operations

Executor

Location: tools/executor/

Binary: fbcode//eden/mononoke/tools/executor:executor

The executor tool provides a framework for running tasks across multiple repositories or shards. It handles:

  • Execution of operations across repository sets
  • Sharding and parallel execution
  • Task scheduling and coordination

The executor is used as a framework by other tools that need to operate on multiple repositories concurrently.
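
As a rough picture of sharded, parallel execution, the Python sketch below assigns each repository to a shard with a stable hash and runs a task over one shard's repositories using a thread pool. The hash-based assignment, the function names, and the thread pool are all assumptions for illustration and say nothing about how the executor actually schedules work.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def shard_for(repo_name: str, shard_count: int) -> int:
    """Map a repository name to a shard using a stable checksum."""
    return zlib.crc32(repo_name.encode("utf-8")) % shard_count


def run_on_shard(
    repos: list[str],
    shard: int,
    shard_count: int,
    task: Callable[[str], object],
    max_workers: int = 4,
) -> dict[str, object]:
    """Run `task` concurrently on every repository owned by `shard`."""
    owned = [repo for repo in repos if shard_for(repo, shard_count) == shard]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(task, owned))  # map preserves input order
    return dict(zip(owned, results))
```

Because the assignment depends only on the repository name and the shard count, every process computes the same ownership without coordination, and each repository is handled by exactly one shard.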

Import Tools

Location: tools/import/

The import directory contains shared libraries and utilities used by various import tools (blobimport, gitimport, repo_import). These are not standalone tools but provide common functionality for repository imports.

When to Use Which Tool

For routine operations: Use the admin tool. Most common administrative tasks have corresponding admin subcommands.

For importing repositories:

  • Mercurial repositories: Use blobimport
  • Git repositories: Use gitimport
  • Complex imports with transformations: Use repo_import

For data verification:

  • Content-addressed aliases: Use aliasverify
  • Bonsai ↔ Mercurial consistency: Use bonsai_verify
  • Git working copy validation: Use check_git_wc
  • Graph-wide validation: Use walker (see Jobs and Background Workers)

For storage maintenance:

  • Packblob operations: Use packer
  • SQLblob cleanup: Use sqlblob_gc
  • Storage durability: The blobstore_healer job handles this automatically

For debugging and testing: Use testtool in non-production environments, or use specific admin subcommands in production.
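
The guidance above amounts to a lookup table. A minimal Python sketch follows, with task labels that are paraphrases chosen for this example; only the tool and job names come from this document.

```python
# Task labels are informal paraphrases; tool names are those described above.
TOOL_FOR_TASK = {
    "routine operations": "admin",
    "import mercurial repository": "blobimport",
    "import git repository": "gitimport",
    "complex import with transformations": "repo_import",
    "verify content-addressed aliases": "aliasverify",
    "verify bonsai/mercurial consistency": "bonsai_verify",
    "validate git working copy": "check_git_wc",
    "graph-wide validation": "walker",  # a job, not a tool
    "packblob operations": "packer",
    "sqlblob cleanup": "sqlblob_gc",
    "non-production debugging": "testtool",
}


def recommend(task: str) -> str:
    """Return the suggested component, falling back to the admin tool."""
    return TOOL_FOR_TASK.get(task.strip().lower(), "admin")
```

Falling back to `admin` mirrors the recommendation that the admin tool is the default choice for most operational tasks.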

Tool Implementation Patterns

All Mononoke tools follow consistent implementation patterns:

Framework: Tools use cmdlib/mononoke_app/ as the standard application framework, providing:

  • Configuration loading
  • Argument parsing
  • Repository initialization
  • Monitoring and logging setup

Repository Access: Tools use the facet pattern to access repository capabilities. A tool declares which facets it requires, and the framework provides a composed repository object.

Execution: Tools are single-execution programs that perform their task and exit. Long-running operations should be implemented as jobs rather than tools.

Safety: Tools that modify repository state typically include dry-run modes, validation checks, and logging to prevent accidental data corruption.
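
The facet idea, where a tool declares what it needs and receives a repository object composed of exactly those parts, can be sketched in Python. Mononoke itself is written in Rust, so this is only an analogy: the facet names, the factory registry, and the `open_repo` function are hypothetical and do not reflect the real framework's API.

```python
from typing import Callable

# Hypothetical facet names and factories, invented for this sketch.
FACET_FACTORIES: dict[str, Callable[[str], object]] = {
    "bookmarks": lambda repo_name: {"repo": repo_name, "kind": "bookmarks"},
    "commit_graph": lambda repo_name: {"repo": repo_name, "kind": "commit_graph"},
    "repo_blobstore": lambda repo_name: {"repo": repo_name, "kind": "repo_blobstore"},
}


class ComposedRepo:
    """A repository object exposing only the facets the tool asked for."""

    def __init__(self, name: str, facets: dict[str, object]) -> None:
        self.name = name
        self._facets = facets

    def facet(self, facet_name: str) -> object:
        if facet_name not in self._facets:
            raise KeyError(f"facet {facet_name!r} was not declared by this tool")
        return self._facets[facet_name]


def open_repo(name: str, required_facets: list[str]) -> ComposedRepo:
    """Build a repository containing exactly the declared facets."""
    unknown = [f for f in required_facets if f not in FACET_FACTORIES]
    if unknown:
        raise ValueError(f"unknown facets: {unknown}")
    facets = {f: FACET_FACTORIES[f](name) for f in required_facets}
    return ComposedRepo(name, facets)
```

A tool that only inspects bookmarks would call `open_repo("example", ["bookmarks"])` and never pay the cost of constructing facets it does not use.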

Integration with Jobs

Several tools have corresponding background jobs that perform similar operations continuously:

  • The aliasverify tool verifies specific changesets on demand; continuous verification runs as a scheduled task
  • The blobstore_healer job performs continuous healing; tools can trigger one-off repairs
  • The walker job validates repositories continuously; the admin tool provides one-off validation commands

The distinction is that tools are invoked on-demand for specific operations, while jobs run continuously for ongoing maintenance.

Documentation and Help

All tools provide command-line help via --help flags. The admin tool's subcommands each have individual help text describing their options and usage.

Component-specific documentation:

  • Tool README files in component directories (when present)
  • Integration tests in tests/integration/ demonstrate tool usage
  • Admin subcommand source in tools/admin/src/commands/ shows available operations
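
For scripting, the help text can be captured programmatically. The Python sketch below assumes a tool binary is reachable under the name passed in (how the binary is built or installed is outside this example) and relies only on the `--help` flag described above; the subcommand shown in the usage note is one of those listed for the admin tool.

```python
import subprocess
from typing import Optional


def help_argv(binary: str, subcommand: Optional[str] = None) -> list[str]:
    """Build the argument vector that asks a tool (or subcommand) for help."""
    argv = [binary]
    if subcommand is not None:
        argv.append(subcommand)
    argv.append("--help")
    return argv


def show_help(binary: str, subcommand: Optional[str] = None) -> str:
    """Run the tool with --help and return whatever it prints to stdout."""
    completed = subprocess.run(
        help_argv(binary, subcommand),
        capture_output=True,
        text=True,
        check=True,  # raise if the tool exits with a non-zero status
    )
    return completed.stdout
```

For example, `help_argv("admin", "bookmarks")` produces `["admin", "bookmarks", "--help"]`, and `show_help` runs that command where the binary is available.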

Summary

Mononoke's tools provide operational capabilities for repository management, data import, verification, and maintenance. The admin tool serves as the primary administrative interface, offering a unified command-line API for most common tasks. Specialized tools handle specific operations like repository import, data verification, and storage maintenance.

Tools are distinct from servers (which handle client requests) and jobs (which run continuously). Understanding this distinction helps operators choose the appropriate tool for each task and understand how the components fit into Mononoke's overall architecture.

For background maintenance operations, see Jobs and Background Workers. For server components, see Servers and Services.