eden/mononoke/docs/gitimport_from_bundle_pipeline_guide.md
This guide explains how to use the Skycastle-based gitimport-from-bundle pipeline to import a Git repository into Mononoke from a Git bundle stored in Manifold. The pipeline automates:
All interaction is through the terminal/CLI.
You need the following on your Fedora machine (connected to Corp VPN):
fbsource checkout — An up-to-date checkout of the fbsource repository.
# If you don't have one, follow the standard devserver setup.
# If you already have one, pull the latest:
cd ~/fbsource
sl pull && sl goto master
Manifold CLI — Used to upload bundles to Manifold.
# Install if not already available
sudo dnf install fb-manifold
Verify it works:
manifold --help
Skycastle CLI — Comes with arc. Verify:
arc skycastle --help
VPN connection — You must be connected to the Corp VPN for all commands to work.
If you already have a .bundle or .pack file, skip to Step 2.
To create a bundle from an existing Git repository:
cd /path/to/your/git/repo
# Create a bundle containing all refs
git bundle create /tmp/my_repo.bundle --all
# Verify the bundle is valid
git bundle verify /tmp/my_repo.bundle
Upload the bundle file to a Manifold bucket so the pipeline can fetch it.
manifold --apikey <YOUR_API_KEY> put <BUCKET_NAME>/<KEY_PATH> /tmp/my_repo.bundle
Example:
manifold --apikey upstream_repo_bundles-key put upstream_repo_bundles/flat/my_project_repo /tmp/my_repo.bundle
Verify the upload succeeded:
manifold --apikey upstream_repo_bundles-key ls upstream_repo_bundles/flat/my_project_repo
You should see the file listed with its size.
Parameters explained:
<YOUR_API_KEY> — The Manifold API key for the bucket (e.g., upstream_repo_bundles-key). Ask your team if you don't know it.<BUCKET_NAME> — The Manifold bucket name (e.g., upstream_repo_bundles).<KEY_PATH> — The path/key within the bucket where the bundle will be stored (e.g., flat/my_project_repo).Navigate to your fbsource checkout and run:
cd ~/fbsource
arc skycastle schedule \
//tools/skycastle/workflows2/mononoke/mononoke_gitimport_from_bundle.sky:gitimport_from_bundle \
--flag repo_name="<REPO_NAME>" \
--flag manifold_key="<KEY_PATH>"
| Flag | Description | Example |
|---|---|---|
repo_name | The Mononoke repo name. Use <namespace>/<repo> format. | aosp/platform_build |
manifold_key | The key/path of the bundle within the bucket. | flat/platform_build |
| Flag | Default | Description |
|---|---|---|
manifold_bucket | upstream_repo_bundles | The Manifold bucket where the bundle is stored. |
manifold_apikey | upstream_repo_bundles-key | API key for Manifold access. |
oncall_name | rl_source_control | Oncall team name used during repo creation. |
include_refs | "" (all refs) | Comma-separated list of refs to import. Leave empty to import all. |
hipster_group | "" | Hipster group for ACL setup during repo creation. |
gitimport_concurrency | 200 | Number of concurrent gitimport operations. |
arc skycastle schedule \
//tools/skycastle/workflows2/mononoke/mononoke_gitimport_from_bundle.sky:gitimport_from_bundle \
--flag repo_name="aosp/platform_build" \
--flag manifold_key="flat/platform_build"
arc skycastle schedule \
//tools/skycastle/workflows2/mononoke/mononoke_gitimport_from_bundle.sky:gitimport_from_bundle \
--flag repo_name="aosp/platform_build" \
--flag manifold_bucket="upstream_repo_bundles" \
--flag manifold_key="flat/platform_build" \
--flag manifold_apikey="upstream_repo_bundles-key" \
--flag oncall_name="rl_source_control"
On success, you will see output like:
Success!
Duration: 10 secs
Invocation ID: BnlKsfoe
Workflow run ID: 3422735716815416755
Workflow run URL: https://www.internalfb.com/sandcastle/workflow/3422735716815416755
Execution Job(s): https://www.internalfb.com/intern/sandcastle/job/58546795361304427/
Save the Workflow run URL — you will need it to check the status.
# Check the status of a workflow run by ID
arc skycastle status <WORKFLOW_RUN_ID>
Example:
arc skycastle status 3422735716815416755
Open the Workflow run URL from the scheduling output in your browser:
https://www.internalfb.com/sandcastle/workflow/<WORKFLOW_RUN_ID>
This shows all 6 pipeline steps with their status (pending, running, passed, failed). Click on any step to see its logs.
Open the Execution Job URL from the scheduling output:
https://www.internalfb.com/intern/sandcastle/job/<JOB_ID>/
This provides detailed logs for each action in the pipeline.
The pipeline runs 6 sequential steps:
| Step | Name | What It Does |
|---|---|---|
| 1 | Check or Create Repo | Checks if the repo exists in Mononoke. If not, creates it via SCS thrift and polls for up to 45 minutes for the creation to complete. |
| 2 | Clone from Mononoke | Clones the repo from Mononoke's Git server. If the repo is empty (newly created), initializes a bare repo instead. |
| 3 | Fetch Bundle | Downloads the Git bundle from Manifold to the worker machine. |
| 4 | Unbundle and Gitimport | Unpacks the bundle into the cloned repo and runs gitimport to import all objects and refs into Mononoke. Retries up to 45 times with 60-second intervals if gitimport fails (e.g., waiting for config propagation). |
| 5 | Bust Git Cache | Pushes and deletes a temporary branch to force the Mononoke Git server to refresh its cache. |
| 6 | Verify Import | Compares refs from the bundle with refs on the Mononoke Git server to confirm everything was imported correctly. |
By default, repo creation uses the rl_source_control oncall. If you need a different oncall, override it with --flag oncall_name="<your_oncall>".
The pipeline polls for up to 45 minutes for the repo creation mutation to land. If it times out:
create_repos_poll responses to see the status codes.Gitimport retries up to 45 times with 60-second intervals (total ~45 minutes). Common causes:
--flag gitimport_concurrency=400 or adjust as needed.The final step compares bundle refs with Mononoke Git server refs. If they don't match:
Extra refs in Mononoke: The repo may have had pre-existing refs. This is usually fine.
Missing refs in Mononoke: Gitimport may have skipped some refs. Check the gitimport logs for errors.
Cache not fully refreshed: The cache-busting step may not have fully propagated. Wait a few minutes and manually check with:
git ls-remote https://git.internal.tfbnw.net/repos/git/ro/<REPO_NAME>.git
manifold --apikey <KEY> ls <BUCKET>/<PATH>git bundle verify /path/to/bundlegit.internal.tfbnw.net.Ensure arc is installed and in your PATH:
# On Fedora, install arc tools
sudo dnf install fb-arcanist
All commands require Corp VPN. Verify connectivity:
# Test access to internal services
curl -s -o /dev/null -w "%{http_code}" https://git.internal.tfbnw.net
If you get connection errors, reconnect your VPN and retry.
The pipeline is configured with 3 automatic infra-level retries. If a failure is due to infrastructure issues (machine problems, network blips), it will automatically retry the entire workflow. User failures (bad input, missing permissions) will not be retried.