.agents/skills/hybrid-cloud-outboxes/references/backfill.md
When a model is migrated to use outboxes (or its replication logic changes), existing rows need outboxes created retroactively. The backfill system handles this incrementally, processing rows in batches with cursor position tracked in Redis and version gating controlled by the sentry options system.
Source file: src/sentry/hybridcloud/tasks/backfill_outboxes.py
replication_version Mechanism
Every CellOutboxProducingModel and ControlOutboxProducingModel has a class variable:
replication_version: int = 1 # Default
Two systems work together to control backfills:
- The sentry options system (controls which version a backfill targets)
- A Redis cursor (lower_bound_id, current_version) (controls where a backfill resumes)

find_replication_version() determines the effective target version:
def find_replication_version(model, force_synchronous=False) -> int:
    coded_version = model.replication_version
    if force_synchronous:
        return coded_version
    model_key = f"outbox_replication.{model._meta.db_table}.replication_version"
    return min(options.get(model_key), coded_version)
The effective version is min(option_value, coded_version). This means:
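- If the option is lower than the coded version, the option wins and holds the backfill back
- The option cannot raise the effective version above the coded version
- With force_synchronous=True (self-hosted), the option is bypassed entirely

Redis tracks (lower_bound_id, current_version) per model table: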
# Key format:
f"outbox_backfill.{model._meta.db_table}"
# Value: JSON-encoded tuple of (lower_bound_id, current_version)
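As a rough illustration of the cursor storage described above, the sketch below round-trips a cursor through JSON, with a plain dict standing in for Redis. The helper names cursor_key, save_cursor and load_cursor are hypothetical and are not the functions in backfill_outboxes.py; the (0, 1) default mirrors the initial state this document describes.

```python
import json

# A plain dict stands in for Redis here (assumption, for illustration only).
fake_redis: dict[str, str] = {}


def cursor_key(db_table: str) -> str:
    """Build the per-table cursor key in the documented format."""
    return f"outbox_backfill.{db_table}"


def save_cursor(db_table: str, lower_bound_id: int, version: int) -> None:
    """Store the cursor as a JSON-encoded (lower_bound_id, current_version) pair."""
    fake_redis[cursor_key(db_table)] = json.dumps((lower_bound_id, version))


def load_cursor(db_table: str) -> tuple[int, int]:
    """Read the cursor back, defaulting to (0, 1) when nothing is stored yet."""
    raw = fake_redis.get(cursor_key(db_table))
    if raw is None:
        return (0, 1)
    lower_bound_id, version = json.loads(raw)
    return (lower_bound_id, version)


save_cursor("sentry_organizationmember", 5001, 2)
print(load_cursor("sentry_organizationmember"))  # stored cursor
print(load_cursor("sentry_mymodel"))  # no cursor stored, falls back to (0, 1)
```

JSON has no tuple type, so the pair is stored as a two-element array and unpacked on the way back.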
_chunk_processing_batch() compares the Redis cursor's version against the options-resolved target_version:
- version > target_version: backfill already complete, skip
- version < target_version: new version detected, reset cursor to 0 and start fresh
- version == target_version: continue from where we left off

To trigger a backfill, bump replication_version on the model class:
class MyModel(ReplicatedCellModel):
    replication_version = 2  # Was 1; bumping triggers backfill
The option key format is:
f"outbox_replication.{model._meta.db_table}.replication_version"
# Example for OrganizationMember:
"outbox_replication.sentry_organizationmember.replication_version"
Rollout procedure:
1. Deploy the code that bumps replication_version
2. min(option_value, coded_version) still returns the old version, so no backfill runs yet
3. Raise the option to the new version
4. min(option_value, coded_version) now returns the new version, and the backfill starts on the next enqueue_outbox_jobs cycle

This two-step process allows deploying code first and enabling the backfill separately, which is useful for coordinating with other changes or for rolling back quickly by lowering the option.
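The rollout steps can be traced with a small simulation of the min() gating. This is a sketch only: fake_options is a hypothetical dict standing in for the sentry options store, and effective_version is an illustrative reimplementation of the documented rule, not the real find_replication_version.

```python
# Hypothetical stand-in for the sentry options store (illustration only).
fake_options: dict[str, int] = {}


def option_key(db_table: str) -> str:
    """Build the option key in the documented format."""
    return f"outbox_replication.{db_table}.replication_version"


def effective_version(db_table: str, coded_version: int, force_synchronous: bool = False) -> int:
    """Apply the documented rule: min(option_value, coded_version)."""
    if force_synchronous:
        # Self-hosted path: the option is ignored.
        return coded_version
    return min(fake_options[option_key(db_table)], coded_version)


table = "sentry_mymodel"
fake_options[option_key(table)] = 1

before_deploy = effective_version(table, coded_version=1)
after_deploy = effective_version(table, coded_version=2)  # code bumped, option unchanged
fake_options[option_key(table)] = 2
after_option_raise = effective_version(table, coded_version=2)  # backfill enabled
fake_options[option_key(table)] = 1
after_rollback = effective_version(table, coded_version=2)  # option lowered again
self_hosted = effective_version(table, coded_version=2, force_synchronous=True)

print(before_deploy, after_deploy, after_option_raise, after_rollback, self_hosted)
```

Only the third call and the force_synchronous call resolve to version 2, which matches the idea that deploying code alone does not start a backfill.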
On self-hosted instances, backfills run synchronously during sentry upgrade via the run_outbox_replications_for_self_hosted function (connected to the post_upgrade signal). This function:
- Calls backfill_outboxes_for(force_synchronous=True), which bypasses options and uses model.replication_version directly

The Redis cursor passes through these states:
- (0, 1): no backfill has run (created on first get_processing_state call)
- (last_processed_id + 1, target_version): backfill is processing rows
- (0, replication_version + 1): all rows processed, version advanced past target
- When a new target version is detected, the cursor resets to (0, new_target_version) and starts from the beginning

OUTBOX_BACKFILLS_PER_MINUTE = 10_000
Each batch (via process_outbox_backfill_batch):
1. Calls _chunk_processing_batch to determine the ID range (low, up) for this batch
2. Iterates model.objects.filter(id__gte=low, id__lte=up):
   - Cell models: inst.outbox_for_update().save() inside outbox_context(flush=False)
   - Control models: saves each outbox from inst.outboxes_for_update() inside outbox_context(flush=False)
3. If there are no more rows, sets the cursor to (0, replication_version + 1) (marks complete)
4. Otherwise, advances the cursor to (up + 1, version)

Rate is limited by OUTBOX_BACKFILLS_PER_MINUTE adjusted by the count of already-scheduled outboxes. The backfill_outboxes_for function iterates all registered models and processes batches until the rate limit is reached.
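The batch walk and cursor transitions can be modelled in memory to make the sequence concrete. The sketch below is a simplification under stated assumptions: next_cursor and remaining_budget are hypothetical helpers, rows are plain integer IDs rather than model instances, and subtracting the scheduled count from OUTBOX_BACKFILLS_PER_MINUTE is one plausible reading of "adjusted by", not a confirmed detail.

```python
OUTBOX_BACKFILLS_PER_MINUTE = 10_000


def remaining_budget(scheduled_count: int) -> int:
    """Assumed adjustment: subtract already-scheduled outboxes from the per-minute cap."""
    return max(0, OUTBOX_BACKFILLS_PER_MINUTE - scheduled_count)


def next_cursor(cursor, target_version, replication_version, row_ids, batch_size):
    """Return (ids_in_batch, new_cursor) for one simulated batch."""
    lower_bound, version = cursor
    if version > target_version:
        return [], cursor  # backfill already complete, skip
    if version < target_version:
        lower_bound, version = 0, target_version  # new version detected, start fresh
    batch = sorted(i for i in row_ids if i >= lower_bound)[:batch_size]
    if not batch:
        return [], (0, replication_version + 1)  # no rows left, mark complete
    return batch, (batch[-1] + 1, version)  # resume after the last processed ID


rows = [1, 2, 3, 4, 5]
cursor = (0, 1)  # initial state: no backfill has run
history = []
for _ in range(5):
    batch, cursor = next_cursor(
        cursor, target_version=2, replication_version=2, row_ids=rows, batch_size=2
    )
    history.append((batch, cursor))

for step in history:
    print(step)
print(remaining_budget(scheduled_count=9_500))
```

The fourth iteration finds no rows and moves the cursor to (0, 3); the fifth sees a version above the target and leaves it untouched.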
from sentry.hybridcloud.tasks.backfill_outboxes import get_processing_state
lower_bound, version = get_processing_state("sentry_mymodel")
# lower_bound > 0 means backfill is in progress
# version == model.replication_version + 1 means backfill is complete
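The two comments above can be folded into a small classifier when inspecting many tables at once. describe_cursor is a hypothetical helper written for this document, not part of the sentry codebase; it takes the values returned by get_processing_state plus the model's replication_version.

```python
def describe_cursor(lower_bound: int, version: int, replication_version: int) -> str:
    """Classify a backfill cursor using the rules documented above."""
    if version > replication_version:
        # The documented completed value is replication_version + 1.
        return "complete"
    if lower_bound > 0:
        return "in progress"
    # Either never started, or just reset for a new target version.
    return "not started"


print(describe_cursor(0, 1, replication_version=1))
print(describe_cursor(5001, 2, replication_version=2))
print(describe_cursor(0, 3, replication_version=2))
```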
from sentry import options
# See what version the option is gating to:
options.get("outbox_replication.sentry_mymodel.replication_version")
-- Cell outboxes for a specific category
SELECT count(*) FROM sentry_regionoutbox
WHERE category = <category_value>;
-- Top shards by depth
SELECT shard_scope, shard_identifier, count(*) as depth
FROM sentry_regionoutbox
GROUP BY shard_scope, shard_identifier
ORDER BY depth DESC
LIMIT 10;
- backfill_outboxes.low_bound: gauge of the current cursor position per table
- backfill_outboxes.backfilled: counter of rows backfilled per cycle
- outbox.saved: counter incremented each time an outbox is saved
- outbox.processed: counter incremented each time a coalesced outbox is processed
- outbox.processing_lag: histogram of time from outbox creation to processing