docs/replica-movement/schema-guard.md
Keep a shard's two copies identical while one is being moved to another node. This guard stops schema changes that would secretly make the source and target differ — changes the copy mechanism cannot undo.
To scale out (for example, non-HA → HA), Weaviate copies a shard to a new node by sending the shard's files and a change log.
The change log only carries object writes — add, update, delete of your data. It does not carry structural changes to the shard's on-disk shape: turning on vector compression, adding a named vector, deleting an index, freezing a tenant.
If a structural change happens during a move, the source changes shape but the target keeps the old shape. There is no way to reconcile them afterwards. The two replicas now answer queries differently.
Canonical failure: enable vector compression (PQ/BQ/SQ/RQ) mid-copy. The source compresses its vectors; the target stays uncompressed. One replica serves compressed results, the other uncompressed — with no path to converge.
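The divergence can be reproduced in a toy model. The sketch below is illustrative only: `Shard`, `replay`, and the tuple-based log format are hypothetical names chosen for this example, not Weaviate's types. It copies a shard, applies one object write and one structural change on the source, then replays the log on the target.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Toy shard: object data plus one structural property (hypothetical model)."""
    objects: dict = field(default_factory=dict)
    compressed: bool = False  # stands in for the on-disk shape


def replay(target: Shard, change_log: list) -> None:
    """Apply logged object writes; the log has no entry type for structure."""
    for op, key, value in change_log:
        if op == "put":
            target.objects[key] = value
        elif op == "delete":
            target.objects.pop(key, None)


source = Shard(objects={"a": 1})
# Step 1: the file copy gives the target the source's current state.
target = Shard(objects=dict(source.objects), compressed=source.compressed)

# Step 2: changes arrive on the source while the move is still running.
change_log = []
source.objects["b"] = 2
change_log.append(("put", "b", 2))  # object write: logged
source.compressed = True            # structural change: nothing to log

replay(target, change_log)

print(source.objects == target.objects)        # True: object data converges
print(source.compressed == target.compressed)  # False: on-disk shape does not
```

Replaying the log any number of times cannot repair the second comparison, which is why the guard prevents the structural change instead of trying to reconcile it later.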
The guard works in both directions, plus one special case that has no user to say "no" to.

    Copy a shard:  source ──── files ────▶ target
                          ──── change log (object writes only) ────▶
                          structural changes are NOT in the log
                                          │
              ┌───────────────────────────┴───────────────────────────┐
              │                         GUARD                         │
              └───────────────────────────┬───────────────────────────┘
                                          │
    ┌─────────────────────┐  ┌────────────────────────┐  ┌──────────────────────┐
    │ FORWARD             │  │ REVERSE                │  │ DEFER                │
    │ schema change while │  │ move starts while a    │  │ automatic flat→HNSW  │
    │ a move is running   │  │ structural op is       │  │ upgrade (no user)    │
    │                     │  │ already running        │  │                      │
    │ → REJECT            │  │ → WAIT, then proceed   │  │ → POSTPONE           │
    │ clear error,        │  │ never silently         │  │ retry next tick      │
    │ retry after move    │  │ cancelled              │  │ after move ends      │
    └─────────────────────┘  └────────────────────────┘  └──────────────────────┘

The dynamic flat → HNSW index upgrade fires automatically with no user behind it, so it cannot be "rejected." It is postponed while a move is active and retried on the next scheduler tick.

Blocking is scoped to the affected collection; other collections are untouched.
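The reject and postpone behaviors, and the per-collection scoping, can be sketched as follows. This is a minimal model under stated assumptions: `SchemaGuard`, `MoveInProgressError`, and every method name are hypothetical, and a real guard would also need locking and cluster-wide state.

```python
class MoveInProgressError(Exception):
    """Raised when a structural change targets a collection with an active move."""


class SchemaGuard:
    """Tracks active moves per collection (hypothetical sketch, not thread-safe)."""

    def __init__(self):
        self._active_moves = {}  # collection name -> number of running moves

    def move_started(self, collection: str) -> None:
        self._active_moves[collection] = self._active_moves.get(collection, 0) + 1

    def move_finished(self, collection: str) -> None:
        remaining = self._active_moves.get(collection, 0) - 1
        if remaining > 0:
            self._active_moves[collection] = remaining
        else:
            self._active_moves.pop(collection, None)

    def is_moving(self, collection: str) -> bool:
        return collection in self._active_moves

    def check_structural_change(self, collection: str) -> None:
        """Forward guard: reject a user-requested change with a clear error."""
        if self.is_moving(collection):
            raise MoveInProgressError(
                f"collection {collection!r} has a replica move in progress; "
                "retry after the move completes"
            )

    def try_auto_upgrade(self, collection: str, upgrade) -> bool:
        """Defer: skip the automatic upgrade on this tick instead of failing."""
        if self.is_moving(collection):
            return False  # the scheduler simply calls again on its next tick
        upgrade()
        return True


guard = SchemaGuard()
guard.move_started("Articles")

try:
    guard.check_structural_change("Articles")
except MoveInProgressError as err:
    print("rejected:", err)

guard.check_structural_change("Products")  # different collection: allowed

upgraded = []
first_tick = guard.try_auto_upgrade("Articles", lambda: upgraded.append("Articles"))
guard.move_finished("Articles")
next_tick = guard.try_auto_upgrade("Articles", lambda: upgraded.append("Articles"))
print(first_tick, next_tick, upgraded)  # False True ['Articles']
```

Returning `False` rather than raising keeps the automatic path quiet: there is no caller who could act on an error, so the only useful outcome is to try again later.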
| Operation | During a move |
|---|---|
| Enable / change vector compression (PQ/BQ/SQ/RQ) | Blocked |
| Add or remove a named vector | Blocked |
| Change vector index type or distance | Blocked |
| Disable a property index (filterable/searchable/rangeable) | Blocked |
| Tenant FREEZE / UNFREEZE / HOT→COLD | Blocked |
| Dynamic flat → HNSW auto-upgrade | Deferred |
| Start a move while compression is running on the source | Waits |
| Start a move while a flat → HNSW upgrade is running | Waits |
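One way to decide whether an update hits a "Blocked" row is to diff the old and new collection config and report only the structural fields. The sketch below assumes a flattened config with hypothetical key names (`compression`, `named_vectors`, `vector_index_type`, `distance`, `property_indexes`); real collection configs are nested and named differently.

```python
# Hypothetical flat config keys; real collection configs are nested.
STRUCTURAL_KEYS = {"compression", "named_vectors", "vector_index_type", "distance"}


def blocked_changes(old: dict, new: dict) -> list:
    """Return the structural differences between two config versions."""
    blocked = sorted(k for k in STRUCTURAL_KEYS if old.get(k) != new.get(k))

    # Only disabling a property index is blocked; a key missing from the
    # update is treated as unchanged (an assumption of this sketch).
    old_idx = old.get("property_indexes", {})
    new_idx = new.get("property_indexes", {})
    for name in sorted(old_idx):
        if old_idx[name] and not new_idx.get(name, True):
            blocked.append(f"property_indexes.{name}")
    return blocked


old = {
    "compression": None,
    "distance": "cosine",
    "ef": 100,
    "property_indexes": {"title.filterable": True},
}

tuning_only = {**old, "ef": 256}
structural = {
    **old,
    "compression": "pq",
    "property_indexes": {"title.filterable": False},
}

print(blocked_changes(old, tuning_only))  # []
print(blocked_changes(old, structural))
# ['compression', 'property_indexes.title.filterable']
```

An empty result means the update can pass through even while a move is active; a non-empty result gives the guard the field names to put in its error message.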
These make no on-disk structural change, so they are safe during a move:
- `ef`, `efMin`, `efMax`, `efFactor`, `flatSearchCutoff`
- `k1` / `b`, `stopwords`
- `autoTenantCreation` / `autoTenantActivation`

The checks on the schema update paths (`UpdateClass`, `UpdateProperty`, `UpdateTenants`) reject only structurally dangerous changes; safe in-memory changes pass through.

Two in-progress flags, `compressing` (HNSW) and `upgrading` (dynamic), mark exactly the window where the on-disk work runs. The transfer gate refuses to snapshot the shard while either is set.

While building this guard we found a separate, pre-existing bug: a `schemaOnly` replay of an "enable compression" change updates the schema but never compresses the on-disk index, so schema and disk can disagree permanently. This change does not make it worse. It is pinned with a failing test and tracked for a follow-up fix.
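The in-progress flags and the transfer gate described above can be modeled with a condition variable: the structural operation sets a flag for the duration of its on-disk work, and the move waits at the gate until both flags are clear. `ShardFlags`, `set_flag`, and `wait_until_idle` are hypothetical names, and the timeout is an assumption of this sketch rather than documented behavior.

```python
import threading


class ShardFlags:
    """In-progress flags for one shard (hypothetical sketch)."""

    _NAMES = ("compressing", "upgrading")

    def __init__(self):
        self._cond = threading.Condition()
        self.compressing = False  # HNSW compression is rewriting the index
        self.upgrading = False    # dynamic index is converting flat -> HNSW

    def set_flag(self, name: str, value: bool) -> None:
        if name not in self._NAMES:
            raise ValueError(f"unknown flag: {name}")
        with self._cond:
            setattr(self, name, value)
            self._cond.notify_all()  # wake any transfer waiting at the gate

    def wait_until_idle(self, timeout: float) -> bool:
        """Transfer gate: block until neither flag is set, or time out."""
        with self._cond:
            return self._cond.wait_for(
                lambda: not (self.compressing or self.upgrading), timeout
            )


flags = ShardFlags()
flags.set_flag("compressing", True)
closed = flags.wait_until_idle(timeout=0.05)  # compression still running

# Simulate the compression finishing on another thread shortly afterwards.
timer = threading.Timer(0.05, flags.set_flag, args=("compressing", False))
timer.start()
opened = flags.wait_until_idle(timeout=2.0)
timer.join()

print(closed, opened)  # False True
```

The structural operation is never interrupted here; only the move waits, which matches the "wait, then proceed" behavior of the reverse direction.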