design/validating_restored_data_using_one_cluster.md
Author: Neethu Haneeshi Bingi
The goal is to verify data restored from a backup against the original source data, to ensure the end-to-end correctness of the backup and restore flow. This validation is currently performed in simulation, but it is limited in scale and differs from the production backup environment. This new validation flow should:
Store both source and restored data within the same cluster to eliminate the need for a separate validation cluster. Every key-value pair will be compared directly, and exact key/value corruptions or mismatches will be reported.
By doing this, we can validate data reliability in this path and also check the restore speed for any regressions.
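The direct key-by-key comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual audit implementation: the cluster is modeled as an in-memory dictionary, and the function name `compare_restored`, the mismatch labels, and the prefix value are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (kind, key, source_value, restored_value); the kind labels are hypothetical.
Mismatch = Tuple[str, bytes, Optional[bytes], Optional[bytes]]


def compare_restored(
    store: Dict[bytes, bytes], begin: bytes, end: bytes, prefix: bytes
) -> List[Mismatch]:
    """Compare source keys in [begin, end) with their prefixed restored copies.

    Assumes the prefixed keyspace does not overlap [begin, end).
    """
    # Source data: keys that fall directly inside the requested range.
    source = {k: v for k, v in store.items() if begin <= k < end}
    # Restored data: strip the prefix, then apply the same range filter.
    restored = {
        k[len(prefix):]: v
        for k, v in store.items()
        if k.startswith(prefix) and begin <= k[len(prefix):] < end
    }

    mismatches: List[Mismatch] = []
    for key in sorted(source.keys() | restored.keys()):
        src, rst = source.get(key), restored.get(key)
        if src is None:
            mismatches.append(("unexpected_in_restore", key, None, rst))
        elif rst is None:
            mismatches.append(("missing_in_restore", key, src, None))
        elif src != rst:
            mismatches.append(("value_mismatch", key, src, rst))
    return mismatches


if __name__ == "__main__":
    p = b"zz/validate/"  # hypothetical validation prefix
    data = {
        b"apple": b"1",
        b"banana": b"2",
        b"cherry": b"3",
        p + b"apple": b"1",
        p + b"banana": b"X",
    }
    for m in compare_restored(data, b"a", b"n", p):
        print(m)
```

Reporting the key together with both values is one way to surface the exact corruption rather than only a pass/fail result.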
More detailed steps followed by the diagram
[Image: Screenshot 2025-11-12 at 7.53.04 PM.png]
The validation consists of three main steps (Backup, Restore, and Compare), followed by a Cleanup phase.
addPrefix parameter to restore into a validation keyspace. When restoring with a prefix, the restore destination empty check is automatically bypassed.
audit_storage validate_restore [BeginKey] [EndKey] (BeginKey and EndKey should be within userKeyRange)
get_audit_status validate_restore progress [AuditID]
Note:
Alternative Design Considerations
addPrefix.size() > 0 (indicating a validation restore to a prefixed keyspace). For regular restores without a prefix, the check remains enforced to prevent accidental data loss. Note that all restores (validation or regular) will clear and overwrite any existing data in the destination range. This is standard restore behavior.
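The bypass rule described above can be summarized in a short sketch. It only illustrates the decision logic as stated in this document; the function name `restore_may_proceed` and its parameters are hypothetical and do not correspond to the actual restore code.

```python
def restore_may_proceed(add_prefix: bytes, destination_is_empty: bool) -> bool:
    """Decide whether a restore passes the destination-empty check.

    Hypothetical helper: mirrors the rule that the check is bypassed
    only when a non-empty addPrefix is supplied.
    """
    if len(add_prefix) > 0:
        # Validation restore into a prefixed keyspace: check is bypassed.
        return True
    # Regular restore: the destination must be empty.
    return destination_is_empty
```

Keying the bypass on the prefix alone means a regular restore cannot skip the check by accident.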