docs/sync-and-op-log/long-term-plans/hybrid-manifest-architecture.md
Status: Completed (December 2025)
Context: Optimizing WebDAV/Dropbox sync for the Operation Log architecture. Related: Operation Log Architecture
Implementation Note: This architecture is fully implemented in OperationLogManifestService, OperationLogUploadService, and OperationLogDownloadService. The embedded operations buffer, overflow file creation, and snapshot support are all operational.
The current OperationLogSyncService fallback for file-based providers (WebDAV, Dropbox) is inefficient for frequent, small updates.
Current Workflow (Naive Fallback):
1. Upload the pending operations to ops/ops_CLIENT_TIMESTAMP.json.
2. Read ops/manifest.json to get current list.
3. Write ops/manifest.json with the new filename added.

Issues:
Instead of treating the manifest solely as an index of files, we treat it as a buffer for recent operations.
- Recent operations are embedded directly in manifest.json.
- External files (ops_*.json) are only created when the manifest buffer fills up.

Updated Manifest:
interface HybridManifest {
version: 2;
// The baseline state (snapshot). If present, clients load this first.
lastSnapshot?: SnapshotReference;
// Ops stored directly in the manifest (The Buffer)
// Limit: ~50 ops or 100KB payload size
embeddedOperations: EmbeddedOperation[];
// References to external operation files (The Overflow)
// Older ops that were flushed out of the buffer
operationFiles: OperationFileReference[];
// Merged vector clock from all embedded operations
// Used for quick conflict detection without parsing all ops
frontierClock: VectorClock;
// Last modification timestamp (for ETag-like cache invalidation)
lastModified: number;
}
interface SnapshotReference {
fileName: string; // e.g. "snapshots/snap_1701234567890.json"
schemaVersion: number; // Schema version of the snapshot
vectorClock: VectorClock; // Clock state at snapshot time
timestamp: number; // When snapshot was created
}
interface OperationFileReference {
fileName: string; // e.g. "ops/overflow_1701234567890.json"
opCount: number; // Number of operations in file (for progress estimation)
minSeq: number; // First operation's logical sequence in this file
maxSeq: number; // Last operation's logical sequence
}
// Embedded operations are lightweight - full Operation minus redundant fields
interface EmbeddedOperation {
id: string;
actionType: string;
opType: OpType;
entityType: EntityType;
entityId?: string;
entityIds?: string[];
payload: unknown;
clientId: string;
vectorClock: VectorClock;
timestamp: number;
schemaVersion: number;
}
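The `frontierClock` field is described above as the merged vector clock of the embedded operations. A minimal sketch of such a merge follows. It assumes that `VectorClock` is a plain map from client ID to counter (the document does not define the type), and `mergeVectorClocks` is a hypothetical helper name, not necessarily the one used in the codebase.

```typescript
// Assumption: a vector clock is a plain map of clientId -> counter.
type VectorClock = Record<string, number>;

// Hypothetical helper: component-wise maximum of any number of clocks.
function mergeVectorClocks(...clocks: VectorClock[]): VectorClock {
  const merged: VectorClock = {};
  for (const clock of clocks) {
    for (const clientId of Object.keys(clock)) {
      merged[clientId] = Math.max(merged[clientId] ?? 0, clock[clientId] ?? 0);
    }
  }
  return merged;
}

// Example: clocks of two embedded operations from different clients.
const frontier = mergeVectorClocks({ A: 3, B: 1 }, { A: 2, B: 4, C: 1 });
console.log(frontier); // { A: 3, B: 4, C: 1 }
```

Because the merge takes the maximum per client, the result does not depend on the order in which operations are folded in.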
Snapshot File Format:
interface SnapshotFile {
version: 1;
schemaVersion: number; // App schema version
vectorClock: VectorClock; // Merged clock at snapshot time
timestamp: number;
data: AppDataComplete; // Full application state
checksum?: string; // Optional SHA-256 for integrity verification
}
When a client has local pending operations to sync:
┌─────────────────────────────────────────────────────────────────┐
│ Upload Flow │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────┐
│ 1. Download manifest.json │
└───────────────────────────────┘
│
▼
┌───────────────────────────────┐
│ 2. Detect remote changes │
│ (compare frontierClock) │
└───────────────────────────────┘
│
┌───────────────┴───────────────┐
▼ ▼
Remote has new ops? No remote changes
│ │
▼ │
Download & apply first ◄───────┘
│
▼
┌───────────────────────────────┐
│ 3. Check buffer capacity │
│ embedded.length + pending │
└───────────────────────────────┘
│
┌───────────────┴───────────────┐
▼ ▼
< BUFFER_LIMIT (50) >= BUFFER_LIMIT
│ │
▼ ▼
Append to embedded Flush embedded to file
│ + add pending to empty buffer
│ │
└───────────────┬───────────────┘
▼
┌───────────────────────────────┐
│ 4. Check snapshot trigger │
│ (operationFiles > 50 OR │
│ total ops > 5000) │
└───────────────────────────────┘
│
┌───────────────┴───────────────┐
▼ ▼
Trigger snapshot No snapshot needed
│ │
└───────────────┬───────────────┘
▼
┌───────────────────────────────┐
│ 5. Upload manifest.json │
└───────────────────────────────┘
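The buffer-versus-overflow branch in step 3 of the diagram can be sketched as a pure function. This is an illustration only: `planUpload`, the simplified `Op` and `Manifest` shapes, and passing the timestamp in as `now` are assumptions rather than the actual service code, and the 100KB payload check is omitted for brevity.

```typescript
// Simplified, hypothetical shapes; the real interfaces carry more fields.
interface Op { id: string }
interface FileRef { fileName: string; opCount: number }
interface Manifest { embeddedOperations: Op[]; operationFiles: FileRef[] }

const BUFFER_LIMIT = 50;

interface UploadPlan {
  manifest: Manifest; // manifest to upload as the final step
  overflow?: { fileName: string; ops: Op[] }; // file to upload first, if any
}

function planUpload(manifest: Manifest, pending: Op[], now: number): UploadPlan {
  const embedded = manifest.embeddedOperations;
  if (embedded.length + pending.length < BUFFER_LIMIT) {
    // Fast path: everything fits in the manifest buffer.
    return { manifest: { ...manifest, embeddedOperations: [...embedded, ...pending] } };
  }
  // Overflow: flush the current buffer to a file, then start a fresh buffer.
  const fileName = `ops/overflow_${now}.json`;
  return {
    overflow: { fileName, ops: embedded },
    manifest: {
      embeddedOperations: pending,
      operationFiles: [...manifest.operationFiles, { fileName, opCount: embedded.length }],
    },
  };
}
```

The sketch follows the diagram literally. It does not split a pending batch that is itself larger than the buffer, and it leaves `frontierClock` maintenance out.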
Detailed Steps:
1. Download manifest.json (or create empty v2 manifest if not found).
2. Compare manifest.frontierClock with local lastSyncedClock. If the remote has new ops, download and apply them first.
3. Check buffer capacity against the limits:
   - BUFFER_LIMIT = 50 operations (configurable)
   - BUFFER_SIZE_LIMIT = 100KB payload size (prevents manifest bloat)
4. If embedded.length + pending.length < BUFFER_LIMIT:
   - Append pendingOps to manifest.embeddedOperations.
   - Update manifest.frontierClock with merged clocks.
5. Otherwise (buffer full):
   - Flush the existing manifest.embeddedOperations to new file ops/overflow_TIMESTAMP.json.
   - Add the file reference to manifest.operationFiles.
   - Place pendingOps into now-empty manifest.embeddedOperations.
6. Upload manifest.json.

When a client checks for updates:
┌─────────────────────────────────────────────────────────────────┐
│ Download Flow │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────┐
│ 1. Download manifest.json │
└───────────────────────────────┘
│
▼
┌───────────────────────────────┐
│ 2. Quick-check: any changes? │
│ Compare frontierClock │
└───────────────────────────────┘
│
┌───────────────┴───────────────┐
▼ ▼
No changes (clocks equal) Changes detected
│ │
▼ ▼
Done ┌────────────────────────┐
│ 3. Need snapshot? │
│ (local behind snapshot)│
└────────────────────────┘
│
┌───────────────┴───────────────┐
▼ ▼
Download snapshot Skip to ops
+ apply as base │
│ │
└───────────────┬───────────────┘
▼
┌────────────────────────┐
│ 4. Download new op │
│ files (filter seen) │
└────────────────────────┘
│
▼
┌────────────────────────┐
│ 5. Apply embedded ops │
│ (filter by op.id) │
└────────────────────────┘
│
▼
┌────────────────────────┐
│ 6. Update local │
│ lastSyncedClock │
└────────────────────────┘
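Steps 4 and 5 of the diagram amount to two filters. A minimal sketch, assuming the client tracks a `localLastAppliedSeq` number and a set of already-applied operation IDs; the function names `filesToDownload` and `unseenEmbeddedOps` are hypothetical.

```typescript
interface OperationFileReference {
  fileName: string;
  opCount: number;
  minSeq: number;
  maxSeq: number;
}
interface EmbeddedOp { id: string }

// Keep only overflow files that contain at least one unseen operation.
function filesToDownload(
  files: OperationFileReference[],
  localLastAppliedSeq: number,
): OperationFileReference[] {
  return files.filter((f) => f.maxSeq > localLastAppliedSeq);
}

// Drop embedded operations that were already applied locally.
function unseenEmbeddedOps<T extends EmbeddedOp>(ops: T[], appliedIds: ReadonlySet<string>): T[] {
  return ops.filter((op) => !appliedIds.has(op.id));
}

const files: OperationFileReference[] = [
  { fileName: "ops/overflow_1.json", opCount: 50, minSeq: 1, maxSeq: 50 },
  { fileName: "ops/overflow_2.json", opCount: 50, minSeq: 51, maxSeq: 100 },
];
console.log(filesToDownload(files, 60).map((f) => f.fileName)); // [ 'ops/overflow_2.json' ]
```

A file that straddles the local sequence (here 51 to 100 with 60 applied) is still downloaded in full, so individual operations from it need the same ID-based filtering before they are applied.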
Detailed Steps:
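1. Download manifest.json.
2. Compare manifest.frontierClock against local lastSyncedClock. If the clocks are equal, there is nothing to do.
3. If the local state is behind manifest.lastSnapshot.vectorClock → download snapshot first and apply it as the base.
4. Filter manifest.operationFiles to only files with maxSeq > localLastAppliedSeq, then download them.
5. Filter manifest.embeddedOperations by op.id (skip already-applied).
6. Sort the collected operations by vectorClock (causal order).
7. Apply them using the existing detectConflicts() logic.
8. Set localLastSyncedClock = manifest.frontierClock.

To prevent unbounded growth of operation files, any client can trigger a snapshot.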
| Condition | Threshold | Rationale |
|---|---|---|
| External operationFiles count | > 50 | Prevent WebDAV directory bloat |
| Total operations since snapshot | > 5000 | Bound replay time for fresh installs |
| Time since last snapshot | > 7 days | Ensure periodic cleanup |
| Manifest size | > 500KB | Prevent manifest from becoming too big |
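
The four conditions in the table combine with a logical OR. The sketch below shows one way to evaluate them. The `SnapshotStats` shape and the function name are assumptions, 500KB is taken to mean 500 × 1024 bytes, and the age rule is assumed not to fire when no snapshot exists yet.

```typescript
// Thresholds taken from the trigger table.
const SNAPSHOT_FILE_THRESHOLD = 50;
const SNAPSHOT_OP_THRESHOLD = 5000;
const SNAPSHOT_AGE_MS = 7 * 24 * 60 * 60 * 1000;
const MANIFEST_SIZE_LIMIT_BYTES = 500 * 1024; // assumption: KB = 1024 bytes

// Hypothetical summary of the values a client would need to gather.
interface SnapshotStats {
  operationFileCount: number;
  opsSinceSnapshot: number;
  lastSnapshotTimestamp?: number; // undefined if no snapshot exists yet
  manifestSizeBytes: number;
}

function shouldTriggerSnapshot(stats: SnapshotStats, now: number): boolean {
  // Assumption: without a previous snapshot the age rule does not apply.
  const ageExceeded =
    stats.lastSnapshotTimestamp !== undefined &&
    now - stats.lastSnapshotTimestamp > SNAPSHOT_AGE_MS;
  return (
    stats.operationFileCount > SNAPSHOT_FILE_THRESHOLD ||
    stats.opsSinceSnapshot > SNAPSHOT_OP_THRESHOLD ||
    ageExceeded ||
    stats.manifestSizeBytes > MANIFEST_SIZE_LIMIT_BYTES
  );
}
```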
┌─────────────────────────────────────────────────────────────────┐
│ Snapshot Flow │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────┐
│ 1. Ensure full sync complete │
│ (no pending local/remote) │
└───────────────────────────────┘
│
▼
┌───────────────────────────────┐
│ 2. Read current state from │
│ NgRx (authoritative) │
└───────────────────────────────┘
│
▼
┌───────────────────────────────┐
│ 3. Generate snapshot file │
│ + compute checksum │
└───────────────────────────────┘
│
▼
┌───────────────────────────────┐
│ 4. Upload snapshot file │
│ (atomic, verify success) │
└───────────────────────────────┘
│
▼
┌───────────────────────────────┐
│ 5. Update manifest │
│ - Set lastSnapshot │
│ - Clear operationFiles │
│ - Clear embeddedOperations │
│ - Reset frontierClock │
└───────────────────────────────┘
│
▼
┌───────────────────────────────┐
│ 6. Upload manifest │
└───────────────────────────────┘
│
▼
┌───────────────────────────────┐
│ 7. Cleanup (async, best- │
│ effort): delete old files │
└───────────────────────────────┘
Problem: If the client crashes between uploading snapshot and updating manifest, other clients won't see the new snapshot.
Solution: Snapshot files are immutable and safe to leave orphaned. The manifest is the source of truth. Cleanup is best-effort.
Invariant: Never delete the current lastSnapshot file until a new snapshot is confirmed.
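
One way to honor this invariant during best-effort cleanup is to derive the deletable set from the manifest that was just uploaded successfully. In the sketch below, `selectFilesToDelete`, the `ManifestRefs` shape, and the flat list of remote file names are assumptions for illustration.

```typescript
// Only the references matter for cleanup, so the manifest is reduced to them.
interface ManifestRefs {
  lastSnapshot?: { fileName: string };
  operationFiles: { fileName: string }[];
}

// Returns remote files that the confirmed manifest no longer references.
// Anything the manifest still points at (including lastSnapshot) is kept.
function selectFilesToDelete(remoteFiles: string[], confirmed: ManifestRefs): string[] {
  const keep = new Set<string>(["manifest.json"]);
  if (confirmed.lastSnapshot) keep.add(confirmed.lastSnapshot.fileName);
  for (const ref of confirmed.operationFiles) keep.add(ref.fileName);
  return remoteFiles.filter((name) => !keep.has(name));
}

const deletable = selectFilesToDelete(
  ["manifest.json", "snapshots/snap_1.json", "snapshots/snap_2.json", "ops/overflow_1.json"],
  { lastSnapshot: { fileName: "snapshots/snap_2.json" }, operationFiles: [] },
);
console.log(deletable); // [ 'snapshots/snap_1.json', 'ops/overflow_1.json' ]
```

A real implementation would likely also skip recently created files, so that uploads still in flight from other clients are not removed; that check is omitted here.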
The hybrid manifest doesn't change conflict detection - it still uses vector clocks. However, the frontierClock in the manifest enables early conflict detection.
Before downloading all operations, compare clocks:
const comparison = compareVectorClocks(localFrontierClock, manifest.frontierClock);
switch (comparison) {
case VectorClockComparison.LESS_THAN:
// Remote is ahead - safe to download
break;
case VectorClockComparison.GREATER_THAN:
// Local is ahead - upload our changes
break;
case VectorClockComparison.CONCURRENT:
// Potential conflicts - download ops for detailed analysis
break;
case VectorClockComparison.EQUAL:
// No changes - skip download
break;
}
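For reference, a comparison function with these four outcomes can be sketched as follows. This is a generic vector clock comparison, not necessarily the project's own `compareVectorClocks`; it assumes a clock is a map from client ID to counter and that missing entries count as 0.

```typescript
// Assumption: a vector clock maps clientId -> counter; missing entries count as 0.
type VectorClock = Record<string, number>;

enum VectorClockComparison {
  EQUAL = "EQUAL",
  LESS_THAN = "LESS_THAN",
  GREATER_THAN = "GREATER_THAN",
  CONCURRENT = "CONCURRENT",
}

function compareVectorClocks(a: VectorClock, b: VectorClock): VectorClockComparison {
  let aAhead = false; // some component of a is larger than in b
  let bAhead = false; // some component of b is larger than in a
  // Duplicated keys are harmless: comparing the same component twice changes nothing.
  const clientIds = Object.keys(a).concat(Object.keys(b));
  for (const id of clientIds) {
    const av = a[id] ?? 0;
    const bv = b[id] ?? 0;
    if (av > bv) aAhead = true;
    if (bv > av) bAhead = true;
  }
  if (aAhead && bAhead) return VectorClockComparison.CONCURRENT;
  if (aAhead) return VectorClockComparison.GREATER_THAN;
  if (bAhead) return VectorClockComparison.LESS_THAN;
  return VectorClockComparison.EQUAL;
}
```

With the local clock as the first argument, `LESS_THAN` means every local component is at most the remote one and at least one is strictly smaller, which is why that case is safe to download without further analysis.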
When conflicts are detected at the operation level, the existing ConflictResolutionService handles them. The hybrid manifest doesn't change this flow.
Scenario: Two clients download the manifest simultaneously, both append ops, both upload.
Problem: Second upload overwrites first client's operations.
Solution: Use provider-specific mechanisms:
| Provider | Mechanism |
|---|---|
| Dropbox | Use update mode with rev parameter |
| WebDAV | Use If-Match header with ETag |
| Local | File locking (already implemented in PFAPI) |
Implementation:
interface HybridManifest {
// ... existing fields
// Optimistic concurrency control
etag?: string; // Server-assigned revision (Dropbox rev, WebDAV ETag)
}
async uploadManifest(manifest: HybridManifest, expectedEtag?: string): Promise<void> {
// If expectedEtag provided, use conditional upload
// On conflict (412 Precondition Failed), re-download and retry
}
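The retry loop behind the conditional upload can be illustrated with an in-memory stand-in for the provider. Everything here is an assumption made for the sketch: `FakeRemote` replaces the real Dropbox or WebDAV client, the manifest is reduced to a JSON array of strings, and the calls are synchronous, whereas real provider calls are asynchronous and would wait between attempts.

```typescript
// In-memory stand-in for a provider that supports conditional writes.
class FakeRemote {
  private content = "[]";
  private rev = 0;

  read(): { content: string; etag: string } {
    return { content: this.content, etag: String(this.rev) };
  }

  // Returns false when the precondition fails (the equivalent of HTTP 412).
  writeIfMatch(content: string, expectedEtag: string): boolean {
    if (expectedEtag !== String(this.rev)) return false;
    this.content = content;
    this.rev += 1;
    return true;
  }
}

const MAX_UPLOAD_RETRIES = 3;

// Re-reads the remote state before every attempt so that entries appended by
// other clients are merged in instead of being overwritten.
function appendWithRetry(remote: FakeRemote, newOps: string[]): boolean {
  for (let attempt = 0; attempt < MAX_UPLOAD_RETRIES; attempt++) {
    const { content, etag } = remote.read();
    const merged = [...(JSON.parse(content) as string[]), ...newOps];
    if (remote.writeIfMatch(JSON.stringify(merged), etag)) return true;
  }
  return false; // caller decides how to surface the failure
}
```

The key point is that the merge is recomputed from a fresh download on each attempt. Retrying the same stale body would defeat the purpose of the precondition.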
Scenario: Manifest JSON is invalid (partial write, encoding issue).
Recovery Strategy:
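1. Attempt to load the backup manifest (manifest.json.bak).
2. If no valid backup exists, reconstruct the manifest from the remote files returned by listFiles().

async loadManifestWithRecovery(): Promise<HybridManifest> {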
try {
return await this._loadRemoteManifest();
} catch (parseError) {
PFLog.warn('Manifest corrupted, attempting recovery...');
// Try backup
try {
return await this._loadBackupManifest();
} catch {
// Reconstruct from files
return await this._reconstructManifestFromFiles();
}
}
}
Scenario: Manifest references a snapshot that doesn't exist on the server.
Recovery Strategy:
Scenario: Snapshot was created with schema version 3, but local app is version 2.
Handling:
snapshot.schemaVersion > CURRENT_SCHEMA_VERSION + MAX_VERSION_SKIP:
snapshot.schemaVersion > CURRENT_SCHEMA_VERSION:
snapshot.schemaVersion < CURRENT_SCHEMA_VERSION:
Scenario: User was offline for a week, has 500 pending operations.
Handling:
const BATCH_SIZE = 100;
const chunks = chunkArray(pendingOps, BATCH_SIZE);
for (const chunk of chunks) {
await this._uploadOverflowFile(chunk);
}
// Single manifest update at the end
await this._uploadManifest(manifest);
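The snippet above relies on a `chunkArray` helper that is not shown in this document. A minimal version, written here as an assumption about its behavior (fixed-size slices, with a shorter final chunk), could look like this:

```typescript
// Hypothetical implementation of the chunkArray helper used above.
function chunkArray<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// 500 pending operations with BATCH_SIZE = 100 yield 5 overflow files.
const sizes = chunkArray(Array.from({ length: 500 }, (_, i) => i), 100).map((c) => c.length);
console.log(sizes); // [ 100, 100, 100, 100, 100 ]
```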
| Metric | Current (v1) | Hybrid Manifest (v2) |
|---|---|---|
| Requests per Sync | 3 (Upload Op + Read Man + Write Man) | 1-2 (Read Man, optional Write) |
| Files on Server | Unbounded growth | Bounded (1 Manifest + 0-50 Op Files + 1 Snapshot) |
| Fresh Install Speed | O(n) - replay all ops | O(1) - load snapshot + small delta |
| Conflict Detection | Must parse all ops | Quick check via frontierClock |
| Bandwidth per Sync | ~2KB (op file) + manifest overhead | ~1KB (manifest only for small changes) |
| Offline Resilience | Good | Same (operations buffered locally) |
All phases have been implemented as of December 2025:
Types (operation.types.ts):
- HybridManifest, SnapshotReference, OperationFileReference interfaces defined
- OperationLogManifest

Manifest Handling (operation-log-manifest.service.ts):
- loadManifest() handles v1 and v2 formats

FrontierClock Tracking:
- lastSyncedFrontierClock stored locally for quick-check

Snapshot Operations (in operation-log-upload.service.ts and operation-log-download.service.ts):

Snapshot Triggers:
- REMOTE_OP_FILE_RETENTION_MS)

Concurrency Control:

Recovery Logic:

- operation-log-manifest.service.spec.ts
- sync-scenarios.integration.spec.ts
- supersync.spec.ts

| File | Purpose |
|---|---|
| operation-log-manifest.service.ts | Manifest loading, saving, buffer management |
operation-log-upload.service.ts | Upload with buffer/overflow logic |
operation-log-download.service.ts | Download with snapshot support |
operation.types.ts | Type definitions |
// Buffer limits
const EMBEDDED_OP_LIMIT = 50; // Max operations in manifest buffer
const EMBEDDED_SIZE_LIMIT_KB = 100; // Max payload size in KB
// Snapshot triggers
const SNAPSHOT_FILE_THRESHOLD = 50; // Trigger when operationFiles exceeds this
const SNAPSHOT_OP_THRESHOLD = 5000; // Trigger when total ops exceed this
const SNAPSHOT_AGE_DAYS = 7; // Trigger if no snapshot in N days
// Batching
const UPLOAD_BATCH_SIZE = 100; // Ops per overflow file
// Retry
const MAX_UPLOAD_RETRIES = 3;
const RETRY_DELAY_MS = 1000;
The following questions were resolved during implementation:
Encryption: Snapshots use the same encryption as operation files (via EncryptAndCompressHandlerService).
Compression: Snapshots are compressed using the same compression scheme as other sync files.
Checksum Verification: Currently using timestamp-based validation; checksums can be added if needed.
Clock Drift: Vector clocks handle ordering; timestamps are informational only.
/ (or /DEV/ in development)
├── manifest.json # HybridManifest (buffer + references)
├── ops/
│ ├── ops_CLIENT1_170123.json # Flushed operations
│ └── ops_CLIENT2_170456.json
└── snapshots/
└── snap_170789.json # Full state snapshot (if present)
src/app/op-log/
├── operation.types.ts # HybridManifest, SnapshotReference types
├── store/
│ └── operation-log-manifest.service.ts # Manifest management
├── sync/
│ ├── operation-log-upload.service.ts # Upload with buffer/overflow
│ └── operation-log-download.service.ts # Download with snapshot support
└── docs/
└── hybrid-manifest-architecture.md # This document