docs/sync-and-op-log/background-info/synthesized-delta-sync-analysis.md
Synthesized From: Gemini 2.5 Flash, GPT-5, Claude Opus 4.5

Date: December 2, 2025
The three AI models analyzed different artifacts:
| Model | What Was Analyzed | Branch/Source |
|---|---|---|
| Gemini 2.5 Flash | Current working directory | feat/delta-sync (45-line stub) |
| GPT-5 | Documentation & design docs | Theoretical/planned design |
| Claude Opus 4.5 | feat/sync-server branch via git show | 940-line implementation (different branch) |
Current Reality (as of analysis date):
- `feat/delta-sync` branch (the branch being evaluated): `super-sync.ts` is 45 lines (WebDAV stub)
- `feat/sync-server` branch (separate, not under evaluation): Contains 940-line implementation
- `packages/super-sync-server`: Auth wrapper only, no delta/changes logic

Important Clarification: The correct comparison should be between `feat/delta-sync` and `feat/operation-logs`. The `feat/sync-server` branch is a separate implementation effort and should not be conflated with `feat/delta-sync`.
Implication: Gemini's finding that the delta-sync implementation is "vaporware" is correct for feat/delta-sync. Opus's detailed code analysis applies to a different branch (feat/sync-server) which contains more code but is not the branch under evaluation. GPT-5's analysis is based on documented design.
The strategic recommendation (abandon delta sync on feat/delta-sync) is valid, because:
- The `feat/delta-sync` branch has no real implementation, just a stub (Gemini)
- The `feat/sync-server` branch has fundamental issues (Opus)

This document synthesizes analyses from three AI models (Gemini 2.5 Flash, GPT-5, Claude Opus 4.5) examining the delta-sync implementation in Super Productivity. All three models independently identified fundamental architectural issues that explain why stabilization has proven difficult, whether analyzing the stub, the design, or the implementation branch.
| Finding | Gemini | GPT-5 | Opus | Confidence |
|---|---|---|---|---|
| Shadow state is the core problem | ✅ | ✅ | ✅ | High |
| Watermark/revision tracking is unreliable | ✅ | ✅ | ✅ | High |
| LWW semantics cause data loss | ✅ | ✅ | ✅ | High |
| O(N) diffing doesn't scale | ✅ | ✅ | ✅ | High |
| Implementation is incomplete | ✅ | ✅ | ⚠️ | High |
| Multi-state consistency is impossible | ✅ | ✅ | ✅ | High |
Key Insight (Consensus): The delta-sync architecture requires maintaining multiple synchronized states (app data, shadow state, watermarks, vector clocks) without transactional guarantees. This creates a combinatorial explosion of failure modes that are difficult to test and reproduce.
All three models identify this as the primary root cause.

"The Delta Sync architecture requires the client to maintain a Shadow State: a perfect local copy of the data as it exists on the server. Without a durable shadow state, the client loses its 'diffing baseline' on every app restart."

"Shadow state is easily lost (IDB eviction, encryption key mismatch, cache cleared, new device). When missing, the client interprets 'no shadow' as 'everything changed,' triggering full-entity uploads."

"Dual-Cache Inconsistency: Shadow state exists both in-memory (`lastSyncedStateMap`) and in IndexedDB. These can drift apart if app crashes after memory update but before IDB write."
| Failure Mode | Gemini | GPT-5 | Opus |
|---|---|---|---|
| IDB eviction/loss | ✅ | ✅ | ✅ |
| Encryption key mismatch | — | ✅ | ✅ |
| Memory/IDB drift | — | — | ✅ |
| No integrity verification | — | — | ✅ |
| Silent recovery masks issues | ✅ | ✅ | ✅ |
Consolidated Root Cause: Shadow state has no durability guarantees, no integrity verification, and no atomic relationship with watermarks or server state. Any corruption silently triggers full syncs, which may overwrite local changes.
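The "no shadow means everything changed" behavior described above can be illustrated with a minimal sketch. All names here (`SyncEntity`, `ShadowBaseline`, `changedSinceLastSync`) are hypothetical and are not taken from the Super Productivity codebase; the sketch only assumes that the diff compares each entity against a serialized snapshot from the last sync.

```typescript
// Hypothetical types and names; not taken from the Super Productivity codebase.
type SyncEntity = { id: string; title: string };
type ShadowBaseline = Map<string, string>; // entity id -> serialized snapshot at last sync

// Returns the entities that differ from the shadow baseline.
function changedSinceLastSync(
  local: SyncEntity[],
  shadow: ShadowBaseline | undefined
): SyncEntity[] {
  // A missing shadow is indistinguishable from "nothing was ever synced".
  const baseline: ShadowBaseline = shadow ?? new Map<string, string>();
  return local.filter((e) => baseline.get(e.id) !== JSON.stringify(e));
}

const tasks: SyncEntity[] = [
  { id: "t1", title: "Write report" },
  { id: "t2", title: "Book flights" },
  { id: "t3", title: "Call dentist" },
];

// Healthy case: the shadow matches everything, then t2 is edited locally.
const shadow: ShadowBaseline = new Map(
  tasks.map((t) => [t.id, JSON.stringify(t)] as [string, string])
);
tasks[1] = { id: "t2", title: "Book flights to Berlin" };
const normalDelta = changedSinceLastSync(tasks, shadow);

// Shadow lost (for example after storage eviction): every entity is reported as changed.
const lostShadowDelta = changedSinceLastSync(tasks, undefined);

console.log(normalDelta.length, lostShadowDelta.length);
```

With an intact baseline only the edited task is uploaded; with a lost baseline the same call silently degrades into a full upload, which is the recovery path the models flag as masking the underlying problem.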
All three models identify watermark drift as a critical issue.
"If Shadow State corrupts, the client is broken until full reset."
"If local watermarks drift (app restart during sync, partial writes), the client may ask the server for changes the server has already compacted or skip ranges entirely. That produces 'empty change set but stale shadow' scenarios."
"Watermark-Shadow Desynchronization: Watermark and shadow state are updated separately. If one update succeeds and the other fails, the client enters an inconsistent state."
| Failure Mode | Impact |
|---|---|
| Watermark newer than shadow | Client misses changes (thinks it's up-to-date) |
| Watermark older than shadow | Client re-downloads changes (inefficient but safe) |
| Watermark drift during sync | Client and server disagree on sync point |
| No atomic coupling | Crash mid-sync corrupts state permanently |
Consolidated Root Cause: Watermarks and shadow state are stored in separate IDB transactions. There is no mechanism to ensure they remain consistent after crashes or partial failures.
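A small simulation makes the "no atomic coupling" failure concrete. `FakeStore`, `saveSeparately`, and `saveCoupled` are hypothetical stand-ins invented for this sketch, with an in-memory map replacing IndexedDB; the point is only to contrast two independent writes with one combined write when a crash occurs part-way through.

```typescript
// Hypothetical in-memory stand-in for persistent storage; not the project's code.
class FakeStore {
  private data = new Map<string, unknown>();
  private writesUntilCrash: number | null = null;

  // Arrange for the n-th upcoming write to throw, simulating a crash.
  crashOnNthWrite(n: number): void {
    this.writesUntilCrash = n;
  }

  put(key: string, value: unknown): void {
    if (this.writesUntilCrash !== null) {
      this.writesUntilCrash -= 1;
      if (this.writesUntilCrash === 0) {
        this.writesUntilCrash = null;
        throw new Error("simulated crash");
      }
    }
    this.data.set(key, value);
  }

  get(key: string): unknown {
    return this.data.get(key);
  }
}

type Checkpoint = { watermark: number; shadowRev: number };

// Two independent writes: a crash between them leaves a mixed state.
function saveSeparately(store: FakeStore, rev: number): void {
  store.put("watermark", rev);
  store.put("shadowRev", rev);
}

// One write of a combined record: it lands completely or not at all.
function saveCoupled(store: FakeStore, rev: number): void {
  const checkpoint: Checkpoint = { watermark: rev, shadowRev: rev };
  store.put("checkpoint", checkpoint);
}

function runIgnoringCrash(fn: () => void): void {
  try {
    fn();
  } catch {
    // The simulated crash is expected; the store keeps whatever was written.
  }
}

const separate = new FakeStore();
saveSeparately(separate, 41);
separate.crashOnNthWrite(2);
runIgnoringCrash(() => saveSeparately(separate, 42));
// The watermark is now 42 while shadowRev is still 41.

const coupled = new FakeStore();
saveCoupled(coupled, 41);
coupled.crashOnNthWrite(1);
runIgnoringCrash(() => saveCoupled(coupled, 42));
// The checkpoint is still the consistent { watermark: 41, shadowRev: 41 }.
```

The first store ends in the "watermark newer than shadow" state from the table above. In IndexedDB, a single `readwrite` transaction that spans both object stores would give the same all-or-nothing behavior as the combined record.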
All three models identify shallow merge semantics as a data-loss risk.

"'Last Write Wins' blindly overwrites data. No context of why a change happened. A `TaskCompleted` op and `TaskRenamed` op cannot both be applied; one wipes out the other."

"Server applies `merged = { ...oldData, ...newData }`, so correctness depends on clients sending full values for every top-level property. When it misses a nested field, the server overwrites the old object with the partial payload, dropping untouched keys."
"BLOB models use last-write-wins (whole object replacement), while ENTITY models use field-level merging. If mode detection is wrong, data is either over-merged or under-merged."
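The shallow-merge behavior quoted above is easy to reproduce. The sketch below applies the quoted expression `{ ...oldData, ...newData }` to a partial payload and contrasts it with a recursive merge. The field names (`reminder`, `repeat`) and the `deepMerge` helper are illustrative assumptions, not part of the analyzed code, and the recursive merge is shown only for contrast, not as a proposed fix.

```typescript
type Json = { [key: string]: unknown };

// The merge expression quoted in the analysis.
const shallowMerge = (oldData: Json, newData: Json): Json => ({ ...oldData, ...newData });

function isPlainObject(v: unknown): v is Json {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Hypothetical recursive merge, shown only for contrast.
function deepMerge(oldData: Json, newData: Json): Json {
  const out: Json = { ...oldData };
  for (const key of Object.keys(newData)) {
    const prev = out[key];
    const next = newData[key];
    out[key] = isPlainObject(prev) && isPlainObject(next) ? deepMerge(prev, next) : next;
  }
  return out;
}

// Illustrative data: the client sends only the nested field it changed.
const stored: Json = { title: "Pay rent", reminder: { at: "09:00", repeat: "monthly" } };
const partialPatch: Json = { reminder: { at: "10:00" } };

const shallow = shallowMerge(stored, partialPatch);
const deep = deepMerge(stored, partialPatch);

console.log(JSON.stringify(shallow)); // {"title":"Pay rent","reminder":{"at":"10:00"}}
console.log(JSON.stringify(deep)); // {"title":"Pay rent","reminder":{"at":"10:00","repeat":"monthly"}}
```

The shallow result has lost `repeat` even though the client never touched it, which is the "dropping untouched keys" failure the quote describes.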
Scenario: Two devices edit Task A concurrently
Device 1: Renames task to "Important Meeting"
Device 2: Marks task as completed
Delta Sync Result (LWW):
- If Device 2 syncs last → Task completed but NOT renamed (Device 1's change lost)
- If Device 1 syncs last → Task renamed but NOT completed (Device 2's change lost)
Expected Result:
- Task should be BOTH renamed AND completed (independent changes)
Consolidated Root Cause: Delta sync transmits state diffs, not intent. When two devices modify different properties of the same entity, only one change survives. This is a fundamental limitation of state-based sync with shallow merge.
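The difference between transmitting state and transmitting intent can be sketched for the rename/complete scenario. The `Task` shape, the `TaskOp` union, and `applyOp` are hypothetical names chosen for illustration, and the state-based path assumes whole-object last-write-wins as in the scenario above.

```typescript
// Hypothetical task shape and operation types, for illustration only.
type Task = { id: string; title: string; isDone: boolean };
type TaskOp = { type: "rename"; title: string } | { type: "complete" };

const base: Task = { id: "A", title: "Meeting", isDone: false };

// State-based sync: each device uploads its whole view of the task,
// and the last upload replaces whatever was stored before.
const device1State: Task = { ...base, title: "Important Meeting" };
const device2State: Task = { ...base, isDone: true };
const lwwResult = [device1State, device2State].reduce((_, next) => next, base);

// Operation-based sync: each device uploads what it did, and both
// operations are applied in turn to the shared base.
function applyOp(task: Task, op: TaskOp): Task {
  switch (op.type) {
    case "rename":
      return { ...task, title: op.title };
    case "complete":
      return { ...task, isDone: true };
  }
}

const ops: TaskOp[] = [{ type: "rename", title: "Important Meeting" }, { type: "complete" }];
const opResult = ops.reduce(applyOp, base);

console.log(lwwResult); // Device 2 synced last: completed, but the rename is gone.
console.log(opResult); // Both the rename and the completion survive.
```

Because the two operations touch different fields, applying them in either order yields the same task, which is the "independent changes" outcome listed under Expected Result.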
All three models identify O(N) diffing as a scalability bottleneck.

"Diffing 10k items freezes UI. Performance degrades linearly with data size."

"For thousands of tasks, diffing pegs the UI thread, causing frame drops and timeouts. If data changes mid-diff, the computed delta no longer matches the final state."

"Diff calculation (`createDiff`) is O(N) where N = number of entities. For users with 10,000+ tasks, this can block the main thread."
| Dataset Size | Diff Time (estimated) | User Impact |
|---|---|---|
| 100 tasks | ~10ms | Imperceptible |
| 1,000 tasks | ~100ms | Minor delay |
| 10,000 tasks | ~1,000ms | UI freeze |
| 50,000 tasks | ~5,000ms | Unusable |
Consolidated Root Cause: The diff algorithm must compare every entity against its shadow counterpart using JSON.stringify(). This is inherently O(N) and cannot be optimized without fundamental architecture changes (dirty tracking, incremental diffing, web workers).
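To show why the full scan is linear and what dirty tracking changes, the sketch below counts comparisons for a `JSON.stringify()`-based scan and contrasts it with a store that records modified ids at write time. `Item`, `fullDiff`, and `TrackedStore` are hypothetical names; this is not the project's `createDiff`, only a minimal model of the two strategies under the assumption that every write goes through one entry point.

```typescript
// Hypothetical names; a minimal model, not the project's createDiff.
type Item = { id: string; title: string };

// Full scan: serializes and compares every entity on every sync.
function fullDiff(
  entities: Map<string, Item>,
  shadow: Map<string, string>
): { changed: string[]; comparisons: number } {
  const changed: string[] = [];
  let comparisons = 0;
  entities.forEach((item, id) => {
    comparisons += 1;
    if (shadow.get(id) !== JSON.stringify(item)) changed.push(id);
  });
  return { changed, comparisons };
}

// Dirty tracking: ids are recorded when a write happens, so the sync
// step only has to look at entities that were actually touched.
class TrackedStore {
  readonly entities = new Map<string, Item>();
  private dirty = new Set<string>();

  upsert(item: Item): void {
    this.entities.set(item.id, item);
    this.dirty.add(item.id);
  }

  // Returns the ids touched since the last call and resets the set.
  takeDirty(): string[] {
    const ids = Array.from(this.dirty);
    this.dirty.clear();
    return ids;
  }
}

const store = new TrackedStore();
for (let i = 0; i < 10000; i++) {
  store.upsert({ id: `item-${i}`, title: `Task ${i}` });
}
const shadowSnapshot = new Map<string, string>();
store.entities.forEach((item, id) => shadowSnapshot.set(id, JSON.stringify(item)));
store.takeDirty(); // Treat the initial load as already synced.

store.upsert({ id: "item-42", title: "Task 42 (edited)" });

const scan = fullDiff(store.entities, shadowSnapshot);
const dirtyIds = store.takeDirty();
console.log(scan.comparisons, scan.changed.length, dirtyIds.length);
```

Both strategies find the same single change, but the scan performs one comparison per entity while the dirty set holds one id. The trade-off is that dirty tracking relies on every mutation passing through the tracked write path.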
Gemini and GPT-5 note that the implementation is incomplete. Opus analyzed the more complete `feat/sync-server` branch.

"Code analysis reveals the Server has NO delta DB (only Users table) and Client is just a WebDAV wrapper. The implementation is effectively 'vaporware'."

"The actual code path (`super-sync.ts`) is currently just a thin WebDAV wrapper; none of the documented delta logic is wired in."

"The `feat/sync-server` branch contains a 940-line `SuperSyncProvider` with IDB shadow state, watermarks, and diff logic. However, multiple TODOs indicate incomplete areas."
| Component | Current State | Gap |
|---|---|---|
| Server delta API | Documented but minimal | No changes table in Gemini's analysis |
| Client shadow state | Implemented in feat/sync-server | No crash consistency |
| Diff engine | Implemented | No worker offload, O(N) |
| Watermark tracking | Implemented | Not atomic with shadow |
| Encryption | Two modes implemented | Mode switching is implicit |
| Vector clocks | Implemented | Governance gaps (empty clocks = conflict) |
Note: The discrepancy between Gemini/GPT-5 and Opus reflects the different artifacts each model analyzed rather than a change over time. Opus analyzed the separate `feat/sync-server` branch, where delta logic exists but has fundamental issues.
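The vector-clock row in the table above flags "empty clocks = conflict". For reference, a conventional vector-clock comparison treats a missing entry as zero, so an empty clock is ordered before any non-empty clock instead of being reported as concurrent. The sketch below is a generic textbook comparison with hypothetical names (`VectorClock`, `compareClocks`); it is not the code Opus reviewed.

```typescript
// Generic vector-clock comparison; names are hypothetical, not the project's code.
type VectorClock = { [clientId: string]: number };
type ClockOrdering = "EQUAL" | "BEFORE" | "AFTER" | "CONCURRENT";

function compareClocks(a: VectorClock, b: VectorClock): ClockOrdering {
  let aAhead = false;
  let bAhead = false;
  // Duplicate ids in this list are harmless; they are just compared twice.
  const ids = Object.keys(a).concat(Object.keys(b));
  for (const id of ids) {
    const av = a[id] ?? 0; // A missing entry counts as zero.
    const bv = b[id] ?? 0;
    if (av > bv) aAhead = true;
    if (bv > av) bAhead = true;
  }
  if (aAhead && bAhead) return "CONCURRENT";
  if (aAhead) return "AFTER";
  if (bAhead) return "BEFORE";
  return "EQUAL";
}

console.log(compareClocks({}, { deviceA: 1 })); // BEFORE
console.log(compareClocks({ deviceA: 1 }, { deviceB: 1 })); // CONCURRENT
```

Under this convention only genuinely divergent histories compare as `CONCURRENT`; an implementation that reports a conflict whenever either clock is empty would raise conflicts for first-time syncs that have a clear ordering.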
All three models agree that stabilization faces structural barriers:
"The system attempts to infer intent from mutable snapshots, so any shadow corruption forces expensive recomputation and can silently lose semantics."
| State Component | Storage | Can Drift? |
|---|---|---|
| NgRx in-memory state | Memory | Yes |
| IndexedDB app data | IDB | Yes |
| Shadow state (memory) | Memory | Yes |
| Shadow state (IDB) | IDB | Yes |
| Watermarks | IDB | Yes |
| Server state | Remote | Yes |
| Vector clocks (local) | IDB | Yes |
| Vector clocks (remote) | Remote | Yes |
Total independent states: 8

Possible inconsistent combinations: 2^8 - 1 = 255
| Edge Case | Testable? | Currently Tested? |
|---|---|---|
| Concurrent pushes from 2 devices | Difficult | ❌ |
| Network failure mid-sync | Difficult | ❌ |
| IDB eviction | Difficult | ❌ |
| Vector clock overflow | Medium | ⚠️ (disabled) |
| Encryption key change | Medium | ❌ |
| Large dataset (10k+ tasks) | Difficult | ❌ |
"The design spans WebDAV fallbacks, optional encryption, IndexedDB persistence, and REST deltas. Each layer introduces its own failure modes, and they compound."
Gemini noted that the SuperSyncProvider is currently "Phase 0"—a WebDAV wrapper—indicating the delta sync was never fully implemented. This suggests the project hit a complexity wall during implementation.
"Switching modes changes merge semantics (snapshot LWW vs. delta patches). After fallback, the shadow state no longer matches server revisions."
This highlights that having two sync modes (delta and WebDAV fallback) creates hybrid states that neither path fully owns.
Opus identified specific code-level issues in vector clock handling:
- CONCURRENT (triggers false conflicts)

"Server (4-6 wks): Design changes schema, implement `/api/sync` endpoints. Client (6-8 wks): Implement `ShadowStateStore`, Diff Engine (Worker), Partial Patching. Migration (2 wks)."
"Implement and persist shadow state + per-model watermarks with crash-safe coupling. Add diff+merge pipeline. Add large-dataset perf tests. Harden with soak tests."
"Stabilization would require either deep architectural changes to make state transitions atomic and verifiable, OR switching to a different synchronization paradigm."
| Approach | Effort | Risk | Outcome |
|---|---|---|---|
| Minimal fixes (bug-by-bug) | 2-4 weeks | High | Whack-a-mole; new bugs emerge |
| Proper delta sync (GPT-5) | 3-5 weeks | Medium | Viable but fragile |
| Full rebuild (Gemini) | 12-16 weeks | Medium | Proper delta sync |
| Architecture switch (Opus) | 4-6 weeks | Low | Operation log approach |
operationlog-critique.md)

All three AI models independently reached the same conclusion: The delta-sync implementation has fundamental architectural issues that make stabilization expensive and risky.
The operation-log approach (feat/operation-logs) provides a more tractable path because:
The delta-sync approach is not inherently wrong, but the current implementation would require significant rework to achieve stability—effort that may be better spent completing the operation-log implementation.