File-Based Sync Architecture

docs/sync-and-op-log/diagrams/04-file-based-sync.md

Last Updated: January 2026
Status: Implemented

This document contains diagrams explaining the unified operation-log sync architecture for file-based providers (WebDAV, Dropbox, LocalFile).

Overview

File-based sync uses a single sync-data.json file that contains:

  • Full application state snapshot
  • Recent operations buffer (last 200 ops)
  • Vector clock for conflict detection
  • Archive data for late-joining clients
mermaid
flowchart TB
    subgraph Remote["Remote Storage (WebDAV/Dropbox/LocalFile)"]
        subgraph Folder["/superProductivity/"]
            SyncFile["sync-data.json
━━━━━━━━━━━━━━━━━━━
Encrypted + Compressed"]
        end
    end

    subgraph Contents["sync-data.json Contents"]
        direction TB
        Meta["📋 Metadata
• version: 2
• syncVersion: N (locking)
• schemaVersion
• lastModified
• clientId
• checksum"]

        VClock["🕐 Vector Clock
• {clientA: 42, clientB: 17}
• Tracks causality"]

        State["📦 State Snapshot
━━━━━━━━━━━━━━━━━━━
• task: TaskState
• project: ProjectState
• tag: TagState
• note: NoteState
• globalConfig
• issueProvider
• planner
• simpleCounter
• taskRepeatCfg"]

        Archive["📁 Archive Data
━━━━━━━━━━━━━━━━━━━
• archiveYoung: ArchiveModel
• archiveOld: ArchiveModel
━━━━━━━━━━━━━━━━━━━
Ensures late-joiners get
full archive history"]

        Ops["📝 Recent Operations (last 200)
━━━━━━━━━━━━━━━━━━━
• id, clientId, actionType
• opType, entityType, entityId
• payload, vectorClock
• timestamp
━━━━━━━━━━━━━━━━━━━
Used for conflict detection"]
    end

    SyncFile --> Contents

    style SyncFile fill:#fff3e0,stroke:#e65100,stroke-width:2px
    style State fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px
    style Archive fill:#fce4ec,stroke:#c2185b,stroke-width:2px
    style Ops fill:#e1f5fe,stroke:#01579b,stroke-width:2px

Why Single File Instead of Separate Snapshot + Ops Files?

| Single File (chosen) | Two Files (considered) |
| --- | --- |
| Atomic: all or nothing | Partial upload risk |
| One version to track | Version coordination |
| Simple conflict resolution | Two places to handle |
| Easy recovery | Inconsistent state possible |
| Upload full state each time | Often just ops |

The bandwidth cost of uploading the full state on every sync is acceptable: the state compresses well (~90%), and sync is infrequent.

Architecture Overview

The diagram below shows how FileBasedSyncAdapter integrates into the existing op-log system by implementing OperationSyncCapable on top of file operations.

mermaid
flowchart TB
    subgraph Client["Client Application"]
        NgRx["NgRx Store
(Runtime State)"]
        OpLogEffects["OperationLogEffects"]
        OpLogStore["SUP_OPS IndexedDB
(ops + state_cache)"]

        subgraph SyncServices["Sync Services"]
            SyncService["OperationLogSyncService"]
            ConflictRes["ConflictResolutionService"]
            VectorClock["VectorClockService"]
        end

        subgraph ProviderLayer["Provider Abstraction"]
            FileAdapter["FileBasedSyncAdapter
(implements OperationSyncCapable)"]
            SuperSync["SuperSyncProvider
(existing API-based)"]

            subgraph FileProviders["File Providers"]
                WebDAV["WebDAV"]
                Dropbox["Dropbox"]
                LocalFile["LocalFile"]
            end
        end
    end

    subgraph RemoteStorage["Remote Storage"]
        SyncFile["sync-data.json
━━━━━━━━━━━━━━━
• syncVersion
• state snapshot
• recentOps (200)
• vectorClock"]
    end

    NgRx --> OpLogEffects
    OpLogEffects --> OpLogStore
    OpLogStore --> SyncService
    SyncService --> ConflictRes
    SyncService --> VectorClock

    SyncService --> FileAdapter
    SyncService --> SuperSync

    FileAdapter --> WebDAV
    FileAdapter --> Dropbox
    FileAdapter --> LocalFile

    WebDAV --> SyncFile
    Dropbox --> SyncFile
    LocalFile --> SyncFile

    style FileAdapter fill:#e1f5fe,stroke:#01579b,stroke-width:2px
    style SyncFile fill:#fff3e0,stroke:#e65100,stroke-width:2px
    style OpLogStore fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px

TypeScript Types

mermaid
classDiagram
    class FileBasedSyncData {
        +number version = 2
        +number syncVersion
        +number schemaVersion
        +VectorClock vectorClock
        +number lastModified
        +string clientId
        +AppDataComplete state
        +ArchiveModel archiveYoung
        +ArchiveModel archiveOld
        +CompactOperation[] recentOps
        +string checksum
    }

    class AppDataComplete {
        +TaskState task
        +ProjectState project
        +TagState tag
        +GlobalConfigState globalConfig
        +NoteState note
        +IssueProviderState issueProvider
        +PlannerState planner
        +SimpleCounterState simpleCounter
        +TaskRepeatCfgState taskRepeatCfg
    }

    class ArchiveModel {
        +TaskArchive task
        +TimeTrackingState timeTracking
        +number lastTimeTrackingFlush
    }

    class CompactOperation {
        +string id
        +string clientId
        +string actionType
        +OpType opType
        +EntityType entityType
        +string entityId
        +unknown payload
        +VectorClock vectorClock
        +number timestamp
    }

    class VectorClock {
        +Record~string, number~ clocks
    }

    FileBasedSyncData --> AppDataComplete : state
    FileBasedSyncData --> ArchiveModel : archiveYoung?
    FileBasedSyncData --> ArchiveModel : archiveOld?
    FileBasedSyncData --> CompactOperation : recentOps[0..200]
    FileBasedSyncData --> VectorClock : vectorClock
    CompactOperation --> VectorClock : vectorClock
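
As a rough TypeScript rendering of the class diagram above, the file format could be typed along these lines. This is a sketch under stated assumptions, not the contents of file-based-sync.types.ts: the generic parameters stand in for AppDataComplete and ArchiveModel, VectorClock is modelled as a plain record (matching the {clientA: 42, clientB: 17} shape shown in the overview), the OpType union is inferred from the op types named in the diagrams, and createInitialSyncData with its starting syncVersion of 1 is hypothetical.

```typescript
// Sketch only: field names follow the class diagram; everything else is assumed.
type VectorClock = Record<string, number>;

// Assumption: union inferred from the op types mentioned in the diagrams.
type OpType = "CRT" | "UPD" | "DEL" | "SYNC_IMPORT" | "BACKUP_IMPORT";

interface CompactOperation {
  id: string;
  clientId: string;
  actionType: string;
  opType: OpType;
  entityType: string; // EntityType in the diagram, simplified to string here
  entityId: string;
  payload: unknown;
  vectorClock: VectorClock;
  timestamp: number;
}

// TState stands in for AppDataComplete, TArchive for ArchiveModel.
interface FileBasedSyncData<TState = unknown, TArchive = unknown> {
  version: 2;
  syncVersion: number;
  schemaVersion: number;
  vectorClock: VectorClock;
  lastModified: number;
  clientId: string;
  state: TState;
  archiveYoung?: TArchive;
  archiveOld?: TArchive;
  recentOps: CompactOperation[];
  checksum?: string;
}

// Hypothetical helper: builds the first file a client would upload.
function createInitialSyncData<TState>(
  clientId: string,
  state: TState,
  schemaVersion: number,
  now: number,
): FileBasedSyncData<TState> {
  return {
    version: 2,
    syncVersion: 1, // assumed starting value
    schemaVersion,
    vectorClock: {},
    lastModified: now,
    clientId,
    state,
    recentOps: [],
  };
}
```

The optional fields mirror the `?` markers on archiveYoung, archiveOld and checksum in the diagram and reference table.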

Sync Flow (Content-Based Optimistic Locking with Piggybacking)

mermaid
sequenceDiagram
    participant Client as Client App
    participant Adapter as FileBasedSyncAdapter
    participant Provider as File Provider
    participant Remote as sync-data.json

    Note over Client,Remote: ═══ DOWNLOAD FLOW ═══

    Client->>Adapter: downloadOps(sinceSeq, clientId)
    Adapter->>Provider: downloadFile("sync-data.json")
    Provider->>Remote: GET
    Remote-->>Provider: {data, rev}
    Provider-->>Adapter: SyncData (syncVersion=N)

    Adapter->>Adapter: Update _expectedSyncVersion = N
    Adapter->>Adapter: Filter ops by sinceSeq
    Adapter-->>Client: OpDownloadResponse

    Client->>Client: Apply remote ops to NgRx
    Client->>Client: setLastServerSeq(latestSeq)

    Note over Client,Remote: ═══ UPLOAD FLOW (with Piggybacking) ═══

    Client->>Adapter: uploadOps(ops, clientId, lastKnownSeq)
    Adapter->>Provider: downloadFile("sync-data.json")
    Provider->>Remote: GET
    Remote-->>Provider: {data, rev}
    Provider-->>Adapter: Current syncVersion=M

    alt syncVersion matches expected (M=N)
        Note over Adapter: No other client synced
    else syncVersion changed (M>N)
        Note over Adapter: Another client synced!<br/>Will piggyback their ops
    end

    Adapter->>Adapter: Merge local ops into recentOps
    Adapter->>Adapter: Update vectorClock
    Adapter->>Adapter: Trim recentOps to 200
    Adapter->>Adapter: Set syncVersion = M+1
    Adapter->>Adapter: Find piggybacked ops<br/>(ops from other clients we haven't seen)

    Adapter->>Provider: uploadFile("sync-data.json", newData)
    Provider->>Remote: PUT
    Remote-->>Provider: Success

    Adapter-->>Client: Success + piggybacked ops (newOps)

    alt Has piggybacked ops
        Client->>Client: Process piggybacked ops
        Client->>Client: setLastServerSeq(latestSeq)
    end

Key Insight: Piggybacking

Instead of throwing an error on a version mismatch, the upload proceeds in three steps:

  1. The adapter merges local ops with whatever is in the file
  2. The adapter returns ops from other clients as newOps (piggybacked)
  3. The upload service processes these before updating lastServerSeq

This ensures no ops are missed, even when clients sync concurrently.
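
The merge-and-piggyback behaviour described above might look roughly like the following. It is a minimal sketch rather than the adapter's actual code: the `seq` field on stored ops, the `mergeForUpload` and `mergeClocks` names, and the result shape are all assumptions made for illustration. Only the 200-op limit, the syncVersion increment and the "ops from other clients we haven't seen" rule come from the diagrams.

```typescript
// Sketch only: all names below are hypothetical.
type VectorClock = Record<string, number>;

interface LocalOp {
  id: string;
  clientId: string;
  vectorClock: VectorClock;
}

// Assumption: ops stored in the file carry a monotonically increasing seq.
interface StoredOp extends LocalOp {
  seq: number;
}

interface SyncFileData {
  syncVersion: number;
  vectorClock: VectorClock;
  recentOps: StoredOp[];
}

interface UploadMergeResult {
  next: SyncFileData;
  piggybacked: StoredOp[];
}

const MAX_RECENT_OPS = 200;

// Element-wise maximum of two vector clocks.
function mergeClocks(a: VectorClock, b: VectorClock): VectorClock {
  const out: VectorClock = { ...a };
  for (const id of Object.keys(b)) {
    out[id] = Math.max(out[id] ?? 0, b[id] ?? 0);
  }
  return out;
}

function mergeForUpload(
  remote: SyncFileData,
  localOps: LocalOp[],
  clientId: string,
  lastProcessedSeq: number,
): UploadMergeResult {
  // Ops already in the file that this client neither authored nor processed.
  const piggybacked = remote.recentOps.filter(
    (op) => op.seq > lastProcessedSeq && op.clientId !== clientId,
  );

  const knownIds = new Set(remote.recentOps.map((op) => op.id));
  let seq = remote.recentOps.reduce((max, op) => Math.max(max, op.seq), 0);
  let vectorClock = remote.vectorClock;
  const appended: StoredOp[] = [];

  for (const op of localOps) {
    if (knownIds.has(op.id)) continue; // already in the file, skip duplicates
    seq += 1;
    appended.push({ ...op, seq });
    vectorClock = mergeClocks(vectorClock, op.vectorClock);
  }

  return {
    next: {
      syncVersion: remote.syncVersion + 1,
      vectorClock,
      // Keep only the most recent ops.
      recentOps: [...remote.recentOps, ...appended].slice(-MAX_RECENT_OPS),
    },
    piggybacked,
  };
}
```

Because the function never rejects on a changed syncVersion, a caller in this sketch would upload `next` and then hand `piggybacked` to the conflict-handling path before advancing its processed sequence.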

Conflict Resolution (Two Clients Syncing Simultaneously)

mermaid
sequenceDiagram
    participant A as Client A
    participant B as Client B
    participant File as sync-data.json<br/>(syncVersion: 5)

    Note over A,File: Initial: syncVersion=5, both clients synced

    rect rgb(232, 245, 233)
        Note over A,B: Both make offline changes
        A->>A: Create Task X
        A->>A: expectedSyncVersion = 5
        B->>B: Update Task Y
        B->>B: expectedSyncVersion = 5
    end

    Note over A,File: Race condition begins

    A->>File: Upload starts (downloads file, sees v=5)
    B->>File: Upload starts (downloads file, sees v=5)

    A->>A: Merge ops [TaskX], set syncVersion=6
    A->>File: Upload sync-data.json (v=6)
    Note over A,File: A wins the race ✓
    File-->>A: Success
    A->>A: expectedSyncVersion = 6

    B->>B: Merge ops [TaskY]
    Note over B: Downloads file again for upload...
    B->>File: Download (sees syncVersion=6!)
    Note over B,File: Version changed!<br/>Expected 5, found 6

    rect rgb(225, 245, 254)
        Note over B: Piggybacking (not retry!)
        B->>B: Find piggybacked ops from file<br/>(A's TaskX op, seq > lastProcessedSeq)
        B->>B: Merge [TaskX, TaskY] into recentOps
        B->>B: Set syncVersion = 7
        B->>File: Upload sync-data.json (v=7)
        File-->>B: Success ✓

        B->>B: Return piggybacked=[TaskX]
        B->>B: Process TaskX op → apply to NgRx
        B->>B: setLastServerSeq(latestSeq)
    end

    Note over A,File: A syncs B's TaskY on next sync
    A->>File: Download (sinceSeq=6)
    File-->>A: ops=[TaskY]
    A->>A: Apply TaskY → both clients have both tasks

How Piggybacking Resolves Conflicts

| Step | What Happens |
| --- | --- |
| 1. Version mismatch detected | B expected v=5, found v=6 |
| 2. No retry needed | B proceeds with merge anyway |
| 3. Find piggybacked ops | Ops in file with seq > lastProcessedSeq, from other clients |
| 4. Merge and upload | B's ops + file's ops → new file |
| 5. Return piggybacked | Upload response includes A's ops |
| 6. Process piggybacked | Upload service applies them before advancing lastServerSeq |

LWW (Last-Write-Wins) for Same Entity:

If both A and B modified the same task, the piggybacked ops flow through ConflictResolutionService, which compares vector clocks and, when the changes are concurrent, falls back to timestamps: the later write wins, and a tie goes to the remote op.
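
The LWW rule can be illustrated with a small vector-clock comparison. This is a sketch, not the code of ConflictResolutionService or VectorClockService: the function names, the `TimedOp` shape and the outcome labels are hypothetical. The tie-breaking direction (remote newer or tie means the local op is rejected) follows the master architecture diagram.

```typescript
// Sketch only: names and result labels are illustrative.
type VectorClock = Record<string, number>;
type ClockOrder = "equal" | "before" | "after" | "concurrent";

// Returns how clock `a` relates to clock `b`.
function compareClocks(a: VectorClock, b: VectorClock): ClockOrder {
  let aAhead = false;
  let bAhead = false;
  // Duplicate ids are harmless here, so a plain concat is enough.
  const ids = Object.keys(a).concat(Object.keys(b));
  for (const id of ids) {
    const av = a[id] ?? 0;
    const bv = b[id] ?? 0;
    if (av > bv) aAhead = true;
    if (bv > av) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent";
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

interface TimedOp {
  vectorClock: VectorClock;
  timestamp: number;
}

type LwwOutcome = "no-conflict" | "remote-wins" | "local-wins";

function resolveLww(local: TimedOp, remote: TimedOp): LwwOutcome {
  if (compareClocks(local.vectorClock, remote.vectorClock) !== "concurrent") {
    // Sequential history: there is nothing to arbitrate.
    return "no-conflict";
  }
  // Concurrent edits: later timestamp wins, ties go to the remote op.
  return remote.timestamp >= local.timestamp ? "remote-wins" : "local-wins";
}
```

In the diagram's terms, "remote-wins" corresponds to marking the local op rejected and applying the remote one, while "local-wins" corresponds to creating a new update op that carries the local state.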

First-Sync Conflict Handling

When a client with local data syncs for the first time to a remote that already has data, a conflict dialog is shown:

mermaid
flowchart TD
    Start[First sync attempt] --> Download[Download sync-data.json]
    Download --> HasLocal{Has local data?}
    HasLocal -->|No| Apply[Apply remote state]
    HasLocal -->|Yes| HasRemote{Remote has data?}
    HasRemote -->|No| Upload[Upload local state]
    HasRemote -->|Yes| Dialog[Show conflict dialog]

    Dialog --> UseLocal[User chooses: Use Local]
    Dialog --> UseRemote[User chooses: Use Remote]

    UseLocal --> CreateImport[Create SYNC_IMPORT
with local state]
    CreateImport --> UploadImport[Upload to remote]

    UseRemote --> ApplyRemote[Apply remote state
Discard local]

    style Dialog fill:#fff3e0,stroke:#e65100,stroke-width:2px
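
The branching in the flowchart above reduces to two small decisions. The following sketch encodes them as pure functions; the function names and string labels are made up for illustration and do not correspond to identifiers in the codebase.

```typescript
// Sketch only: labels and function names are hypothetical.
type FirstSyncAction = "apply-remote" | "upload-local" | "show-conflict-dialog";

// Mirrors the "Has local data?" and "Remote has data?" checks, in that order.
function decideFirstSync(
  hasLocalData: boolean,
  remoteHasData: boolean,
): FirstSyncAction {
  if (!hasLocalData) return "apply-remote";
  if (!remoteHasData) return "upload-local";
  return "show-conflict-dialog";
}

type DialogChoice = "use-local" | "use-remote";
type DialogOutcome = "create-sync-import-and-upload" | "apply-remote-discard-local";

// Mirrors the two exits of the conflict dialog.
function resolveDialogChoice(choice: DialogChoice): DialogOutcome {
  return choice === "use-local"
    ? "create-sync-import-and-upload"
    : "apply-remote-discard-local";
}
```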

Master Architecture Diagram

mermaid
graph TB
    %% Styles
    classDef client fill:#fff,stroke:#333,stroke-width:2px,color:black;
    classDef provider fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:black;
    classDef storage fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:black;
    classDef conflict fill:#ffebee,stroke:#c62828,stroke-width:2px,color:black;
    classDef success fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:black;

    %% CLIENT SIDE
    subgraph Client["CLIENT (Angular)"]
        direction TB

        subgraph SyncLoop["Sync Loop"]
            Scheduler((Scheduler)) -->|Interval| SyncService["OperationLogSyncService"]
            SyncService -->|1. Get lastSeq| LocalMeta["SUP_OPS IndexedDB"]
        end

        subgraph DownloadFlow["Download Flow"]
            SyncService -->|"2. downloadOps(sinceSeq)"| Adapter
            Adapter -->|Response| VersionCheck{syncVersion
Changed?}
            VersionCheck -- "Yes (reset)" --> GapDetect{Gap Detected?}
            VersionCheck -- "No change" --> FilterOps
            GapDetect -- "Yes + No Ops" --> SnapshotCheck{Has Snapshot
State?}
            GapDetect -- "Yes + Has Ops" --> FilterOps
            SnapshotCheck -- Yes --> LocalDataCheck{Has Local
Unsynced Ops?}
            SnapshotCheck -- No --> FilterOps
            LocalDataCheck -- Yes --> ConflictDialog["Show Conflict Dialog"]:::conflict
            LocalDataCheck -- No --> FreshCheck{Fresh Client?}
            FreshCheck -- Yes --> ConfirmDialog["Confirmation Dialog"]
            FreshCheck -- No --> HydrateSnapshot["Hydrate from Snapshot"]:::success
            ConfirmDialog -- Confirmed --> HydrateSnapshot
            ConfirmDialog -- Cancelled --> SkipSync[Skip]
            ConflictDialog -- "Use Local" --> CreateSyncImport["Create SYNC_IMPORT"]
            ConflictDialog -- "Use Remote" --> HydrateSnapshot
            FilterOps["Filter ops by sinceSeq"]
        end

        subgraph ConflictMgmt["Conflict Management (LWW Auto-Resolution)"]
            FilterOps --> ConflictDet{{"Compare
Vector Clocks"}}:::conflict
            ConflictDet -- Sequential --> ApplyRemote
            ConflictDet -- Concurrent --> LWWCheck{{"LWW: Compare
Timestamps"}}:::conflict

            LWWCheck -- "Remote newer
or tie" --> MarkRejected["Mark Local Rejected"]:::conflict
            LWWCheck -- "Local newer" --> LocalWins["Create Update Op
with local state"]:::conflict
            LocalWins --> RejectBoth["Mark both rejected"]
            RejectBoth --> CreateNewOp["New op syncs to remote"]
            MarkRejected --> ApplyRemote
        end

        subgraph Application["Application & Validation"]
            ApplyRemote -->|Dispatch| NgRx["NgRx Store"]
            HydrateSnapshot -->|"Hydrate full state"| NgRx
            NgRx --> UpdateSeq["setLastServerSeq()"]
            UpdateSeq --> SyncDone((Done))
        end

        subgraph UploadFlow["Upload Flow"]
            LocalMeta -->|Get Unsynced| PendingOps["Pending Ops"]
            PendingOps --> ClassifyOp{Op Type?}

            ClassifyOp -- "SYNC_IMPORT
BACKUP_IMPORT" --> UploadSnapshot["Upload as Snapshot
(full state in file)"]
            ClassifyOp -- "CRT/UPD/DEL" --> MergeOps["Merge into recentOps"]

            MergeOps --> BuildState["Build state snapshot
from NgRx"]
            BuildState --> IncrVersion["syncVersion++"]
            IncrVersion --> UploadFile["Upload sync-data.json"]
            UploadSnapshot --> UploadFile

            UploadFile --> CheckPiggyback{Piggybacked
Ops Found?}
            CheckPiggyback -- Yes --> ProcessPiggyback["Process Piggybacked Ops
(→ Conflict Detection)"]
            ProcessPiggyback --> ConflictDet
            CheckPiggyback -- No --> MarkSynced["Mark Ops Synced"]:::success
        end
    end

    %% FILE PROVIDER LAYER
    subgraph ProviderLayer["FILE PROVIDER LAYER"]
        direction TB

        subgraph Adapter["FileBasedSyncAdapter"]
            DownloadOp["downloadOps()
━━━━━━━━━━━━━━━
• Download file
• Filter by sinceSeq
• Detect version changes
• Return snapshotState if gap"]:::provider
            UploadOp["uploadOps()
━━━━━━━━━━━━━━━
• Download current file
• Merge ops + state
• Increment syncVersion
• Upload merged file
• Return piggybacked ops"]:::provider
            SeqTracking["Sequence Tracking
━━━━━━━━━━━━━━━
• _expectedSyncVersions
• _localSeqCounters
• _syncDataCache"]:::provider
        end

        subgraph Providers["File Providers"]
            WebDAV["WebDAV
━━━━━━━━━━━━
downloadFile()
uploadFile()"]:::provider
            Dropbox["Dropbox
━━━━━━━━━━━━
downloadFile()
uploadFile()"]:::provider
            LocalFile["LocalFile
━━━━━━━━━━━━
downloadFile()
uploadFile()"]:::provider
        end
    end

    %% REMOTE STORAGE
    subgraph Remote["REMOTE STORAGE"]
        direction TB

        SyncFile[("sync-data.json
━━━━━━━━━━━━━━━━━━━
📋 version: 2
📋 syncVersion: N
📋 clientId
━━━━━━━━━━━━━━━━━━━
🕐 vectorClock
━━━━━━━━━━━━━━━━━━━
📦 state (full snapshot)
━━━━━━━━━━━━━━━━━━━
📁 archiveYoung
📁 archiveOld
━━━━━━━━━━━━━━━━━━━
📝 recentOps[0..200]")]:::storage
    end

    %% CONNECTIONS
    Adapter --> WebDAV
    Adapter --> Dropbox
    Adapter --> LocalFile

    WebDAV --> SyncFile
    Dropbox --> SyncFile
    LocalFile --> SyncFile

    CreateSyncImport --> UploadSnapshot

    %% Subgraph styles
    style DownloadFlow fill:#e3f2fd,stroke:#1565c0,stroke-width:2px
    style ConflictMgmt fill:#ffebee,stroke:#c62828,stroke-width:2px
    style UploadFlow fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px
    style Application fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    style ProviderLayer fill:#e3f2fd,stroke:#1565c0,stroke-width:2px
    style Remote fill:#fff3e0,stroke:#e65100,stroke-width:2px
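
The download branch of the diagram above (gap detection, snapshot check, local-data check, fresh-client check) can be read as a single decision function. The sketch below is one possible encoding under that reading; the input shapes, the `decideDownloadStep` name and the step labels are assumptions, and how a gap is actually detected is left to the caller.

```typescript
// Sketch only: input shapes and labels are hypothetical.
interface DownloadOutcome {
  gapDetected: boolean; // syncVersion changed and a gap was found
  opCount: number; // ops returned after filtering by sinceSeq
  hasSnapshotState: boolean; // response carries a full state snapshot
}

interface LocalStatus {
  hasUnsyncedOps: boolean;
  isFreshClient: boolean;
}

type DownloadStep =
  | "filter-ops"
  | "show-conflict-dialog"
  | "confirm-then-hydrate"
  | "hydrate-snapshot";

function decideDownloadStep(
  remote: DownloadOutcome,
  local: LocalStatus,
): DownloadStep {
  // No gap, ops available, or no snapshot: continue with normal op processing.
  if (!remote.gapDetected || remote.opCount > 0 || !remote.hasSnapshotState) {
    return "filter-ops";
  }
  // Gap with only a snapshot: local unsynced work must not be overwritten silently.
  if (local.hasUnsyncedOps) return "show-conflict-dialog";
  // A fresh client asks for confirmation before accepting remote data.
  return local.isFreshClient ? "confirm-then-hydrate" : "hydrate-snapshot";
}
```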

Quick Reference Tables

File Operations

| Operation | Method | Purpose | Key Steps |
| --- | --- | --- | --- |
| Download | downloadOps() | Get remote changes | Download file → Filter by sinceSeq → Detect gaps → Return ops or snapshot |
| Upload | uploadOps() | Push local changes | Download current → Merge ops → Increment syncVersion → Upload → Return piggybacked |
| Get Seq | getLastServerSeq() | Get processed seq | Read from _localSeqCounters map |
| Set Seq | setLastServerSeq() | Update processed seq | Write to _localSeqCounters + persist |

sync-data.json Structure

| Field | Type | Purpose |
| --- | --- | --- |
| version | 2 | File format version |
| syncVersion | number | Content-based lock counter (incremented each upload) |
| schemaVersion | number | App data schema version (for migrations) |
| clientId | string | Last client to modify file |
| lastModified | number | Timestamp of last modification |
| vectorClock | VectorClock | Causal ordering of all operations |
| state | AppDataComplete | Full application state snapshot |
| archiveYoung | ArchiveModel? | Tasks archived < 21 days |
| archiveOld | ArchiveModel? | Tasks archived > 21 days |
| recentOps | CompactOperation[] | Last 200 operations (for conflict detection) |
| checksum | string? | SHA-256 of uncompressed state |

Key Implementation Details

| Feature | Implementation |
| --- | --- |
| Optimistic Locking | syncVersion counter - no server ETags needed |
| Gap Detection | syncVersion reset or snapshot replacement triggers re-download from seq=0 |
| Piggybacking | On upload, ops from other clients (seq > lastProcessed) returned as newOps |
| First-Sync Conflict | Local unsynced ops + remote snapshot → show conflict dialog |
| Fresh Client Safety | Confirmation dialog before accepting first remote data |
| LWW Conflicts | Concurrent vector clocks → compare timestamps → later wins |
| Snapshot Bootstrap | Gap detected + has snapshot → hydrate full state (skip ops) |
| Cache Optimization | Downloaded sync data cached to avoid redundant download before upload |
| Archive Sync | Archive data embedded in file; ArchiveOperationHandler writes to IndexedDB |

Key Points

  1. Single Sync File: All data in sync-data.json - state snapshot + recent ops + vector clock
  2. Content-Based Versioning: syncVersion counter detects conflicts without server ETags
  3. Piggybacking on Upload: Version mismatch doesn't throw - ops from other clients are returned as newOps
  4. Sequence Counter Separation:
    • _expectedSyncVersions: Tracks file's syncVersion (for version mismatch detection)
    • _localSeqCounters: Tracks ops we've processed (updated via setLastServerSeq)
  5. Archive via Op-Log: Archive operations sync through the op-log; ArchiveOperationHandler writes the archive data to IndexedDB
  6. Deterministic Replay: Same operation + same timestamp = same result everywhere
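
The sequence counter separation in point 4 can be pictured as two independent maps. The class below is only a sketch: keying both maps by a provider key, defaulting an unknown sequence to 0, and the method names other than getLastServerSeq and setLastServerSeq are assumptions, and persistence is left out entirely.

```typescript
// Sketch only: the real adapter's fields and persistence are not shown here.
class SyncCounters {
  // File-level version last seen per provider (for mismatch detection).
  private readonly expectedSyncVersions = new Map<string, number>();
  // Highest op sequence this client has processed per provider.
  private readonly localSeqCounters = new Map<string, number>();

  // Hypothetical: called after every download of the sync file.
  recordDownloadedVersion(providerKey: string, syncVersion: number): void {
    this.expectedSyncVersions.set(providerKey, syncVersion);
  }

  // Hypothetical: true when the file changed since the last recorded download.
  hasVersionChanged(providerKey: string, remoteSyncVersion: number): boolean {
    const expected = this.expectedSyncVersions.get(providerKey);
    return expected !== undefined && expected !== remoteSyncVersion;
  }

  getLastServerSeq(providerKey: string): number {
    return this.localSeqCounters.get(providerKey) ?? 0; // assumed default
  }

  setLastServerSeq(providerKey: string, seq: number): void {
    this.localSeqCounters.set(providerKey, seq);
  }
}
```

Keeping the two maps apart means that noticing a newer file version does not by itself advance the processed sequence; that only happens once the ops have actually been applied and setLastServerSeq is called.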

Implementation Files

| File | Purpose |
| --- | --- |
| src/app/op-log/sync-providers/file-based/file-based-sync-adapter.service.ts | Main adapter (~800 LOC) |
| src/app/op-log/sync-providers/file-based/file-based-sync.types.ts | TypeScript types and constants |
| src/app/op-log/sync-providers/file-based/file-based-sync-adapter.service.spec.ts | Unit tests |