.tasks/core/LSYNC-010-sync-service.md
Implement the peer sync service using the new leaderless hybrid model. All devices are equals that broadcast changes to peers using two strategies: state-based sync for device-owned data and log-based sync (HLC-ordered) for shared resources.
Architecture: Peer-to-peer broadcast with no leader/follower roles.
- SyncService struct (no role field!)
- sync_partners table for peer list
- broadcast_state_change() - sends to all peers
- on_state_change_received() - applies peer's state
- broadcast_shared_change() - with HLC ordering
- on_shared_change_received() - applies with conflict resolution
- shared_changes.db pruned when all peers ack
- devices

core/src/service/sync/:
- mod.rs - SyncService (no leader/follower split!)
- state.rs - State-based sync logic
- shared.rs - Log-based sync with HLC
- hlc.rs - HLC generator

Device A creates location:
1. INSERT INTO locations (device_id=A, ...)
2. Emit: LocationCreated event
3. SyncService.on_location_created()
4. Broadcast StateChange to all devices
5. Done! (no log)
Peers (B, C):
1. Receive StateChange
2. INSERT INTO locations (device_id=A, ...)
3. Emit event → UI updates
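A minimal sketch of the receive side of this flow (all types here are illustrative, not the project's actual entities): because each device-owned row is keyed by its owning device plus row id and only the owner ever writes it, applying a received StateChange is an idempotent upsert with no conflict resolution.

```rust
use std::collections::HashMap;

// Illustrative types only (not the real schema): device-owned rows are
// keyed by (owner device, row id), so replayed broadcasts are harmless.
type DeviceId = u32;
type RowId = u32;

#[derive(Clone, Debug, PartialEq)]
struct Location {
    device_id: DeviceId,
    id: RowId,
    name: String,
}

#[derive(Default)]
struct LocationTable {
    rows: HashMap<(DeviceId, RowId), Location>,
}

impl LocationTable {
    /// Upsert: the owning device is the only writer for its rows,
    /// so last-write-wins needs no merge logic here.
    fn apply_state_change(&mut self, loc: Location) {
        self.rows.insert((loc.device_id, loc.id), loc);
    }
}
```

Receiving the same StateChange twice (e.g. after a retry) leaves the table unchanged, which is what makes the "no log" path safe for device-owned data.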
Device A creates tag:
1. Generate HLC(1000,A)
2. INSERT INTO tags (...)
3. INSERT INTO shared_changes (hlc, ...)
4. Broadcast SharedChange to all devices
Peers (B, C):
1. Receive SharedChange
2. Update local HLC
3. INSERT INTO tags (...) with merge
4. Send ACK to Device A
Device A:
1. Receive ACKs from all
2. DELETE FROM shared_changes WHERE hlc <= 1000
3. Log stays small!
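The HLC ordering used above can be sketched with a small std-only generator (field names and the merge rule are assumptions; the real generator lives in hlc.rs, and a real HLC also breaks ties with the device id): each local event advances either the wall-clock component or the logical counter, and each received change is merged so later local events sort after it.

```rust
// Hedged sketch of a Hybrid Logical Clock; not the project's actual API.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Hlc {
    pub wall_ms: u64, // physical component (ms since epoch)
    pub counter: u32, // logical counter for same-ms events
}

pub struct HlcGenerator {
    last: Hlc,
}

impl HlcGenerator {
    pub fn new() -> Self {
        Self { last: Hlc { wall_ms: 0, counter: 0 } }
    }

    /// Stamp a local event: advance the wall component if the clock
    /// moved forward, otherwise bump the logical counter.
    pub fn next(&mut self, now_ms: u64) -> Hlc {
        if now_ms > self.last.wall_ms {
            self.last = Hlc { wall_ms: now_ms, counter: 0 };
        } else {
            self.last.counter += 1;
        }
        self.last
    }

    /// Merge a received HLC so subsequent local events sort after it.
    pub fn observe(&mut self, remote: Hlc, now_ms: u64) {
        let wall = now_ms.max(self.last.wall_ms).max(remote.wall_ms);
        let counter = if wall == self.last.wall_ms && wall == remote.wall_ms {
            self.last.counter.max(remote.counter) + 1
        } else if wall == self.last.wall_ms {
            self.last.counter + 1
        } else if wall == remote.wall_ms {
            remote.counter + 1
        } else {
            0
        };
        self.last = Hlc { wall_ms: wall, counter };
    }
}
```

The derived `Ord` (wall first, then counter) gives the total order that step 2 on the peer ("Update local HLC") and the prune condition in step 2 on Device A rely on.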
```rust
pub struct SyncService {
    library_id: Uuid,
    // No role field!
    protocol_handler: Arc<SyncProtocolHandler>,
    event_bus: Arc<EventBus>,
    // HLC generator for shared changes
    hlc_generator: Arc<Mutex<HLCGenerator>>,
    // Shared changes log (per-device, small)
    shared_changes_db: Arc<SharedChangesDb>,
    // Track peer states
    peer_states: Arc<RwLock<HashMap<Uuid, PeerSyncState>>>,
    // Pending state broadcasts (batched)
    pending_states: Arc<Mutex<Vec<StateChange>>>,
}
```
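The pending_states field suggests a time-window batcher for state broadcasts. A hedged std-only sketch (the window length and all names are assumptions, since batching is still an open item in these notes): changes accumulate until the window elapses, then the whole batch is handed to the caller to broadcast as one message.

```rust
use std::time::{Duration, Instant};

// Sketch of a time-window batcher; illustrative names, not the real API.
struct Batcher<T> {
    window: Duration,
    opened_at: Option<Instant>,
    pending: Vec<T>,
}

impl<T> Batcher<T> {
    fn new(window: Duration) -> Self {
        Self { window, opened_at: None, pending: Vec::new() }
    }

    /// Queue a change; the first change in an empty buffer opens the window.
    fn push(&mut self, item: T) {
        if self.pending.is_empty() {
            self.opened_at = Some(Instant::now());
        }
        self.pending.push(item);
    }

    /// Drain the buffer once the window has elapsed; the caller
    /// broadcasts the returned batch as a single message.
    fn take_ready(&mut self) -> Option<Vec<T>> {
        match self.opened_at {
            Some(t) if t.elapsed() >= self.window => {
                self.opened_at = None;
                Some(std::mem::take(&mut self.pending))
            }
            _ => None,
        }
    }
}
```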
```rust
impl SyncService {
    /// Create and start sync service
    pub async fn start(
        library_id: Uuid,
        protocol_handler: Arc<SyncProtocolHandler>,
    ) -> Result<Self, SyncError>;

    /// Broadcast device-owned state change
    async fn broadcast_state_change(&self, change: StateChange);

    /// Broadcast shared resource change (with HLC)
    async fn broadcast_shared_change(&self, change: SharedChange);

    /// Handle received state change
    async fn on_state_change(&self, change: StateChange);

    /// Handle received shared change
    async fn on_shared_change(&self, change: SharedChange);
}
```
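The ACK-driven pruning behind broadcast_shared_change can be sketched as follows (illustrative std-only types, not the real shared_changes.db schema): each appended change tracks which peers have acknowledged it, and pruning drops every entry that all known peers have ACKed, which is what keeps the per-device log small.

```rust
use std::collections::{BTreeMap, HashSet};

// Sketch of the per-device shared-changes log with ACK-based pruning;
// names are illustrative, the real log is persisted in shared_changes.db.
type Hlc = (u64, u32); // (wall clock ms, logical counter)
type PeerId = u8;

struct SharedLog {
    peers: HashSet<PeerId>,
    entries: BTreeMap<Hlc, HashSet<PeerId>>, // HLC -> peers that ACKed
}

impl SharedLog {
    fn new(peers: HashSet<PeerId>) -> Self {
        Self { peers, entries: BTreeMap::new() }
    }

    /// Record a broadcast change with no ACKs yet.
    fn append(&mut self, hlc: Hlc) {
        self.entries.insert(hlc, HashSet::new());
    }

    /// Record an ACK from a peer for one change.
    fn on_ack(&mut self, hlc: Hlc, from: PeerId) {
        if let Some(acks) = self.entries.get_mut(&hlc) {
            acks.insert(from);
        }
    }

    /// Drop every entry all peers have ACKed (the "DELETE FROM
    /// shared_changes" step in the flow above).
    fn prune(&mut self) {
        let peers = &self.peers;
        self.entries.retain(|_, acks| !peers.is_subset(acks));
    }
}
```

An offline peer simply never ACKs, so its unseen changes survive pruning until it reconnects — the same property the notes below describe as "PeerLog IS the persistent queue".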
Successfully implemented in core/src/service/sync/peer.rs:
Broadcast Improvements:
- Parallel sends via futures::join_all (was sequential)
- Results collected with .unwrap_or_default()

State-Based Sync:
- broadcast_state_change() sends to all peers in parallel
- on_state_change_received() applies via registry

Log-Based Sync:
- broadcast_shared_change() generates HLC and sends to all peers
- on_shared_change_received() applies with conflict resolution
- on_ack_received() tracks peer ACKs for pruning

Completion Estimate: ~90% (core broadcast + service lifecycle + watermark schema + connection event handlers + watermark exchange protocol + network event wiring complete; PeerLog IS the persistent queue by design; only backfill response correlation and minor optimizations remaining)
Detailed gap analysis to ensure nothing gets lost:
1. Service Lifecycle Integration COMPLETE (Oct 14, 2025)
- Library::init_sync_service() creates and starts SyncService (mod.rs:108-145)
- Library::shutdown() stops sync service gracefully (mod.rs:247-253)
- SyncService::run_sync_loop() provides orchestration and automatic backfill detection (service/sync/mod.rs:116-209)

2. Connection State Management ~90% COMPLETE ✅
- devices table (NOT a separate sync_partners table)
- last_state_watermark TIMESTAMP column
- last_shared_watermark TEXT column
- devices.is_online and devices.last_seen_at updated on connection/disconnection

3. Startup Sync / Reconnection Logic COMPLETE (Oct 15, 2025)
- devices table (Oct 14, 2025)
- get_watermarks() queries devices table (peer.rs:131-173)
- exchange_watermarks_and_catchup() fully implemented (peer.rs:175-215)
- on_watermark_exchange_response() with divergence detection (peer.rs:217-340)
- trigger_watermark_exchange() static method (peer.rs:728-805)
- update_peer_watermarks() updates devices table (peer.rs:343-382)

4. Backfill Network Integration ~60% COMPLETE (Oct 15, 2025)
- request_state_batch() sends StateRequest via NetworkTransport (line 219-261)
- request_shared_changes() sends SharedChangeRequest via NetworkTransport (line 263-297)

5. Watermark Tracking COMPLETE (Oct 15, 2025)
6. Batching Optimization
7. Checkpoint Persistence
8. Initial Backfill Trigger COMPLETE (Oct 14, 2025)
- run_sync_loop() checks for DeviceSyncState::Uninitialized
- get_connected_sync_partners() from network (source: devices table)
- PeerInfo for peer selection
- BackfillManager::start_backfill() with available peers

9. Heartbeat Health Monitoring
10. Incremental State Sync
- since param exists but is never used with actual timestamps

Here's the full sync lifecycle with gaps marked:
```
┌─────────────────────────────────────────────────────────────┐
│ Phase 1: Library Open │
│ Library::init_sync_service() called │
│ SyncService created and started │
│ Late initialization if networking loads after │
│ run_sync_loop spawned for orchestration │
│ → COMPLETE: Sync service runs properly │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 2: Initial Backfill (New Device) │
│ run_sync_loop detects Uninitialized state │
│ Automatic backfill trigger when peers available │
│ request_state_batch() is stub │
│ request_shared_changes() is stub │
│ Checkpoint save/load not implemented │
│ PeerSync.transition_to_ready() works │
│ Buffer processing works │
│ → PARTIAL: Detection works, but network requests stubbed │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 3: Ready State (Normal Operation) │
│ Broadcast works (parallel sends, timeouts) │
│ Receive works (via registry) │
│ ACK mechanism works │
│ Retry queue works (background processor) │
│ Log pruning works (periodic background task) │
│ No batching (100ms window) │
│ State watermark always None │
│ → WORKS: Happy path with 2+ always-online devices │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 4: Peer Disconnection │
│ Connection state tracking (event handlers) │
│ on_peer_disconnected() implemented │
│ Changes queued in PeerLog (sync.db) by design │
│ ACK mechanism prevents premature pruning │
│ Retry queue handles temporary failures │
│ Event receiver wiring to NetworkingService │
│ → WORKS: Offline peer support built into architecture │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 5: Reconnection / Startup Sync │
│ Watermark exchange protocol implemented │
│ Incremental shared catch-up (HLC-based) │
│ trigger_watermark_exchange() on connection │
│ Divergence detection working │
│ → WORKS: Devices catch up after offline periods │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 6: Library Close │
│ Library::shutdown() stops sync service │
│ SyncService::stop() signals shutdown and waits │
│ PeerSync::stop() called to halt background tasks │
│ → COMPLETE: Graceful shutdown implemented │
└─────────────────────────────────────────────────────────────┘
```
Reality Check:
Overall: ~90% complete. Service lifecycle, watermark schema, connection event handlers, network event wiring, offline peer handling, and watermark exchange protocol complete. Only backfill response correlation and minor optimizations remaining.
All items implemented in core/src/lib.rs and core/src/library/mod.rs:
Status: Complete - sync service runs properly ✅
Use existing devices table as source of truth (NOT a separate sync_partners table):
Database Schema COMPLETE (Oct 14, 2025):
devices table in unified migration:
- last_state_watermark TIMESTAMP (device-owned data sync tracking)
- last_shared_watermark TEXT (HLC-based shared resource sync, stored as JSON)

Entity model (core/src/infra/db/entities/device.rs):
- pub last_state_watermark: Option<DateTimeUtc>
- pub last_shared_watermark: Option<String>

Domain model (core/src/domain/device.rs):
- os_version, capabilities, sync_enabled, last_sync_at, watermarks

Files Modified:
- core/src/infra/db/migration/m20240101_000001_unified_schema.rs - added watermark columns
- core/src/infra/db/entities/device.rs - added watermark fields to Model and Syncable
- core/src/domain/device.rs - added all missing sync-related fields

PeerSync Implementation COMPLETE (Oct 14, 2025):
- Added network_events receiver field to PeerSync struct
- set_network_events() method to inject event receiver
- start_network_event_listener() - spawns background task
- Handles NetworkEvent::ConnectionEstablished and NetworkEvent::ConnectionLost
- handle_peer_connected(device_id, db) static handler:
  - devices.is_online = true
  - devices.last_seen_at = now()
  - devices.updated_at = now()
- handle_peer_disconnected(device_id, db) static handler:
  - devices.is_online = false
  - devices.last_seen_at = now()
  - devices.updated_at = now()

Files Modified:
- core/src/service/sync/peer.rs - added network event listener and connection handlers

Network Event Wiring COMPLETE (Oct 15, 2025):
- core/src/lib.rs:274-280 - wires network events after sync service init in Core::new_with_config()
- core/src/library/manager.rs:77-86 - wires network events in LibraryManager::open_library()
- Both call peer_sync.set_network_events(networking.subscribe_events())

Offline Peer Handling ALREADY IMPLEMENTED BY DESIGN:
- PeerLog (sync.db) IS the persistent queue
Status: Complete - Connection tracking fully functional
Unblocks: Priority 3 (reconnection sync) - connection tracking ready
Database schema ready (Oct 14, 2025). Priority 2 connection handlers ready (Oct 14, 2025).
Implementation COMPLETE (Oct 15, 2025):
get_watermarks() queries the devices table (peer.rs:131-173):
- reads devices.last_state_watermark and devices.last_shared_watermark
- returns (Option<DateTime<Utc>>, Option<HLC>)

New watermark exchange messages:
- WatermarkExchangeRequest with device watermarks
- WatermarkExchangeResponse with peer watermarks and catch-up flags

exchange_watermarks_and_catchup() (peer.rs:175-215):
on_watermark_exchange_response() (peer.rs:217-340):
trigger_watermark_exchange() method (peer.rs:728-805):
start_network_event_listener() (peer.rs:611-625):
Files Modified:
- core/src/service/network/protocol/sync/messages.rs - added WatermarkExchange messages
- core/src/service/sync/peer.rs - full watermark exchange protocol implementation
- core/src/service/network/protocol/sync/handler.rs - added request/response handlers

Status: Complete - Reconnection sync automatically triggers on peer connection
Unblocks: Devices staying in sync after offline periods
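The catch-up decision made after a watermark exchange can be sketched like this (Hlc, CatchUp, and plan_catchup are illustrative names, not the project's API; see peer.rs for the real implementation): compare the newest HLC in our shared log with the watermark the peer reported, and either do nothing, replay our log from the peer's watermark, or fall back to a full backfill.

```rust
// Hedged sketch of one direction of the watermark-exchange decision.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Hlc(u64, u32); // (wall clock ms, logical counter)

#[derive(Debug, PartialEq)]
enum CatchUp {
    None,           // peer already has everything we have
    SendSince(Hlc), // peer is behind: replay our log from its watermark
    FullBackfill,   // peer has no watermark for us: send everything
}

/// Decide how to bring a reconnecting peer up to date.
fn plan_catchup(our_latest: Hlc, peer_watermark: Option<Hlc>) -> CatchUp {
    match peer_watermark {
        None => CatchUp::FullBackfill,
        Some(w) if w >= our_latest => CatchUp::None,
        Some(w) => CatchUp::SendSince(w),
    }
}
```

The same comparison runs in the other direction with the watermarks the peer sent us, which is where the divergence detection mentioned above comes in.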
Implementation (Oct 15, 2025):
Files Modified:
- core/src/service/sync/backfill.rs - wired network requests with comprehensive TODO comments

Outstanding Work:
- Add a send_sync_request() method to the NetworkTransport trait that returns the response

Current Limitation: Backfill can send requests but can't properly await responses. Protocol handlers work end-to-end for real-time sync, but initial backfill needs request/response correlation.
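The missing request/response correlation could follow the usual pending-map pattern; a hedged std-only sketch (names are assumptions, and the real implementation would be async, e.g. with oneshot channels): each outgoing request gets an id and a channel, and the matching response wakes the waiter.

```rust
use std::collections::HashMap;
use std::sync::mpsc;

// Sketch of request/response correlation for backfill; Correlator and
// its methods are illustrative, not the project's actual API.
type RequestId = u64;

struct Correlator<R> {
    next_id: RequestId,
    pending: HashMap<RequestId, mpsc::Sender<R>>,
}

impl<R> Correlator<R> {
    fn new() -> Self {
        Self { next_id: 0, pending: HashMap::new() }
    }

    /// Register an outgoing request; the caller waits on the Receiver.
    fn register(&mut self) -> (RequestId, mpsc::Receiver<R>) {
        let (tx, rx) = mpsc::channel();
        let id = self.next_id;
        self.next_id += 1;
        self.pending.insert(id, tx);
        (id, rx)
    }

    /// Route an incoming response to the waiting request, if any.
    fn complete(&mut self, id: RequestId, response: R) {
        if let Some(tx) = self.pending.remove(&id) {
            let _ = tx.send(response);
        }
    }
}
```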
Unblocks: New devices joining library (once response mechanism implemented)
Unblocks: Better performance and monitoring
core/src/infra/sync/NEW_SYNC.md - Leaderless architecture