packages/sync-core/SPEC.md
This document states the rules that @tldraw/sync-core implements. It is written to drive testing: each rule has a stable ID (e.g. D4, RP7), each rule is independently observable through the public API (or the documented internal API where noted), and the unit tests should be an expression of these rules. When a test and this document disagree, one of them is wrong — figure out which and fix it.
Sections marked internal describe supporting machinery that has its own contract worth testing directly, even though SDK users rarely touch it.
- The room (TLSyncRoom, wrapped by TLSocketRoom) is the authoritative server-side copy of a document. The client (TLSyncClient) keeps a local Store synchronized with it.
- A network diff (NetworkDiff) is a compact, one-way map of record id to put/patch/remove ops. An object diff (ObjectDiff) is the nested per-key form used inside patch ops.
- A forward diff (TLSyncForwardDiff) is the storage-level shape { puts, deletes }, where a put is either a full record or a [before, after] tuple.
- A push result is one of commit (applied verbatim), discard (no effect), or rebaseWithDiff (applied with modifications).
- Presence records live in the PresenceStore, never in document storage.
- A session moves through the states AwaitingConnectMessage → Connected → AwaitingRemoval.
- getTlsyncProtocolVersion() returns 8.
- TLSyncErrorCloseEventCode is 4099. It is the WebSocket close code used for fatal, non-retriable sync errors on modern sessions; the close reason carries a TLSyncErrorCloseEventReason string. (Legacy sessions are rejected without it; see SES4.)
- TLRemoteSyncError has name: 'RemoteSyncError', message: `'sync error: <reason>'`, and exposes the reason as .reason.

## diffRecord (D)

- diffRecord(prev, next) returns an ObjectDiff describing how to turn prev into next, or null when there is nothing to change (including when prev === next).
- A key present in prev but missing from next produces ['delete']. A key present in next but missing from prev produces ['put', value].
- props and meta are the only nested keys at the top level: changes inside them are expressed as ['patch', ...] ops. Any other top-level key whose values are not both arrays or both strings is compared with deep equality and produces a whole-value ['put', next] on change, even when both values are plain objects.
- Within nested values (props/meta or deeper), object values are recursively patched; null and primitive values are put.
- For strings, when next starts with prev, the diff is ['append', addedSuffix, prev.length]. Other string changes are puts.
With legacyAppendMode enabled, string appends become puts instead; array appends (D7) are unaffected by legacyAppendMode.
- For arrays of the same length, when at most max(length/5, 1) items changed, the op is ['patch', { [index]: op }] where each changed index gets a recursive diff when both old and new items are truthy objects, and a put otherwise. If more items changed, the whole array is put.
- When next extends prev with new items at the end, the op is ['append', addedItems, prev.length]. Any change in the shared prefix (including truncation) puts the whole array.

## applyObjectDiff (AD)

- applyObjectDiff(object, diff) never mutates its input. It returns the original object (same reference) when no op had an effect, otherwise a shallow-cloned copy with the ops applied. Unchanged nested values keep their identity in the copy.
- put is applied only when the new value is not deep-equal to the current value.
- append is applied only when the current value is an array/string of the matching type whose length equals the op's offset. On any mismatch the op is silently ignored.
- patch is applied only when the current value is a truthy object; it recurses with AD1 semantics. Patching a missing or primitive value is silently ignored.
- delete removes the key when present.
- Applying a diff to a non-object input (null, primitives) returns the input unchanged.
- getNetworkDiff(recordsDiff) maps added to put ops, updated to patch ops (computed with diffRecord; entries that compute to no diff are omitted), and removed to remove ops. It returns null when the result would be empty.
- toNetworkDiff(forwardDiff) maps plain-record puts to put ops, [before, after] puts to patch ops (omitted when diffRecord finds no difference), and deletes to remove ops. Unlike ND1 it always returns an object, possibly empty.
- diffAndValidateRecord(prev, next, type) returns undefined when the records produce no diff, and in that case does not validate at all.
- […] TLSyncError with reason INVALID_RECORD.
- applyAndDiffRecord(prev, patch, type) applies the patch (AD rules); if the result is reference-identical to prev it returns undefined.
Otherwise it returns [actualDiff, newState] where actualDiff is recomputed from prev to the patched state (it may differ from the input patch, e.g. ops that had no effect are dropped). Validation follows RV2.
- validateRecord(state, type) validates, wrapping throws as TLSyncError/INVALID_RECORD.
- chunk(msg, maxSize) returns [msg] when msg.length < maxSize (strictly less: a message exactly at maxSize is chunked).
- Each chunk is prefixed with `<n>_` where n counts down the chunks remaining after this one; the first chunk carries the highest number and the start of the message, and concatenating the chunk bodies in order reconstructs the message.
- Each chunk, prefix included, stays within maxSize, except that every chunk carries at least one character of content even when the prefix alone exceeds maxSize.
- JsonChunkAssembler.handleMessage: input starting with { while idle is parsed immediately and returned as { data, stringified }. Invalid JSON in this case throws synchronously (callers treat a throw as a fatal session error).
- A message starting with { mid-sequence returns { error: 'Unexpected non-chunk message' }; the partial sequence and the JSON message are both discarded and the assembler resets to idle.
- Chunk messages return null until the final (0_) chunk arrives, then the joined body is JSON-parsed and returned; a parse failure is returned as { error }. Either way the assembler resets to idle.
- An out-of-sequence chunk returns { error: 'Chunks received in wrong order' }. A non-JSON, non-chunk message returns an Invalid chunk error.
Both reset to idle.

## interval (IN)

- interval(cb, ms) invokes cb every ms milliseconds until the returned dispose function is called.

## MicrotaskNotifier (MN), internal

- notify(...args) defers delivery to a microtask; each registered listener is called with the notification's arguments, in registration order.
- A listener does not receive notifications issued before register was called, but does receive notifications issued after it (even in the same synchronous block).
- Listeners are stored in a Set, so registering the same function twice deduplicates: it fires once per notification, and either registration's unsubscribe removes it.

These rules hold for both InMemorySyncStorage and SQLiteSyncStorage. The shared test suite runs against both implementations.
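
As an orientation aid for the storage rules that follow, the clock and tombstone behaviour can be pictured with a toy model. This is a minimal sketch under stated assumptions, not the library's implementation: `ToyStorage`, `ToyTxn`, and `ToyChanges` are hypothetical names, `getChangesSince` is placed on the storage object rather than on the transaction for brevity, and schema handling, snapshots, change notifications, and pruning are left out.

```typescript
// Toy model only. ToyStorage, ToyTxn and ToyChanges are hypothetical names for
// illustration; they are not the @tldraw/sync-core API.
type Rec = { id: string; [key: string]: unknown }

interface ToyTxn {
  getClock(): number
  get(id: string): Rec | undefined
  set(id: string, record: Rec): void
  delete(id: string): void
}

interface ToyChanges {
  wipeAll: boolean
  puts: Rec[]
  deletes: string[]
}

class ToyStorage {
  private clock = 0
  private tombstoneHistoryStartsAtClock = 0
  private docs = new Map<string, { state: Rec; lastChangedClock: number }>()
  private tombstones = new Map<string, number>()

  // Storage-level clock: always the committed value.
  getClock(): number {
    return this.clock
  }

  transaction<T>(callback: (txn: ToyTxn) => T) {
    const startClock = this.clock
    let txnClock = startClock
    // The first write in a transaction advances the clock by one; later writes reuse it.
    const writeClock = (): number => {
      if (txnClock === startClock) txnClock = startClock + 1
      return txnClock
    }
    const txn: ToyTxn = {
      getClock: () => txnClock,
      get: (id) => this.docs.get(id)?.state,
      set: (id, record) => {
        if (id !== record.id) throw new Error('id must equal record.id')
        this.docs.set(id, { state: record, lastChangedClock: writeClock() })
        this.tombstones.delete(id) // a set clears any tombstone with that id
      },
      delete: (id) => {
        if (!this.docs.has(id)) return // absent id: no tombstone, no clock advance
        this.docs.delete(id)
        this.tombstones.set(id, writeClock())
      },
    }
    const result = callback(txn)
    this.clock = txnClock // commit
    return { documentClock: this.clock, didChange: txnClock !== startClock, result }
  }

  getChangesSince(sinceClock: number): ToyChanges | undefined {
    if (sinceClock === this.clock) return undefined
    // A clock from the future is treated as -1, i.e. everything changed.
    const c = sinceClock > this.clock ? -1 : sinceClock
    const docs = Array.from(this.docs.values())
    if (c < this.tombstoneHistoryStartsAtClock) {
      return { wipeAll: true, puts: docs.map((d) => d.state), deletes: [] }
    }
    const puts = docs.filter((d) => d.lastChangedClock > c).map((d) => d.state)
    const deletes = Array.from(this.tombstones.entries())
      .filter(([, clock]) => clock > c)
      .map(([id]) => id)
    return { wipeAll: false, puts, deletes }
  }
}

// Usage: two writes in one transaction share a clock; a delete leaves a tombstone.
const toyStorage = new ToyStorage()
const firstTxn = toyStorage.transaction((txn) => {
  txn.set('shape:a', { id: 'shape:a', x: 1 })
  txn.set('shape:b', { id: 'shape:b', x: 2 })
  return txn.getClock()
})
const secondTxn = toyStorage.transaction((txn) => txn.delete('shape:b'))
console.log(firstTxn, secondTxn, toyStorage.getChangesSince(1))
```

The model mutates its maps directly and has no rollback, which is enough to exercise the clock rules but would not be acceptable for a real storage backend.
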
- DEFAULT_INITIAL_SNAPSHOT: one document record, one page record, clock 0.
- transaction(callback, opts?) runs callback(txn) synchronously and returns { documentClock, didChange, result, changes? }, where result is the callback's return value.
- didChange is true exactly when the document clock advanced during the transaction.
- onChange listeners are notified on a microtask with { id, documentClock }, where id is the transaction's opts.id (undefined when not given). Read-only transactions do not notify. Unsubscribing stops future notifications. An onChange callback passed to the constructor is registered the same way.
- The changes field is populated only when emitChanges: 'always' is passed, and contains the forward diff of everything that changed during the transaction. These implementations apply changes verbatim, so emitChanges: 'when-different' never emits.
- txn.getClock() returns the clock at transaction start; after the first write it returns the incremented value. The storage-level getClock() always returns the committed clock.
- txn.get(id) returns the record, or undefined when absent.
- txn.set(id, record) asserts id === record.id, stores the record with lastChangedClock set to the transaction's clock, and clears any tombstone with that id.
- txn.delete(id) of an existing record removes it and writes a tombstone at the transaction's clock. Deleting an absent id is a complete no-op: no tombstone, no clock advance.
- txn.entries()/keys()/values() iterate the documents. Consuming any of these iterators after the transaction has ended throws.
- txn.getSchema()/setSchema(schema) read and write the persisted serialized schema.
- txn.getChangesSince(c) returns undefined when c equals the current clock. A c greater than the current clock is treated as -1 (everything changed). The result's wipeAll is true when c < tombstoneHistoryStartsAtClock; in that case puts contains every document and deletes is empty.
Otherwise puts contains documents with lastChangedClock > c (strict) and deletes contains the tombstone ids with clock > c.
- getSnapshot() returns { documentClock, tombstoneHistoryStartsAtClock, documents, tombstones, schema } reflecting all committed transactions.
- […] documentClock ?? clock ?? 0; tombstoneHistoryStartsAtClock ?? documentClock.
- When the tombstone count exceeds MAX_TOMBSTONES (5000), the storage deletes the oldest tombstones (the overflow plus TOMBSTONE_PRUNE_BUFFER_SIZE (1000) more, never splitting a clock value) and advances tombstoneHistoryStartsAtClock to the oldest surviving tombstone's clock.

## InMemorySyncStorage specifics (IM)

- […] lastChangedClock/tombstone clock found in the snapshot, even when the snapshot's documentClock is lower.
- tombstoneHistoryStartsAtClock is clamped down to the document clock.
- […] tombstoneHistoryStartsAtClock equals the document clock (no usable history).
- […] set.
- […] createTLSchema().serializeEarliestVersion()).

## SQLiteSyncStorage specifics (SQ)

- Storage uses documents, tombstones, and metadata tables, honoring the wrapper's tablePrefix. Data persists across re-instantiation over the same database.
- […] DEFAULT_INITIAL_SNAPSHOT.
A StoreSnapshot is accepted and converted (SL1).
- documentClock and tombstoneHistoryStartsAtClock are taken verbatim (no clamping against document/tombstone clocks), and tombstones are loaded even when the history window is empty.
- SQLiteSyncStorage.hasBeenInitialized(wrapper) is true exactly when the metadata table exists and holds a non-empty schema string; it respects the table prefix and returns false (rather than throwing) on a missing table.
- SQLiteSyncStorage.getDocumentClock(wrapper) returns the persisted clock, or null when storage is uninitialized.

## computeTombstonePruning (TP), internal

- Returns null when the tombstone count is at or below maxTombstones.
- The cutoff starts at pruneBufferSize + count − maxTombstones and is extended forward while it would split tombstones sharing a clock value; idsToDelete is the first cutoff entries (input must be sorted by clock ascending).
- newTombstoneHistoryStartsAtClock is the clock of the oldest surviving tombstone, or documentClock when everything is deleted.
- NodeSqliteWrapper passes exec and prepare through to the underlying database (node:sqlite DatabaseSync or better-sqlite3); prepared statements support all, iterate, and run with bindings and can be reused.
- NodeSqliteWrapper.transaction wraps the callback in BEGIN/COMMIT, returns the callback's result, and on a throw issues ROLLBACK and rethrows the original error.
- DurableObjectSqliteSyncWrapper.prepare returns a statement that re-executes the stored SQL with the given bindings on every iterate/all/run call; exec forwards to sql.exec; transaction delegates to the Durable Object's transactionSync.
- convertStoreSnapshotToRoomSnapshot passes a RoomSnapshot (anything with a documents key) through by reference; a StoreSnapshot becomes a room snapshot with clock 0, every document at lastChangedClock 0, and no tombstones.
- loadSnapshotIntoStorage requires the (converted) snapshot to have a schema and throws otherwise.
- Unchanged documents are left as they are (their lastChangedClock is preserved); changed and new documents are written; stored documents absent from
the snapshot are deleted (tombstoned).
- schema.migrateStorage then migrates all loaded records up to the room's current schema version.

## ServerSocketAdapter (SA)

- isOpen is true exactly when the wrapped socket's readyState is 1.
- sendMessage(msg) JSON-stringifies the message and sends it; when configured, onBeforeSendMessage(msg, stringified) is invoked before the send.
- close(code?, reason?) passes through to the wrapped socket.

## ClientWebSocketAdapter (CW)

- The constructor takes a getUri function (sync or async); a connection attempt starts immediately. http(s) URIs are converted to ws(s). getUri is re-invoked for every attempt, so it can return fresh auth tokens.
- connectionStatus starts as 'offline'; when the socket opens it becomes 'online' and status listeners receive { status: 'online' }.
- A close with code TLSyncErrorCloseEventCode (4099) produces status 'error' with the close reason (or 'UNKNOWN_ERROR' when empty); all other closes and socket errors produce 'offline'.
- […] 'error' arriving while already 'offline' is suppressed.
- sendMessage: when online, the message is JSON-stringified, chunked (CH1–CH3), and each chunk sent; when a socket exists but is not online, the message is dropped with a console warning; with no socket it is silently dropped.
After close() it throws.
- Incoming messages are delivered to onReceiveMessage listeners; listeners can unsubscribe.
- restart() closes the current socket (notifying 'offline') and starts a reconnection attempt.
- close() disposes the adapter: restart, sendMessage, and listener registration then throw; close itself is idempotent.
- […] restart/offline handling) are ignored: they cannot change status.

## ReconnectManager (RM), internal

- The reconnect delay stays between ACTIVE_MIN_DELAY (500ms) and ACTIVE_MAX_DELAY (2s) when the tab is visible, and between INACTIVE_MIN_DELAY (1s) and INACTIVE_MAX_DELAY (5min) when hidden.
- […] ACTIVE_MIN_DELAY.
- An offline event closes the active socket (which triggers the reconnect cycle).
- […] online, the document becoming visible) call maybeReconnected: a socket that is OPEN is left alone; one that is CONNECTING for less than ATTEMPT_TIMEOUT (1s) is rechecked later; one CONNECTING for longer is closed and retried; otherwise the backoff is reset and a reconnect attempt happens immediately (honoring the minimum delay).
- close() cancels all timers and event listeners.

## TLSyncClient: connection lifecycle (CL)

- […] 'online'.
- The connect request carries a connectRequestId, the store's serialized schema, protocol version 8, and lastServerClock (−1 before any server contact, afterwards the last seen server clock).
- onLoad fires on the first message received from the server, of any type.
- A connect response whose connectRequestId does not match the latest request is ignored.
- On a connect response with hydrationType: 'wipe_presence', the client reverts its speculative changes, removes all presence records, applies the server's diff, then re-applies the speculative changes on top and pushes them as a new push request.
- With hydrationType: 'wipe_all', all document records are additionally wiped before the server's diff is applied; speculative changes still re-apply on top afterwards.
- onAfterConnect is called with { isReadonly } from the connect message, and the current presence state (if any) is pushed.
- When the socket reports 'offline', the client resets: presence records are removed from the store, pending and unsent pushes are
dropped, and the client waits to reconnect. When the socket reports 'error', onSyncError(reason) fires and the client closes permanently.
- […] pong (or any server message) refreshes the server-interaction timestamp.
- When didCancel is provided and returns true, the next event causes the client to close instead of processing.
- close() disposes all listeners and timers and removes the window.tlsync debugging reference (which construction installs).
- The incompatibility_error server message is legacy: it is logged as an error and otherwise ignored.

## TLSyncClient: pushing changes (CP)

- Store changes with source 'user' and scope 'document' are folded into the speculative diff immediately. Remote and non-document changes are not pushed.
- […] clientClock.
- The first presence send is ['put', record], subsequent sends are ['patch', diff] against the last pushed state; an unchanged presence sends nothing; document and presence changes ride in the same push request when both are pending. After a reconnect, presence is re-put in full.
- In 'solo' presence mode the presence signal is not pushed at all (document changes still are); the network throttle drops from 30fps to 1fps.

## TLSyncClient: receiving and rebasing (CR)

- data, patch, and push_result events are buffered and processed on the throttle; they are dropped entirely when the client is not connected to the room.
- A push_result must match the oldest pending push's clientClock: commit applies that push's diff as confirmed; discard drops it; rebaseWithDiff applies the server's modified diff instead.
- A push_result with no pending pushes, or with a mismatched clientClock, is an error: the store is checked for usability and the connection resets.
- lastServerClock advances to the last buffered event's serverClock and is used for the next reconnect.
- A put equal to the stored record is a no-op for listeners, a patch for a missing record is skipped, and a remove of a missing record is skipped.
- custom events invoke onCustomMessageReceived(data) with this bound to null.

## TLSyncRoom: construction (RC)

- […] undefined values).
- […] presence-scoped type
throws at construction.
- Construction runs schema.migrateStorage in a storage transaction, migrating any pre-existing storage data up to the room's schema version. Re-running on already-migrated data changes nothing.
- Storage onChange notifications carrying a foreign transaction id make the room broadcast the new changes to all connected clients. The room's own transactions (id 'TLSyncRoom.txn') do not re-broadcast this way.
- […] wipeAll), the room closes every session so clients reconnect and re-hydrate.
- The idle timeout defaults to SESSION_IDLE_TIMEOUT (20s) and is configurable via clientTimeout. A finite positive timeout starts a periodic prune interval of min(2000, timeout/4) ms; Infinity or 0 disables the interval (pruning then only happens on message activity or via the follow-up prune scheduled when a socket close/error cancels a session, per SES2).
- close() closes every session's socket and stops background work; isClosed() reports it.

## TLSyncRoom: connect handshake (HS)

- handleNewSession registers the session in AwaitingConnectMessage state and assigns a presence id; a session re-registered under the same id keeps its previous presence id.
- […] requiresLegacyRejection (6's close protocol), 8 natively. Every accepted version below 8, including 7, is given supportsStringAppend: false. Anything below 5 (or missing) is rejected CLIENT_TOO_OLD; anything above 8 is rejected SERVER_TOO_OLD.
- […] CLIENT_TOO_OLD.
- The connect response returns the connectRequestId and isReadonly, carries the server's schema and current clock, and hydrationType: 'wipe_all' when storage cannot produce an incremental diff since the client's lastServerClock (including when that clock is in the future), else 'wipe_presence'.
- The response diff covers changes since the client's lastServerClock (the full document set in the wipe_all case), all down-migrated when the client's schema is older.
- […] Connected.

## TLSyncRoom: push handling (RP)

- Pushes from sessions that are not Connected are ignored.
- A put whose record type is not a known document type rejects the session with INVALID_RECORD.
A record that fails schema validation also rejects with INVALID_RECORD (delivered per SES4: code 4099 for modern sessions, the legacy message for protocol ≤ 6).
- A put of a new id stores the record and broadcasts a put; a put over an existing record stores the new state but broadcasts a patch containing only the computed difference; a put equal to the stored record changes nothing.
- A patch for a missing (e.g. concurrently deleted) record is silently ignored.
- A patch is applied to the stored record (AD rules) and the result validated; the broadcast patch is the recomputed effective diff, not the client's input.
- A remove of an existing record deletes it (creating a tombstone) and broadcasts the removal; a remove of a missing record is ignored.
- The push result is commit when the applied network diff equals the requested document diff exactly (a presence-only push therefore commits), discard when a requested document diff produced no change, and { rebaseWithDiff } carrying the actual (per-session-migrated) diff otherwise.
- Applied changes are broadcast as a patch to every other connected session, never echoed to the pusher (who gets the push result instead).
- […] discard); the presence portion is still processed.
- Presence records are stored with id and typeName forced server-side; they are broadcast to other sessions and never touch document storage or the document clock. onPresenceChange fires on a microtask after presence changes.
- Each message updates the session's lastInteractionTime; a ping is answered with pong.
- A TLSyncError thrown while handling a message (validation, migration) rejects only that session with the error's reason.
- When storage applies a push differently than requested (emitChanges: 'when-different'), the room broadcasts the storage's actual changes and the pusher receives them as rebaseWithDiff.
(The built-in storages never do this; the rule is observable with a custom storage.)

## TLSyncRoom: messaging and broadcast (RB)

- Data messages (patch, push_result) to a session are debounced: the first is sent immediately wrapped as { type: 'data', data: [msg] }; messages within the following DATA_MESSAGE_DEBOUNCE_INTERVAL (1000/60 ms) are buffered and flushed together as one data message. The array handed to the socket is not mutated afterwards, so sockets may serialize lazily.
- […] pong, which skips the flush.
- sendCustomMessage delivers { type: 'custom', data } to a connected session; sending to an unknown or not-yet-connected session logs a warning and does nothing.

## TLSyncRoom: session lifecycle (SES)

- pruneSessions: a Connected session idle longer than the idle timeout, or whose socket is closed, is cancelled. An AwaitingConnectMessage session older than SESSION_START_WAIT_TIME (10s), or whose socket is closed, is removed immediately. An AwaitingRemoval session older than SESSION_REMOVAL_WAIT_TIME (5s) is removed.
- Cancelling a session (handleClose) moves the session to AwaitingRemoval (keeping its presence id and meta for a quick reconnect), closes the socket, and schedules a follow-up prune.
- […] session_removed, and emits room_became_empty when it was the last session.
- rejectSession with a reason: legacy sessions (protocol ≤ 6) receive a deprecated incompatibility_error message (reason mapped: CLIENT_TOO_OLD → clientTooOld, SERVER_TOO_OLD → serverTooOld, INVALID_RECORD → invalidRecord, anything else → invalidOperation) and are then removed without a close code; modern sessions are closed with code 4099 and the reason string.
Without a reason it is a plain removal.
- getCanEmitStringAppend() is false when any connected session has supportsStringAppend: false; pushes handled in that state use legacy append mode (D5) so broadcast diffs avoid string-append ops.
- handleResumedSession registers a session directly in Connected state (no handshake): requiresDownMigrations is recomputed from the supplied schema, and a supplied presence record is restored into the presence store.
- […] [before, after] updates migrate both sides and send the re-computed patch between the migrated versions; deletes pass through. A failure rejects the session CLIENT_TOO_OLD.
- […] CLIENT_TOO_OLD.

## TLSocketRoom (SR)

- Passing both storage and initialSnapshot throws. With neither, an InMemorySyncStorage seeded from DEFAULT_INITIAL_SNAPSHOT is created; initialSnapshot (deprecated) accepts both room and store snapshots.
- The onDataChange callback is wired to storage.onChange (fires on a microtask after document changes, including programmatic ones).
- log defaults to { error: console.error } only when the log key is absent from the options object; an explicitly passed log: undefined leaves the room without a logger.
- handleSocketConnect registers the session (readonly defaults to false), attaches message/close/error listeners when the socket supports addEventListener, and creates a chunk assembler for the session.
- handleSocketMessage assembles chunks (CH rules), then for each complete message: invokes onAfterReceiveMessage, forwards to the room, and runs a prune pass (after handling, so a session is never evicted by its own message).
Assembly errors close the socket via the error path; a thrown exception rejects the session with UNKNOWN_ERROR.
- handleSocketError and handleSocketClose cancel the session (grace period applies per SES2) and clear any pending session-snapshot timer.
- getNumActiveSessions() counts all sessions including those awaiting connect/removal; getSessions() reports { sessionId, isConnected, isReadonly, meta }.
- getRecord(id) returns a deep clone of the stored record (safe to mutate), or undefined.
- getCurrentDocumentClock() returns the storage clock; getCurrentSnapshot() delegates to storage.getSnapshot and throws when the storage doesn't support it.
- loadSnapshot replaces the room contents per SL2–SL4 inside one transaction; connected clients receive the resulting changes (or are force-reconnected on a wipe per RC5).
- close() closes the room, clears session-snapshot timers, and disposes subscriptions; closeSession(sessionId, fatalReason?) behaves as SES4.
- With onSessionSnapshot configured, a session snapshot is delivered 5 seconds after that session's last message; further messages reset the timer; socket close/error cancels it.
- getSessionSnapshot returns null unless the session is Connected; the snapshot carries the serialized schema, readonly flag, legacy/append flags, presence id, and the presence record with large fields stripped (scribbles: [], chatMessage: '', selectedShapeIds: [], brush: null).
- handleSocketResume restores a session from a snapshot directly into Connected state without attaching socket listeners (hibernation environments deliver events via methods); the resumed session handles pings and pushes normally.
- getPresenceRecords() returns a map of presence id to presence record for every presence currently in the room.

## updateStore (US), deprecated

- get returns a deep clone (mutating it does nothing without put), null for missing or deleted ids.
- put validates the record against the schema (unknown types and invalid records throw).
A put deep-equal to the snapshot state cancels any pending put for that id; any put clears a pending delete.
- delete accepts a record or id, cancels any pending put, and records a delete only for ids that exist in the snapshot.
- getAll returns pending puts plus all snapshot records that are neither deleted nor shadowed, as clones.
- onChange fires only when something actually changed.
- […] 'StoreUpdateContext is closed'); updateStore on a closed room rejects.
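
As a rough illustration of the chunk framing rules stated earlier in this document (countdown prefixes, a message exactly at maxSize being chunked, and at least one content character per chunk), here is a minimal sketch. `chunkSketch` and `assembleSketch` are hypothetical helpers written for this example, not the library's `chunk` or `JsonChunkAssembler`; in particular, the sketch throws on bad input where the assembler described above returns `{ error }` objects, and it assumes maxSize is at least 1.

```typescript
// Sketch only: chunkSketch and assembleSketch are hypothetical helpers,
// not the @tldraw/sync-core implementations.
function chunkSketch(msg: string, maxSize: number): string[] {
  // Strictly less: a message exactly at maxSize is still chunked.
  if (msg.length < maxSize) return [msg]
  const chunks: string[] = []
  let end = msg.length
  // Walk backwards from the end of the message so the last chunk gets prefix "0_"
  // and the first chunk ends up with the highest countdown number.
  for (let remaining = 0; end > 0; remaining++) {
    const prefix = `${remaining}_`
    // Prefix plus body stays within maxSize, but every chunk carries at least one character.
    const bodySize = Math.max(1, maxSize - prefix.length)
    const start = Math.max(0, end - bodySize)
    chunks.unshift(prefix + msg.slice(start, end))
    end = start
  }
  return chunks
}

function assembleSketch(chunks: string[]): string {
  let previous: number | null = null
  let body = ''
  for (const chunk of chunks) {
    const sep = chunk.indexOf('_')
    const prefix = sep > 0 ? chunk.slice(0, sep) : ''
    if (!/^\d+$/.test(prefix)) throw new Error('Invalid chunk')
    const n = Number(prefix)
    // Each chunk must count down by exactly one from the chunk before it.
    if (previous !== null && n !== previous - 1) throw new Error('Chunks received in wrong order')
    previous = n
    body += chunk.slice(sep + 1)
  }
  if (previous !== 0) throw new Error('Incomplete chunk sequence')
  return body
}

// Usage: a 58-character JSON message split with maxSize 10, then reassembled.
const sampleMessage = JSON.stringify({ a: 'x'.repeat(50) })
const sampleChunks = chunkSketch(sampleMessage, 10)
console.log(sampleChunks.length, sampleChunks[0], assembleSketch(sampleChunks) === sampleMessage)
```

Building the chunks from the end of the message avoids having to know the chunk count up front: the countdown number, and therefore the prefix length, is known for each chunk at the moment it is cut.
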