Path Locks and Crash Recovery

docs/en/concepts/09-transaction.md


OpenViking uses two simple primitives — path locks and a redo log — to guard its core write operations (rm, mv, add_resource, session.commit), ensuring that VikingFS, VectorDB, and QueueManager remain consistent even when failures occur.

Design Philosophy

OpenViking is a context database where FS is the source of truth and VectorDB is a derived index. A lost index can be rebuilt from source data, but lost source data is unrecoverable. Therefore:

Better to miss a search result than to return a bad one.

Design Principles

  1. Write-exclusive: Path locks ensure that at most one write operation touches a given path at a time
  2. On by default: All data operations automatically acquire locks; no extra configuration needed
  3. Lock as protection: LockContext acquires locks on entry, releases on exit — no undo/journal/commit semantics
  4. Only session_memory needs crash recovery: RedoLog re-executes memory extraction after a process crash
  5. Queue operations run outside locks: SemanticQueue/EmbeddingQueue enqueue operations are idempotent and retriable

Architecture

```text
Service Layer (rm / mv / add_resource / session.commit)
    |
    v
+--[LockContext async context manager]--+
|                                       |
|  1. Create LockHandle                 |
|  2. Acquire path lock (poll+timeout)  |
|  3. Execute operations (FS+VectorDB)  |
|  4. Release lock                      |
|                                       |
|  On exception: auto-release lock,     |
|  exception propagates unchanged       |
+---------------------------------------+
    |
    v
Storage Layer (VikingFS, VectorDB, QueueManager)
```

Two Core Components

Component 1: PathLockEngine + LockManager + LockContext (Path Lock System)

PathLockEngine implements file-based distributed locks with two lock types — EXACT and TREE — using fencing tokens to prevent TOCTOU races, plus automatic stale-lock detection and cleanup.

LockHandle is a lightweight lock holder token:

```python
from dataclasses import dataclass

@dataclass
class LockHandle:
    id: str                # Unique ID used to generate fencing tokens
    locks: list[str]       # Acquired lock file paths
    created_at: float      # Handle creation time
    last_active_at: float  # Last successful acquire/refresh time
```

LockManager is a global singleton managing lock lifecycle:

  • Creates/releases LockHandles
  • Background cleanup of leaked locks (in-process safety net)
  • Executes RedoLog recovery on startup

LockContext is an async context manager encapsulating the lock/unlock lifecycle:

```python
from openviking.storage.transaction import LockContext, get_lock_manager

async with LockContext(get_lock_manager(), [path], lock_mode="exact") as handle:
    # Perform operations under lock protection
    ...
# Lock automatically released on exit (including exceptions)
```

Component 2: RedoLog (Crash Recovery)

Used only for the memory extraction phase of session.commit. Writes a marker before the operation, deletes it after success, and scans for leftover markers on startup to redo.

```text
/local/_system/redo/{task_id}/redo.json
```

Memory extraction is idempotent — re-extracting from the same archive produces the same result.
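
The marker-bracketing pattern can be sketched in a few lines; `write_marker`, `run_with_redo`, and `recover` are illustrative names, not the actual OpenViking API:

```python
import json
import os

def write_marker(redo_dir: str, task_id: str, payload: dict) -> str:
    """Persist a redo marker before the risky operation starts."""
    task_dir = os.path.join(redo_dir, task_id)
    os.makedirs(task_dir, exist_ok=True)
    marker = os.path.join(task_dir, "redo.json")
    with open(marker, "w") as f:
        json.dump(payload, f)
    return marker

def run_with_redo(redo_dir: str, task_id: str, payload: dict, operation) -> None:
    marker = write_marker(redo_dir, task_id, payload)
    operation(payload)            # idempotent: safe to re-run after a crash
    os.remove(marker)             # success: marker gone, nothing to redo
    os.rmdir(os.path.dirname(marker))

def recover(redo_dir: str, operation) -> int:
    """On startup, re-run every operation whose marker survived a crash."""
    redone = 0
    for task_id in os.listdir(redo_dir):
        marker = os.path.join(redo_dir, task_id, "redo.json")
        if os.path.exists(marker):
            with open(marker) as f:
                operation(json.load(f))
            os.remove(marker)
            os.rmdir(os.path.dirname(marker))
            redone += 1
    return redone
```

Because the wrapped operation is idempotent, re-running it during recovery is always safe, even if the crash happened after the operation finished but before the marker was deleted.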

Consistency Issues and Solutions

rm(uri)

| Problem | Solution |
|---|---|
| Delete file first, then index -> file gone but index remains -> search returns non-existent file | Reverse order: delete index first, then file. Index deletion failure -> both file and index intact |

Locking strategy (depends on target type):

  • Deleting a directory: lock_mode="tree", locks the directory and its subtree
  • Deleting a file: lock_mode="exact", locks the file path itself

Operation flow:

1. Check whether target is a directory or file, choose lock mode
2. Acquire lock
3. Delete VectorDB index -> immediately invisible to search
4. Delete FS file
5. Release lock

VectorDB deletion fails -> exception thrown, lock auto-released, file and index both intact. FS deletion fails -> VectorDB already deleted but file remains, retry is safe.
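
A minimal sketch of this index-first ordering, using an in-memory stand-in for LockContext, VectorDB, and VikingFS (none of these names are the real API):

```python
import asyncio
from contextlib import asynccontextmanager

class FakeStore:
    """In-memory stand-in: `files` plays VikingFS, `index` plays VectorDB."""
    def __init__(self):
        self.files = {"viking://docs/a.md"}
        self.index = {"viking://docs/a.md"}
        self.locked = set()

    @asynccontextmanager
    async def lock(self, path, lock_mode):
        # Like LockContext: acquire on entry, release on exit, even on error.
        if path in self.locked:
            raise RuntimeError("ResourceBusyError")
        self.locked.add(path)
        try:
            yield
        finally:
            self.locked.discard(path)

    async def rm(self, uri, fail_index=False):
        mode = "tree" if uri.endswith("/") else "exact"
        async with self.lock(uri, mode):
            if fail_index:              # index delete fails -> nothing touched
                raise IOError("VectorDB unavailable")
            self.index.discard(uri)     # 1) index first: gone from search
            self.files.discard(uri)     # 2) then the file itself

store = FakeStore()
asyncio.run(store.rm("viking://docs/a.md"))
```

If the simulated index deletion fails, the exception escapes the context manager, the lock is released, and both file and index are left exactly as before.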

mv(old_uri, new_uri)

| Problem | Solution |
|---|---|
| File moved to new path but index points to old path -> search returns old path (doesn't exist) | Copy first, then update index; clean up copy on failure |

Locking strategy (handled automatically via lock_mode="mv"):

  • Moving a directory: TreeLock on the source path and ExactPathLock on the destination path
  • Moving a file: EXACT lock on both source path and destination path

Operation flow:

1. Check whether source is a directory or file, set src_is_dir
2. Acquire mv lock (internally chooses TreeLock or ExactPathLock based on src_is_dir)
3. Copy to new location (source still intact, safe)
4. If directory, remove the lock file carried over by cp into the copy
5. Update VectorDB URIs
   - Failure -> clean up copy, source and old index intact, consistent state
6. Delete source
7. Release lock
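
The copy-then-reindex-then-delete ordering can be sketched with an in-memory stand-in (illustrative names, not the real VikingFS/VectorDB API):

```python
class FakeFS:
    """`files` stands in for VikingFS content, `index` for VectorDB URIs."""
    def __init__(self):
        self.files = {"old/a.md": "content"}
        self.index = {"old/a.md"}

    def mv(self, old, new, fail_reindex=False):
        self.files[new] = self.files[old]   # 1) copy: source still intact, safe
        try:
            if fail_reindex:
                raise IOError("VectorDB unavailable")
            self.index.discard(old)         # 2) repoint the index to the copy
            self.index.add(new)
        except Exception:
            del self.files[new]             # cleanup copy: back to old state
            raise
        del self.files[old]                 # 3) only now drop the source

fs = FakeFS()
fs.mv("old/a.md", "new/a.md")
```

A failed index update deletes only the copy, leaving the source file and old index entry untouched and consistent.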

add_resource

| Problem | Solution |
|---|---|
| File moved from temp to final directory, then crash -> file exists but never searchable | Two separate paths for first-time add vs incremental update |
| Resource already on disk but rm deletes it while semantic processing / vectorization is still running -> wasted work | Lifecycle TreeLock held from finalization through processing completion |

First-time add (target does not exist) — handled in ResourceProcessor.process_resource Phase 3.5:

1. Acquire TreeLock on final_uri
   - If final_uri does not exist, check ancestor/descendant/same-path conflicts first
   - If there is no conflict, create final_uri and write final_uri/.path.ovlock as a T lock
2. Keep temp as the source directory and enqueue SemanticMsg(uri=temp, target_uri=final_uri, lifecycle_lock_handle_id=...)
3. DAG runs on temp and syncs temp content into final_uri after completion
   - Do not use raw agfs.mv(temp -> final_uri), because final_uri already exists for the lock file
4. Clean up temp directory
5. DAG starts lock refresh loop (refreshes the lock token and updates handle activity every lock_expire/2 seconds)
6. DAG complete + all embeddings done -> release TreeLock
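
Step 5's refresh loop can be sketched as follows, assuming a `refresh()` coroutine that rewrites the fencing token and bumps handle activity; the `lock_expire / 2` interval comes from the text above:

```python
import asyncio
import time

async def refresh_loop(refresh, lock_expire: float, stop: asyncio.Event) -> int:
    """Refresh the lock every lock_expire/2 seconds until `stop` is set."""
    refreshes = 0
    while not stop.is_set():
        try:
            # Sleep for half the expiry window, but wake early if stopped.
            await asyncio.wait_for(stop.wait(), timeout=lock_expire / 2)
        except asyncio.TimeoutError:
            await refresh()     # rewrite token, update last_active_at
            refreshes += 1
    return refreshes

async def demo():
    stop = asyncio.Event()
    calls = []
    async def refresh():
        calls.append(time.monotonic())
    task = asyncio.create_task(refresh_loop(refresh, lock_expire=0.04, stop=stop))
    await asyncio.sleep(0.1)    # long enough for several refresh ticks
    stop.set()                  # DAG complete: stop refreshing, release lock
    return await task

n = asyncio.run(demo())
```

Refreshing at half the expiry interval gives the lock one missed tick of slack before stale-lock detection could reclaim it.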

If summarization and indexing are both disabled, no downstream DAG takes over. In that case ResourceProcessor copies temp directory content into final_uri under the same TreeLock, deletes temp, then releases the lock. It does not call VikingFS.mv(temp, final_uri, lock_handle=handle), because move cleanup can remove the directory lock file.

During this period, rm attempting to acquire a TreeLock on the same path will fail with ResourceBusyError.

Incremental update (target already exists) — temp stays in place:

1. Acquire TreeLock on target_uri (protect existing resource)
2. Enqueue SemanticMsg(uri=temp, target_uri=final, lifecycle_lock_handle_id=...)
3. DAG runs on temp, lock refresh loop active
4. DAG completion triggers sync_diff_callback or move_temp_to_target_callback
5. Callback completes -> release TreeLock

Note: DAG callbacks do NOT wrap operations in an outer lock. Each VikingFS.rm and VikingFS.mv acquires its own lock internally; an outer lock would conflict with these inner locks and deadlock.

Both first-time add and incremental update hold only TreeLock(resource_dir). There is no ExactPathLock(resource_dir) -> TreeLock(resource_dir) handoff, so the two modes cannot accidentally release the same .path.ovlock file in the wrong scope.

Automatic naming is handled by the resource layer, not the lock service: ResourceProcessor checks exists(candidate_uri) first; occupied candidates try _1, _2, and so on. Only a non-existing candidate attempts TreeLock, without waiting. If that candidate is busy, the next suffix is tried.

Server restart recovery: SemanticMsg is persisted in QueueFS. On restart, SemanticProcessor detects that the lifecycle_lock_handle_id handle is missing from the in-memory LockManager and re-acquires a TreeLock.

Derived Semantic Files (.abstract.md / .overview.md)

.abstract.md and .overview.md are generated sidecar files, not regular user source files. Their concurrency protection has two layers:

| Problem | Solution |
|---|---|
| Multiple background tasks refresh the same directory summary and an old result overwrites a newer one | Messages for the same dirty key use coalesce_version; only the latest version may write back |
| Two latest-stage writes interleave on the sidecar files | Acquire ExactPathLock on .abstract.md and .overview.md before writing |

Example: concurrent writes to docs/a.md, docs/b.md, and docs/c.md hold separate ExactPathLocks and do not block each other. Background refresh may start multiple docs/ summary tasks, but only the latest version writes docs/.overview.md and docs/.abstract.md; stale tasks drop their results before writeback.

Memory directory summaries use the same rule. Concurrent writes to:

```text
viking://user/default/memories/preferences/theme.md
viking://user/default/memories/preferences/editor.md
```

hold separate ExactPathLocks for the two source files. Refreshing preferences/.overview.md and preferences/.abstract.md no longer needs a long TreeLock; stale background tasks are filtered by coalesce_version, and final sidecar writes briefly acquire ExactPathLock.
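
A minimal sketch of the version-gated writeback, assuming `coalesce_version` is a per-key increasing integer (class and method names are illustrative):

```python
class SummaryRefresher:
    """Drops stale background results; only the newest version writes back."""
    def __init__(self):
        self.latest = {}     # dirty key -> newest coalesce_version
        self.sidecars = {}   # sidecar path -> content actually written

    def mark_dirty(self, key: str) -> int:
        """Schedule a refresh; returns the version assigned to that task."""
        self.latest[key] = self.latest.get(key, 0) + 1
        return self.latest[key]

    def write_back(self, key: str, version: int, content: str) -> bool:
        if version != self.latest[key]:
            return False     # stale task: drop result before writeback
        # The real code would briefly take an ExactPathLock on the sidecar here.
        self.sidecars[f"{key}/.overview.md"] = content
        return True

r = SummaryRefresher()
v1 = r.mark_dirty("docs")
v2 = r.mark_dirty("docs")    # a newer refresh was scheduled for the same key
```

The older task's result is discarded at writeback time, so an old summary can never overwrite a newer one regardless of which LLM call finishes first.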

session.commit()

| Problem | Solution |
|---|---|
| Messages cleared but archive not written -> conversation data lost | Phase 1 without lock (incomplete archive has no side effects) + Phase 2 with RedoLog |

LLM calls have unpredictable latency (5s~60s+) and cannot run while holding a lock. The design therefore splits the commit into two phases:

Phase 1 — Archive (no lock):
  1. Generate archive summary (LLM)
  2. Write archive (history/archive_N/messages.jsonl + summaries)
  3. Clear messages.jsonl
  4. Clear in-memory message list

Phase 2 — Memory extraction + write (RedoLog):
  1. Write redo marker (archive_uri, session_uri, user identity)
  2. Extract memories from archived messages (LLM)
  3. Write current message state
  4. Write relations
  5. Directly enqueue SemanticQueue
  6. Delete redo marker
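
The two phases can be condensed into a sketch; every function name here is illustrative, and the marker handling mirrors the RedoLog description:

```python
def commit(session, write_marker, delete_marker, llm_summarize, llm_extract):
    # Phase 1 -- archive (no lock): an incomplete archive has no side effects.
    archive = {"summary": llm_summarize(session["messages"]),
               "messages": list(session["messages"])}
    session["archives"].append(archive)
    session["messages"].clear()

    # Phase 2 -- memory extraction (RedoLog): marker brackets the risky part,
    # so a crash mid-phase leaves a marker and the whole phase is redone.
    write_marker({"archive": len(session["archives"]) - 1})
    session["memories"].extend(llm_extract(archive["messages"]))
    delete_marker()
    return archive

session = {"messages": ["hi", "bye"], "archives": [], "memories": []}
markers = []
commit(session,
       write_marker=markers.append,
       delete_marker=markers.pop,
       llm_summarize=lambda msgs: f"{len(msgs)} messages",
       llm_extract=lambda msgs: [m.upper() for m in msgs])
```

Phase 1 needs no marker because a half-written archive is invisible until the messages are cleared; only Phase 2, which mutates durable memory state, is bracketed by the redo marker.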

Crash recovery analysis:

| Failure moment | State | Recovery action |
|---|---|---|
| During Phase 1 archive write | No marker | Incomplete archive; next commit scans history/ for index, unaffected |
| Phase 1 archive complete but messages not cleared | No marker | Archive complete + messages still present = redundant but safe |
| During Phase 2 memory extraction/write | Redo marker exists | On startup: redo extraction + write + enqueue from archive |
| Phase 2 complete | Redo marker deleted | No recovery needed |

LockContext

LockContext is an async context manager that encapsulates lock acquisition and release:

```python
from openviking.storage.transaction import LockContext, get_lock_manager

lock_manager = get_lock_manager()

# Exact lock (write operations, semantic processing)
async with LockContext(lock_manager, [path], lock_mode="exact"):
    # Perform operations...
    pass

# Tree lock (directory delete and lifecycle protection)
async with LockContext(lock_manager, [path], lock_mode="tree"):
    # Perform operations...
    pass

# MV lock (move operations)
async with LockContext(lock_manager, [src], lock_mode="mv", mv_dst_path=dst):
    # Perform operations...
    pass
```

Lock modes:

| lock_mode | Use case | Behavior |
|---|---|---|
| exact | File writes, single-file delete, sidecar writeback | Locks the specified path; conflicts with same-path locks and ancestor TreeLocks |
| tree | Directory delete, resource lifecycle, directory-level protection | Locks the subtree root; conflicts with same-path locks, descendant locks, and ancestor TreeLocks |
| mv | Move operations | Directory move: source TreeLock + destination ExactPathLock. File move: ExactPathLock on both source and destination (controlled by src_is_dir) |

Exception handling: __aexit__ always releases locks and does not swallow exceptions. Lock acquisition failure raises LockAcquisitionError.

Lock Types (EXACT vs TREE)

The lock mechanism uses two lock types to handle different conflict patterns:

| Acquiring | EXACT on same path | TREE on same path | EXACT on descendant | TREE on ancestor |
|---|---|---|---|---|
| EXACT | Conflict | Conflict | — | Conflict |
| TREE | Conflict | Conflict | Conflict | Conflict |
  • EXACT (E): Locks one concrete path. It can protect files, directory names, and not-yet-created target paths. Blocks if any ancestor holds a TreeLock.
  • TREE (T): Used for directory delete, directory move, resource lifecycle protection, and similar subtree-level operations. Logically covers the entire subtree but only writes one lock file at the root. Before acquiring, scans all descendants and ancestor directories for conflicting locks. If the target directory is missing, conflicts are checked first; only then is the directory created and locked. If a later double-check finds a new conflict, the acquire fails or retries without rolling back the empty directory.

Lock Mechanism

Lock Protocol

Lock file paths:

```text
TreeLock(path)                      -> {path}/.path.ovlock
ExactPathLock(existing dir path)    -> {path}/.path.ovlock
ExactPathLock(file or missing path) -> {parent}/.exact.ovlock.<name>.<hash>
```

Lock file content (Fencing Token):

```text
{handle_id}:{time_ns}:{lock_type}
```

Where lock_type is E (EXACT) or T (TREE).
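
A sketch of formatting and parsing this token, plus the staleness check used during acquisition (helper names are illustrative):

```python
import time

def format_token(handle_id, lock_type, now_ns=None):
    """Build a {handle_id}:{time_ns}:{lock_type} fencing token."""
    assert lock_type in ("E", "T")
    if now_ns is None:
        now_ns = time.time_ns()
    return f"{handle_id}:{now_ns}:{lock_type}"

def parse_token(token):
    # rsplit keeps any ':' inside handle_id intact.
    handle_id, time_ns, lock_type = token.rsplit(":", 2)
    return handle_id, int(time_ns), lock_type

def is_stale(token, lock_expire, now_ns):
    """Token older than lock_expire seconds? Then it may be reclaimed."""
    _, created_ns, _ = parse_token(token)
    return (now_ns - created_ns) / 1e9 > lock_expire
```

The embedded timestamp is what makes both stale-lock detection and the TOCTOU tie-break (compare on timestamp, then handle_id) possible from the lock file alone.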

Lock Acquisition (EXACT mode)

```text
loop until timeout (poll interval: 200ms):
    1. Check if target path is locked by another operation
       - Stale lock? -> remove and retry
       - Active lock? -> wait
    2. Check all ancestor directories for TREE locks
       - Stale lock? -> remove and retry
       - Active lock? -> wait
    3. Ensure the lock file's parent directory exists; create it if missing
    4. Write EXACT (E) lock file
    5. TOCTOU double-check: re-scan target path and ancestors for TREE locks
       - Conflict found: compare (timestamp, handle_id)
       - Later one (larger timestamp/handle_id) backs off (removes own lock) to prevent livelock
       - Wait and retry
    6. Verify lock file ownership (fencing token matches)
    7. Success
```

Timeout (default 0 = no-wait) raises LockAcquisitionError
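
The poll-and-timeout shape of this loop can be sketched with an in-memory lock table standing in for the lock files (heavily simplified and illustrative — no ancestor scan, TOCTOU double-check, or back-off):

```python
import time

def acquire(path, handle_id, locks, *, timeout=0.0, poll=0.2, now=time.monotonic):
    """Try to record our handle in `locks`; True on success, raise on timeout."""
    deadline = now() + timeout
    while True:
        other = locks.get(path)
        if other is None or other == handle_id:
            locks[path] = handle_id                # "write our lock file"
            if locks.get(path) == handle_id:       # fencing-token ownership check
                return True
        if now() >= deadline:
            # timeout=0 is the no-wait default: fail on the first conflict
            raise TimeoutError("LockAcquisitionError")
        time.sleep(poll)                           # poll interval between retries

locks = {}
assert acquire("docs/a.md", "h1", locks)
```

With the default `timeout=0.0` a held lock fails immediately, matching the document's no-wait default; a positive timeout turns the same loop into poll-and-retry.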

Lock Acquisition (TREE mode)

```text
loop until timeout (poll interval: 200ms):
    1. Check if target directory is locked by another operation
       - Stale lock? -> remove and retry
       - Active lock? -> wait
    2. Check all ancestor directories for TREE locks
       - Stale lock? -> remove and retry
       - Active lock? -> wait
    3. Scan all descendant directories for any locks by other operations
       - Missing target directory? -> treat as no descendant locks
       - Stale lock? -> remove and retry
       - Active lock? -> wait
    4. Ensure the target directory exists; create it if missing
    5. Write TREE (T) lock file (only one file, at the root path)
    6. TOCTOU double-check: re-scan descendants and ancestors
       - Conflict found: compare (timestamp, handle_id)
       - Later one (larger timestamp/handle_id) backs off (removes own lock) to prevent livelock
       - Wait and retry
    7. Verify lock file ownership (fencing token matches)
    8. Success
```

Timeout (default 0 = no-wait) raises LockAcquisitionError

Missing Directory Creation

The lock system may create directories so it can place lock files, but it checks for conflicts first:

1. Ancestor TreeLock / same-path lock / descendant lock conflict -> do not create the directory
2. No current conflict -> create the directory and write the lock
3. A post-write double-check finds a new conflict -> remove our own lock and fail or retry
4. Step 3 does not roll back the empty directory

Lock Expiry Cleanup

Stale lock detection: PathLockEngine checks the fencing token timestamp. Locks older than lock_expire (default 300s) are considered stale and are removed automatically during acquisition.

In-process cleanup: LockManager checks active LockHandles every 60 seconds. Handles that still own lock files but have been inactive for longer than lock_expire are force-released.
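
The 60-second sweep can be sketched as follows; the dict-based handle table and `release` callback are illustrative, with `last_active_at` matching the LockHandle field:

```python
import time

def cleanup_sweep(handles, release, lock_expire=300.0, now=None):
    """Force-release handles whose last_active_at is older than lock_expire."""
    now = time.time() if now is None else now
    for handle_id, handle in list(handles.items()):
        if now - handle["last_active_at"] > lock_expire:
            release(handle_id)      # removes the handle's lock files
            del handles[handle_id]

handles = {
    "fresh": {"last_active_at": 1_000.0},
    "leaked": {"last_active_at": 1_000.0 - 400.0},   # inactive for 400s
}
released = []
cleanup_sweep(handles, released.append, now=1_000.0)
```

This in-process sweep only catches leaks inside a live process; locks orphaned by a crash are instead reclaimed by the on-disk stale-token check described below.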

Orphan locks: Lock files left behind after a process crash are automatically removed via stale lock detection when any operation next attempts to acquire a lock on the same path.

Crash Recovery

LockManager.start() automatically scans for leftover markers in /local/_system/redo/ on startup:

| Scenario | Recovery action |
|---|---|
| session_memory extraction crash | Redo memory extraction + write + enqueue from archive |
| Crash while holding lock | Lock file remains in AGFS; stale detection auto-cleans on next acquisition (default 300s expiry) |
| Crash after enqueue, before worker processes | QueueFS SQLite persistence; worker auto-pulls after restart |
| Orphan index | Cleaned during L2 on-demand load |

Defense Summary

| Failure scenario | Defense | Recovery timing |
|---|---|---|
| Crash during operation | Lock auto-expires + stale detection | Next acquisition of same path lock |
| Crash during add_resource semantic processing | Lifecycle lock expires + SemanticProcessor re-acquires on restart | Worker restart |
| Crash during session.commit Phase 2 | RedoLog marker + redo | On restart |
| Crash after enqueue, before worker | QueueFS SQLite persistence | Worker restart |
| Orphan index | L2 on-demand load cleanup | When user accesses |

Configuration

Path locks are enabled by default with no extra configuration needed. The default behavior is no-wait: if the path is locked, LockAcquisitionError is raised immediately. To allow wait/retry, configure the storage.transaction section:

```json
{
  "storage": {
    "transaction": {
      "lock_timeout": 5.0,
      "lock_expire": 300.0
    }
  }
}
```

| Parameter | Type | Description | Default |
|---|---|---|---|
| lock_timeout | float | Lock acquisition timeout (seconds). 0 = fail immediately if locked (default); > 0 = wait/retry up to this many seconds | 0.0 |
| lock_expire | float | Lock inactivity threshold (seconds). Locks not refreshed within this window are treated as stale and reclaimed | 300.0 |

QueueFS Persistence

The lock mechanism relies on QueueFS using the SQLite backend to ensure enqueued tasks survive process restarts. This is the default configuration and requires no manual setup.