Epic: Hybrid Indexing Engine

.tasks/core/INDEX-000-indexing-file-management.md


Description

The hybrid indexing engine is Spacedrive's core filesystem discovery and processing system. It layers an ultra-fast, in-memory ephemeral index over a robust SQLite-backed persistent index, enabling instant browsing of unmanaged locations (like a file manager) while seamlessly upgrading paths to managed libraries (like a DAM) without UI flicker.
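The layering described above can be sketched roughly as follows. This is a minimal illustration, not Spacedrive's actual code: `HybridIndex`, `Entry`, `Layer`, `browse`, `lookup`, and `promote` are hypothetical names, a `u128` stands in for a real UUID type, and a second `HashMap` stands in for the SQLite-backed store.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Stand-in for a UUID (assumption: a real engine would use a proper UUID type).
pub type EntryId = u128;

#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub id: EntryId,
    pub size: u64,
}

/// Which layer answered a lookup.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Layer {
    Ephemeral,
    Persistent,
}

#[derive(Default)]
pub struct HybridIndex {
    /// Memory-only entries for unmanaged paths.
    ephemeral: HashMap<PathBuf, Entry>,
    /// Placeholder for the SQLite-backed persistent index.
    persistent: HashMap<PathBuf, Entry>,
    /// Counter used to hand out hypothetical ids.
    next_id: EntryId,
}

impl HybridIndex {
    /// Record a path seen while browsing; returns the existing id if the
    /// path is already known to either layer.
    pub fn browse(&mut self, path: &Path, size: u64) -> EntryId {
        if let Some((_, entry)) = self.lookup(path) {
            return entry.id;
        }
        self.next_id += 1;
        let id = self.next_id;
        self.ephemeral.insert(path.to_path_buf(), Entry { id, size });
        id
    }

    /// Persistent entries take precedence; otherwise fall back to the ephemeral layer.
    pub fn lookup(&self, path: &Path) -> Option<(Layer, &Entry)> {
        if let Some(entry) = self.persistent.get(path) {
            return Some((Layer::Persistent, entry));
        }
        self.ephemeral.get(path).map(|entry| (Layer::Ephemeral, entry))
    }

    /// Move an entry from the ephemeral to the persistent layer, keeping its id.
    pub fn promote(&mut self, path: &Path) -> Option<EntryId> {
        let entry = self.ephemeral.remove(path)?;
        let id = entry.id;
        self.persistent.insert(path.to_path_buf(), entry);
        Some(id)
    }
}
```

In this sketch, a caller holding an id obtained from `browse` sees the same id after `promote`, which is one plausible way a UI could avoid re-rendering when a path becomes managed.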

Architecture

  • Ephemeral Layer: Memory-resident index for instant browsing of external drives and unmanaged paths
  • Persistent Layer: SQLite-backed index with full change tracking, sync, and content analysis
  • Five-Phase Pipeline: Discovery → Processing → Aggregation → Content Identification → Finalizing
  • Change Detection: Dual-mode system with batch ChangeDetector and real-time ChangeHandler trait
  • Database Architecture: Closure tables for O(1) hierarchy queries and directory path caching
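To illustrate the closure-table idea from the list above, here is a small in-memory sketch. It assumes the usual closure-table layout of one (ancestor, descendant) row per pair, including each node paired with itself; the `ClosureTable` type and its methods are hypothetical and say nothing about Spacedrive's actual schema.

```rust
use std::collections::{HashMap, HashSet};

pub type EntryId = u64;

#[derive(Default)]
pub struct ClosureTable {
    /// One (ancestor, descendant) row per pair, self-pairs included.
    rows: HashSet<(EntryId, EntryId)>,
    /// Ancestor chain of each node (self included), reused when inserting children.
    ancestors: HashMap<EntryId, Vec<EntryId>>,
}

impl ClosureTable {
    /// Insert a node under `parent`. A missing or unknown parent is treated as a root.
    pub fn insert(&mut self, id: EntryId, parent: Option<EntryId>) {
        let mut chain = match parent {
            Some(p) => self.ancestors.get(&p).cloned().unwrap_or_default(),
            None => Vec::new(),
        };
        chain.push(id);
        // One row per ancestor of the new node, plus the self-pair.
        for &ancestor in &chain {
            self.rows.insert((ancestor, id));
        }
        self.ancestors.insert(id, chain);
    }

    /// A single hash lookup, independent of how deep the tree is.
    pub fn is_ancestor(&self, ancestor: EntryId, descendant: EntryId) -> bool {
        ancestor != descendant && self.rows.contains(&(ancestor, descendant))
    }

    /// All descendants at any depth, with no recursive traversal.
    /// This sketch scans every row; a database would use an index on the ancestor column.
    pub fn descendants(&self, ancestor: EntryId) -> Vec<EntryId> {
        let mut out: Vec<EntryId> = self
            .rows
            .iter()
            .filter(|(a, d)| *a == ancestor && *d != ancestor)
            .map(|(_, d)| *d)
            .collect();
        out.sort_unstable();
        out
    }
}
```

The trade-off is extra rows on insert (one per ancestor) in exchange for hierarchy queries that never walk the tree.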

Key Features

  • Instant browsing of millions of files in RAM (~50 bytes per entry)
  • Seamless promotion from ephemeral to persistent with UUID preservation
  • Multi-phase indexing with resumable jobs
  • Real-time filesystem watching via unified ChangeHandler
  • Intelligent rules engine with .gitignore integration
  • Index verification and integrity checking
  • Bidirectional UUID reconciliation across ephemeral and persistent layers (planned, INDEX-010)
  • Rules-free scan mode for operations requiring complete filesystem coverage (planned, INDEX-011)
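One way to picture multi-phase indexing with resumable jobs is a phase enum plus a checkpoint that records where to continue. The phase names below come from the pipeline listed under Architecture; `Checkpoint`, `IndexJob`, and the `budget` parameter are assumptions made for this sketch, and the per-phase work is left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Discovery,
    Processing,
    Aggregation,
    ContentIdentification,
    Finalizing,
}

impl Phase {
    /// The phase that follows this one, or `None` after the last phase.
    pub fn next(self) -> Option<Phase> {
        match self {
            Phase::Discovery => Some(Phase::Processing),
            Phase::Processing => Some(Phase::Aggregation),
            Phase::Aggregation => Some(Phase::ContentIdentification),
            Phase::ContentIdentification => Some(Phase::Finalizing),
            Phase::Finalizing => None,
        }
    }
}

/// State that would be saved so an interrupted job can continue later.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Checkpoint {
    /// Phase to run next; `None` means the job is finished.
    pub next: Option<Phase>,
    /// Phases already completed, in order.
    pub completed: Vec<Phase>,
}

pub struct IndexJob {
    state: Checkpoint,
}

impl IndexJob {
    /// Start a fresh job at the Discovery phase.
    pub fn new() -> Self {
        IndexJob {
            state: Checkpoint { next: Some(Phase::Discovery), completed: Vec::new() },
        }
    }

    /// Continue a job from a previously saved checkpoint.
    pub fn resume(checkpoint: Checkpoint) -> Self {
        IndexJob { state: checkpoint }
    }

    /// Run at most `budget` phases, then return a checkpoint to persist.
    pub fn run(&mut self, budget: usize) -> Checkpoint {
        for _ in 0..budget {
            let Some(phase) = self.state.next else { break };
            // The actual work for `phase` would happen here.
            self.state.completed.push(phase);
            self.state.next = phase.next();
        }
        self.state.clone()
    }
}
```

In this sketch, a job stopped after two phases hands back a checkpoint whose `next` field is `Some(Phase::Aggregation)`; passing that checkpoint to `IndexJob::resume` continues from there instead of rediscovering the filesystem.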

Child Tasks

  • INDEX-001: Hybrid Indexing Architecture - Done
  • INDEX-002: Five-Phase Indexing Pipeline - Done
  • INDEX-003: Database Architecture - Done
  • INDEX-004: Change Detection System - Done
  • INDEX-005: Indexer Rules Engine - Done
  • INDEX-006: Data Structures & Memory Optimizations - Done
  • INDEX-007: Index Verification System - Done
  • INDEX-008: Nested Locations Support - To Do
  • INDEX-009: Stale File Detection - To Do
  • INDEX-010: Bidirectional UUID Reconciliation - To Do (Critical, blocks FSYNC-003)
  • INDEX-011: Rules-Free Ephemeral Scan Mode - To Do (blocks FSYNC-003, FILE-006)
  • INDEX-012: Ephemeral Cache Parent Path Deduplication - To Do (correctness fix)