Back to Ruflo

ADR-077: DiskANN Vector Search Backend

v3/implementation/adrs/ADR-077-diskann-vector-backend.md

3.6.303.9 KB
Original Source

ADR-077: DiskANN Vector Search Backend

Status: Implemented Date: 2026-04-07 Branch: feat/diskann-vector-backend

Context

ruflo's vector search uses three backends with different tradeoffs. With @ruvector/[email protected] now published (5-platform native binaries), we have a Vamana graph-based SSD-friendly alternative to HNSW.

Benchmark Results (measured, not theoretical)

1,000 vectors, 384 dims, k=10, 100 queries

BackendInsertBuildSearchQPSRecall@10Memory
DiskANN0.57ms1,324ms16.5ms6,0481.000-1.1MB*
HNSW4,662ms0ms12.7ms7,8500.1200.5MB
Cosine-JS0.89ms0ms64.6ms1,5481.0000.4MB

5,000 vectors, 384 dims, k=10, 50 queries

BackendInsertBuildSearchQPSRecall@10Memory
DiskANN2.12ms15,955ms20ms2,5010.8740.9MB
HNSW24,614ms0ms8.9ms5,6360.0261.0MB
Cosine-JS6.84ms0ms155ms3231.0001.0MB

*Negative memory = GC reclaimed during benchmark

Analysis

  • DiskANN: Perfect recall at 1K vectors (1.000), strong at 5K (0.874). Insert is 8,000x faster than HNSW. Build step is expensive (1-16s) but only needed once. QPS competitive.
  • HNSW (@ruvector/router): Fastest search but very low recall (0.12 at 1K, 0.026 at 5K) — the score-as-distance inversion bug may still affect recall measurement. Very slow insert (4.6s for 1K).
  • Cosine-JS: Perfect recall (brute force) but slowest search. Best for small datasets (<500 vectors).

Decision

Add @ruvector/diskann as an optional backend with automatic fallback:

DiskANN (native, Vamana graph) → HNSW (@ruvector/router) → Cosine-JS (pure JS)

Selection criteria

Dataset sizeRecommended backendReason
< 500 vectorsCosine-JSPerfect recall, fast enough
500 - 50K vectorsDiskANNHigh recall + reasonable QPS
> 50K vectorsDiskANN with PQSSD-friendly, sub-linear memory

Implementation

Files

  • v3/@claude-flow/cli/src/ruvector/diskann-backend.ts — unified backend with auto-selection, fallback chain, benchmark utility

API

typescript
import { insertVector, searchVectors, buildIndex, benchmark } from './ruvector/diskann-backend.js';

// Insert vectors
await insertVector('doc-1', embedding, { dim: 384 });
await insertVector('doc-2', embedding2, { dim: 384 });

// Build index (required for DiskANN)
await buildIndex({ dim: 384 });

// Search
const results = await searchVectors(queryEmbedding, 10, { dim: 384 });
// → [{ id: 'doc-1', distance: 0.02, score: 0.98 }, ...]

// Benchmark
const bench = await benchmark({ dim: 384, vectorCount: 1000, k: 10 });

DiskANN-specific features

  • Vamana graph: Bounded-degree directed graph optimized for SSD access patterns
  • Product Quantization: Optional pqSubspaces parameter for memory compression
  • Disk persistence: save(dir) / DiskAnn.load(dir) for persistent indexes
  • Batch insert: insertBatch() for bulk loading
  • Async search: searchAsync() for non-blocking queries

Consequences

Positive

  • DiskANN provides perfect recall at 1K vectors and 87%+ at 5K
  • Insert is 8,000x faster than HNSW (0.57ms vs 4,662ms for 1K vectors)
  • Native disk persistence — no rebuild needed between sessions
  • Product quantization enables billion-scale with bounded memory
  • Graceful fallback chain: DiskANN → HNSW → Cosine-JS

Negative

  • Build step required before search (1-16s depending on dataset size)
  • Native binary dependency (5 platforms, optional)
  • Recall degrades slightly at scale (0.874 at 5K) vs brute-force (1.000)

Neutral

  • Memory usage comparable across all three backends at 5K vectors (~1MB)
  • HNSW recall issue may be a measurement artifact from distance/similarity inversion