Vector Search

Two distinct vector-search paths live in this plugin. Pick the right one — they're not interchangeable.

Path	Tool family	Backing	Capacity	Latency
Large-scale corpus	`embeddings_*`	`@claude-flow/memory` HNSW (Rust/Native)	up to millions of vectors	150×–12,500× faster than brute-force, depending on N and parameters
Hot-path router	`ruvllm_hnsw_*`	WASM-backed router (v2.0.1)	~11 patterns max (`ruvllm-tools.ts:58`)	sub-ms; designed for high-priority routing, not corpus search

The "12,500×" headline applies to the large-scale embeddings_search path. The WASM router is not that path.

When to use

Need	Path
Search a corpus of N ≥ 500 documents	`embeddings_search`
Memory-constrained corpus (≥5,000 vectors)	RaBitQ quantized — see "Quantized search" below
Compare two strings	`embeddings_compare`
Hierarchical / taxonomic data	`embeddings_hyperbolic` (Poincare ball)
Route a query to one of ≤11 hot patterns	`ruvllm_hnsw_route`
Cross-namespace search	`memory_search_unified`

Standard search

Check status — mcp__claude-flow__embeddings_status to verify the embedding engine.
Initialize — mcp__claude-flow__embeddings_init if not active.
Generate — mcp__claude-flow__embeddings_generate for text input.
Search — mcp__claude-flow__embeddings_search with the query.
Compare — mcp__claude-flow__embeddings_compare to measure similarity.
Unified search — mcp__claude-flow__memory_search_unified for cross-namespace.

Quantized search (32× memory reduction)

For corpora ≥5,000 vectors and/or memory-constrained environments, use the RaBitQ 1-bit quantization workflow. Below 5,000 vectors the rebuild cost outweighs the savings — use the standard path instead.

Step	Tool	Purpose
1	`embeddings_init`	Engine warm
2	`embeddings_rabitq_build`	One-time build of the 1-bit index after corpus is loaded
3	`embeddings_rabitq_search`	Hamming-prefilter returns top-N candidate IDs (cheap)
4	`embeddings_search`	Optional exact rerank on the candidate set (full-precision)
5	`embeddings_rabitq_status`	Index health, memory footprint, build time

Note: embeddings_rabitq_search returns candidate IDs only — the rerank in step 4 is the user's responsibility (mirrors the docstring at embeddings-tools.ts:911). Without rerank, results are approximate; with rerank, you get full-precision quality at 32× lower memory.

Tuning

HNSW exposes three knobs that trade recall against latency. The "12,500×" headline assumes defaults; tune deliberately for your workload:

Profile	`efSearch`	`M`	When to use
`recall-first`	200	32	Pattern recall during planning; quality matters more than ms
`balanced` (default)	64	16	General-purpose semantic recall
`latency-first`	16	8	Hot-path routing where p99 latency matters

efSearch is passed via ruvllm_hnsw_create (ruvllm-tools.ts:64). M is registry-level today; raise as a follow-up if it should be MCP-tunable. efConstruction defaults to 200 in the lite index (hnsw-index.ts:537).

HNSW pattern router (WASM, ≤11 patterns)

For routing a small number of high-priority patterns:

mcp__claude-flow__ruvllm_hnsw_create — create the WASM index (cap ~11)
mcp__claude-flow__ruvllm_hnsw_add — add a pattern
mcp__claude-flow__ruvllm_hnsw_route — route an incoming query

This is not a corpus index. Treat it as a fast classifier over a curated set of patterns.

Hyperbolic embeddings

For hierarchical data (code trees, org charts), use mcp__claude-flow__embeddings_hyperbolic which maps to Poincare ball space. Distance is geodesic, not cosine.

CLI alternative

bash

npx @claude-flow/cli@latest embeddings search --query "authentication patterns"
npx @claude-flow/cli@latest embeddings init
npx @claude-flow/cli@latest memory search --query "your query"

Performance

Method	Speed
Brute-force scan	Baseline
HNSW (n=500, balanced)	~150× faster
HNSW (n=10,000, balanced)	~12,500× faster
RaBitQ + rerank (n=10,000)	~12,500× search speed at 32× lower memory
`ruvllm_hnsw_route` (n≤11)	sub-ms per route, fixed cost