apps/docs/concepts/super-rag.mdx
Supermemory doesn't just store your content—it transforms it into optimized, searchable knowledge. Every upload goes through an intelligent pipeline that extracts, chunks, and indexes content in the ideal way for its type.
From your side, that entire pipeline is a single call:
```typescript
// Just add content — Supermemory handles the rest
await client.add({
  content: pdfBase64,
  contentType: "pdf",
  title: "Technical Documentation"
});
```
No chunking strategies to configure. No embedding models to choose. It just works.
Different content types need different chunking strategies. Supermemory applies the optimal approach automatically:
PDFs and documents are chunked by semantic sections — headers, paragraphs, and logical boundaries. This preserves context better than arbitrary character splits.
```
├── Executive Summary (chunk 1)
├── Introduction (chunk 2)
├── Section 1: Architecture
│   ├── Overview (chunk 3)
│   └── Components (chunk 4)
└── Conclusion (chunk 5)
```
Code is chunked using code-chunk, our open-source library that understands AST (Abstract Syntax Tree) boundaries:
```typescript
// A 500-line file becomes meaningful chunks:
// - Imports + type definitions
// - Each function as a separate chunk
// - Class methods individually indexed
```
This means searching for "authentication middleware" finds the actual function, not a random slice of code.
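To make that concrete, here is a sketch of how an AST-aware splitter treats a small file. Both the file and the chunk boundaries marked in the comments are illustrative, not the exact output of code-chunk:

```typescript
// auth.ts — an illustrative file, not taken from any real codebase

// Chunk 1: imports and shared types stay together
import type { Request, Response, NextFunction } from "express";
interface Session { userId: string; expiresAt: number }

// Chunk 2: the whole middleware function is one chunk, so a search for
// "authentication middleware" returns this complete block
export function authMiddleware(req: Request, res: Response, next: NextFunction) {
  const session = readSession(req);
  if (!session || session.expiresAt < Date.now()) {
    res.status(401).json({ error: "unauthorized" });
    return;
  }
  next();
}

// Chunk 3: each helper becomes its own chunk instead of being
// cut mid-function by a character-count boundary
function readSession(req: Request): Session | null {
  const raw = req.headers.authorization;
  return raw ? (JSON.parse(raw) as Session) : null;
}
```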
URLs are fetched, cleaned of navigation/ads, and chunked by article structure — headings, paragraphs, lists.
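For example, a page can go through the same `add` call shown earlier. Treat the exact shape here (passing the URL directly as `content`) as an illustrative assumption rather than the confirmed API:

```typescript
// Hedged sketch: reuses the add() shape from the earlier example.
// Passing the URL directly as `content` is an assumption.
await client.add({
  content: "https://example.com/blog/scaling-postgres",
  title: "Scaling Postgres"
});
```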
Markdown and similar structured text is chunked by heading hierarchy, preserving the document structure.
See Content Types for the full list of supported formats.
Supermemory combines the best of both approaches in every search:
<CardGroup cols={2}>
  <Card title="Traditional RAG" icon="magnifying-glass">
    - Finds similar document chunks
    - Great for knowledge retrieval
    - Stateless — same results for everyone
  </Card>
  <Card title="Memory System" icon="brain">
    - Extracts and tracks user facts
    - Understands temporal context
    - Personalizes results per user
  </Card>
</CardGroup>

With `searchMode: "hybrid"` (the default), you get both:
```typescript
const results = await client.search({
  q: "how do I deploy the app?",
  containerTag: "user_123",
  searchMode: "hybrid"
});

// Returns:
// - Deployment docs from your knowledge base (RAG)
// - User's previous deployment preferences (Memory)
// - Their specific environment configs (Memory)
```
Two flags give you fine-grained control over result quality:
Re-scores results using a cross-encoder model for better relevance:
```typescript
const results = await client.search({
  q: "complex technical question",
  rerank: true // +~100ms, significantly better ranking
});
```
When to use: Complex queries, technical documentation, when precision matters more than speed.
Expands your query to capture more relevant results:
```typescript
const results = await client.search({
  q: "how to auth",
  rewriteQuery: true // Expands to "authentication login oauth jwt..."
});
```
When to use: Short queries, user-facing search, when recall matters.
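The two flags compose. A minimal sketch for a user-facing search box that trades a little latency for quality:

```typescript
// Combines both flags: rewriteQuery broadens recall for the short query,
// and rerank re-scores the candidates (roughly +100ms, per the note above).
const results = await client.search({
  q: "auth setup",
  containerTag: "user_123",
  rewriteQuery: true,
  rerank: true
});
```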
| Traditional RAG | SUPER RAG |
|---|---|
| Manual chunking config | Automatic per content type |
| One-size-fits-all splits | AST-aware code chunking |
| Just document retrieval | Hybrid memory + documents |
| Static embeddings | Relationship-aware graph |
| Generic search | Rerank + query rewriting |
You focus on building your product. Supermemory handles the RAG complexity.