Before searching memories, you need to set up the Supermemory client:
```bash
npm install supermemory
```

```bash
pip install supermemory
```

```typescript
import Supermemory from 'supermemory';

const client = new Supermemory({
  apiKey: process.env.SUPERMEMORY_API_KEY!
});
```

```python
from supermemory import Supermemory
import os

client = Supermemory(
    api_key=os.environ.get("SUPERMEMORY_API_KEY")
)
```
`/v3/search`: Full-featured search with extensive control over ranking, filtering, thresholds, and result structure. Searches through and returns relevant documents. Offers the most flexibility.

`/v4/search`: Minimal-latency search optimized for chatbots and conversational AI. Searches through and returns memories. Simple parameters, fast responses, easy to use.
The key difference between `/v3/search` and `/v4/search` is documents vs. memories: `/v3/search` searches through documents and returns matching chunks, whereas `/v4/search` searches through the user's memories, preferences, and history.
Refer to the ingestion guide to learn more about the difference between documents and memories.
`/v3/search`: High-quality document search with extensive parameters for fine-tuning search behavior:
```json
{
  "results": [
    {
      "documentId": "doc_abc123",
      "title": "Machine Learning Fundamentals",
      "type": "pdf",
      "score": 0.89,
      "chunks": [
        {
          "content": "Machine learning is a subset of artificial intelligence...",
          "score": 0.95,
          "isRelevant": true
        }
      ],
      "metadata": {
        "category": "education",
        "author": "Dr. Smith",
        "difficulty": "beginner"
      },
      "createdAt": "2024-01-15T10:30:00Z",
      "updatedAt": "2024-01-20T14:45:00Z"
    }
  ],
  "timing": 187,
  "total": 1
}
```
The /v3/search endpoint returns the most relevant documents and chunks from those documents. Head over to the response schema page to understand more about the response structure.
`/v4/search`: Search through user memories:
Companies like Composio's Rube.app use memory search to let their MCP automate better based on the user's previous prompts.
<Info> This endpoint works best for conversational AI use cases like chatbots. </Info>

Hybrid Search Mode:
The /v4/search endpoint supports a searchMode parameter with two options:
"memories" (default): Searches only memory entries. Returns results with a memory key containing the memory content."hybrid": Searches memories first, then falls back to document chunks if needed. Returns mixed results where each result object has either a memory key (for memory results) or a chunk key (for chunk results from documents).// Hybrid search (memories + chunks)
const hybridResults = await client.search.memories({
q: "machine learning accuracy",
limit: 5,
containerTag: "research",
threshold: 0.7,
searchMode: "hybrid" // Search memories + fallback to chunks
});
```
# Hybrid search (memories + chunks)
hybrid_results = client.search.memories(
q="machine learning accuracy",
limit=5,
container_tag="research",
threshold=0.7,
search_mode="hybrid" # Search memories + fallback to chunks
)
```
# Hybrid search (memories + chunks)
curl -X POST "https://api.supermemory.ai/v4/search" \
-H "Authorization: Bearer $SUPERMEMORY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"q": "machine learning accuracy",
"limit": 5,
"containerTag": "research",
"threshold": 0.7,
"rerank": true,
"searchMode": "hybrid"
}'
```
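Since a hybrid response mixes both result shapes, callers need to branch on which key is present. A minimal sketch of that branching (the TypeScript types here are assumptions modeled on the response examples, not the SDK's own types):

```typescript
// Hypothetical result shapes for searchMode: "hybrid" (field names follow the
// response examples in this guide, but the exact types are assumptions).
type MemoryHit = { id: string; memory: string; similarity: number };
type ChunkHit = { id: string; chunk: string; score: number };
type HybridResult = MemoryHit | ChunkHit;

// Split a mixed hybrid result list into memory hits and document-chunk hits.
function partitionResults(results: HybridResult[]) {
  const memories = results.filter((r): r is MemoryHit => "memory" in r);
  const chunks = results.filter((r): r is ChunkHit => "chunk" in r);
  return { memories, chunks };
}

const { memories, chunks } = partitionResults([
  { id: "mem_1", memory: "User prefers concise answers", similarity: 0.91 },
  { id: "doc_1", chunk: "Section 2: measuring model accuracy...", score: 0.74 },
]);
console.log(memories.length, chunks.length); // 1 1
```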
```json
{
  "results": [
    {
      "id": "mem_xyz789",
      "memory": "Complete memory content about quantum computing applications...",
      "similarity": 0.87,
      "metadata": {
        "category": "research",
        "topic": "quantum-computing"
      },
      "updatedAt": "2024-01-18T09:15:00Z",
      "version": 3,
      "context": {
        "parents": [
          {
            "memory": "Earlier discussion about quantum theory basics...",
            "relation": "extends",
            "version": 2,
            "updatedAt": "2024-01-17T16:30:00Z"
          }
        ],
        "children": [
          {
            "memory": "Follow-up questions about quantum algorithms...",
            "relation": "derives",
            "version": 4,
            "updatedAt": "2024-01-19T11:20:00Z"
          }
        ]
      },
      "documents": [
        {
          "id": "doc_quantum_paper",
          "title": "Quantum Computing Applications",
          "type": "pdf",
          "createdAt": "2024-01-10T08:00:00Z"
        }
      ]
    }
  ],
  "timing": 156,
  "total": 1
}
```
The /v4/search endpoint searches through and returns memories. With searchMode="hybrid", it can also return document chunks when memories aren't found, providing comprehensive search coverage.
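The `context` field in the response above links a memory to related parent and child memories. One way to consume it is to flatten the graph into prompt-ready lines; a sketch under the assumption that the types mirror the example response:

```typescript
// Hypothetical shapes based on the example /v4/search response above;
// the real SDK types may differ.
interface ContextEntry { memory: string; relation: string; version: number }
interface MemoryResult {
  id: string;
  memory: string;
  context?: { parents?: ContextEntry[]; children?: ContextEntry[] };
}

// Flatten a memory plus its version-graph context into plain text lines.
function contextLines(result: MemoryResult): string[] {
  const lines = [result.memory];
  for (const p of result.context?.parents ?? []) lines.push(`(${p.relation}) ${p.memory}`);
  for (const c of result.context?.children ?? []) lines.push(`(${c.relation}) ${c.memory}`);
  return lines;
}

const lines = contextLines({
  id: "mem_xyz789",
  memory: "Quantum computing applications...",
  context: {
    parents: [{ memory: "Quantum theory basics...", relation: "extends", version: 2 }],
    children: [{ memory: "Quantum algorithms...", relation: "derives", version: 4 }],
  },
});
console.log(lines.length); // 3
```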
If you don't need semantic search and just want to retrieve a specific document you've uploaded by its ID, use the GET document endpoint:
```
GET /v3/documents/{id}
```

This is useful when you already know the document's ID and want, for example, its full content, processing status, metadata, or AI-generated summary:
```typescript
// Get a specific document by ID
const document = await client.documents.get("doc_abc123");

console.log(document.content);  // Full document content
console.log(document.status);   // Processing status
console.log(document.metadata); // Document metadata
console.log(document.summary);  // AI-generated summary
```
```python
# Get a specific document by ID
document = client.documents.get("doc_abc123")

print(document.content)  # Full document content
print(document.status)   # Processing status
```
```bash
curl -X GET "https://api.supermemory.ai/v3/documents/{YOUR-DOCUMENT-ID}" \
  -H "Authorization: Bearer $SUPERMEMORY_API_KEY"
```
`/v3/search` Flow:

```mermaid
graph TD
    A[Query Input] --> B{Rewrite Query?}
    B -->|Yes| C[Query Rewriting +400ms]
    B -->|No| D[Generate Embeddings]
    C --> E[Generate Rewritten Embeddings]
    D --> F[Search Execution]
    E --> F
    F --> G[Apply Filtering: metadata, categories, containerTags]
    G --> H{Rerank?}
    H -->|Yes| I[Apply Reranking]
    H -->|No| J[Build Results with Chunks]
    I --> J
    J --> K[Return Documents + Chunks + Scores]
```
`/v4/search` Flow:

```mermaid
graph TD
    A[Query Input] --> B[Query Rewriting + Embedding]
    B --> C[Parallel Search Execution]
    C --> D[Apply Filtering]
    D --> E[Merge Results]
    E --> F[Deduplication]
    F --> G{Rerank?}
    G -->|Yes| H[Apply Reranking]
    G -->|No| I[Return Memories + Similarity]
    H --> I
```
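The merge and deduplication steps in the flow above can be sketched as "keep the best-scoring hit per id." This is a simplified client-side illustration of the idea, not the actual server logic:

```typescript
interface Hit { id: string; similarity: number }

// Merge parallel search results, keeping only the best-scoring hit per id,
// then sort by similarity descending.
function dedupe(hits: Hit[]): Hit[] {
  const best = new Map<string, Hit>();
  for (const h of hits) {
    const prev = best.get(h.id);
    if (!prev || h.similarity > prev.similarity) best.set(h.id, h);
  }
  return [...best.values()].sort((a, b) => b.similarity - a.similarity);
}

const merged = dedupe([
  { id: "mem_1", similarity: 0.81 },
  { id: "mem_2", similarity: 0.77 },
  { id: "mem_1", similarity: 0.9 }, // same memory found by a second strategy
]);
console.log(merged.map(h => h.id)); // ["mem_1", "mem_2"]
```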
Thresholds control result quality vs quantity:
```typescript
// Different threshold strategies
const broadSearch = await client.search.documents({
  q: "machine learning",
  chunkThreshold: 0.2,   // Return more chunks
  documentThreshold: 0.1 // From more documents
});

const preciseSearch = await client.search.documents({
  q: "machine learning",
  chunkThreshold: 0.8,   // Only highly relevant chunks
  documentThreshold: 0.7 // From closely matching documents
});
```
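The semantics of the two thresholds can be approximated as a client-side filter: `documentThreshold` drops low-scoring documents, and `chunkThreshold` then drops low-scoring chunks within the survivors. An illustration of that behavior (not the server implementation):

```typescript
// Simplified shapes modeled on the /v3/search response (assumptions).
interface Chunk { content: string; score: number }
interface Doc { documentId: string; score: number; chunks: Chunk[] }

// Drop documents below documentThreshold, then chunks below chunkThreshold.
function applyThresholds(docs: Doc[], documentThreshold: number, chunkThreshold: number): Doc[] {
  return docs
    .filter(d => d.score >= documentThreshold)
    .map(d => ({ ...d, chunks: d.chunks.filter(c => c.score >= chunkThreshold) }));
}

const docs: Doc[] = [
  { documentId: "doc_a", score: 0.89, chunks: [{ content: "...", score: 0.95 }, { content: "...", score: 0.4 }] },
  { documentId: "doc_b", score: 0.3, chunks: [{ content: "...", score: 0.6 }] },
];
console.log(applyThresholds(docs, 0.7, 0.8)[0].chunks.length); // 1
```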
By default, Supermemory returns chunks with context (surrounding text):
```typescript
// Default: includes surrounding chunks for context
const contextualResults = await client.search.documents({
  q: "neural networks",
  onlyMatchingChunks: false // Default
});

// Precise: only the exact matching text
const exactResults = await client.search.documents({
  q: "neural networks",
  onlyMatchingChunks: true
});
```
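The `isRelevant` flag in the `/v3/search` response marks which chunks actually matched the query, so `onlyMatchingChunks: true` effectively strips the surrounding context. A simplified illustration of that effect (not the server implementation):

```typescript
// Simplified chunk shape based on the /v3/search response (an assumption).
interface Chunk { content: string; isRelevant: boolean }

// With onlyMatchingChunks true, only matching chunks survive;
// with false, the surrounding context chunks are kept as well.
function selectChunks(chunks: Chunk[], onlyMatchingChunks: boolean): Chunk[] {
  return onlyMatchingChunks ? chunks.filter(c => c.isRelevant) : chunks;
}

const chunks: Chunk[] = [
  { content: "Intro paragraph...", isRelevant: false },        // surrounding context
  { content: "Neural networks are...", isRelevant: true },     // actual match
  { content: "The next section covers...", isRelevant: false } // surrounding context
];
console.log(selectChunks(chunks, true).length);  // 1
console.log(selectChunks(chunks, false).length); // 3
```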
Query Rewriting (+400ms latency):
Reranking:
Supermemory supports two different filtering mechanisms: container tags for organizational grouping (exact matching) and metadata filters for flexible SQL-based conditions:
```typescript
// Container tags: Organizational grouping (exact array matching)
const userContent = await client.search.documents({
  q: "python tutorial",
  containerTag: "user_123" // Must match exactly
});

// Metadata filters: SQL-based queries (flexible conditions)
const filteredContent = await client.search.documents({
  q: "python tutorial",
  filters: JSON.stringify({
    AND: [
      { key: "language", value: "python", negate: false },
      { key: "difficulty", value: "beginner", negate: false }
    ]
  })
});
```
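The `AND`/`negate` filter shape evaluates roughly as "every condition must hold, with `negate` flipping the comparison." A hypothetical client-side evaluator for illustration; the server applies these conditions as SQL:

```typescript
// Filter shape mirroring the JSON.stringify payload above.
type Filter = { key: string; value: string; negate: boolean };
type FilterGroup = { AND: Filter[] };

// A metadata object matches when every filter's equality check,
// XORed with its negate flag, is true.
function matches(metadata: Record<string, string>, group: FilterGroup): boolean {
  return group.AND.every(f => (metadata[f.key] === f.value) !== f.negate);
}

const filters: FilterGroup = {
  AND: [
    { key: "language", value: "python", negate: false },
    { key: "difficulty", value: "beginner", negate: false }
  ]
};
console.log(matches({ language: "python", difficulty: "beginner" }, filters)); // true
console.log(matches({ language: "python", difficulty: "advanced" }, filters)); // false
```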