docs/migration/platform-v2-to-v3.mdx
The new Mem0 memory algorithm is a ground-up redesign of how memories are extracted, stored, and retrieved. It scores 91.6 on LoCoMo and 93.4 on LongMemEval — a +20 and +26 point improvement over the previous algorithm — while cutting extraction latency roughly in half.
| What Changed | Before | After |
|---|---|---|
| Extraction | Two LLM passes (extract + merge) | Single-pass ADD-only (one LLM call) |
| Memory mutations | ADD, UPDATE, DELETE | ADD only — nothing is overwritten or deleted |
| Agent-generated facts | Often ignored | First-class, stored with equal weight |
| Entity linking | Not available | Entities extracted and linked across memories |
| Graph memory | Separate graph store + dashboard visualization | Replaced by built-in entity linking, no graph visuals on platform dashboard |
| Retrieval | Semantic (vector) only | Hybrid retrieval combining multiple signals |
The previous algorithm could UPDATE or DELETE existing memories during extraction. The new algorithm only adds new facts. When information changes (e.g., a user moves from New York to San Francisco), both facts are preserved with temporal context: no history is lost, and retrieval can surface whichever fact a query calls for.
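As a sketch of what ADD-only preservation implies for consumers: a query like "where does the user live?" may now return both the superseded and the current fact, and `created_at` disambiguates them. The helper below is hypothetical, not part of the SDK.

```python
from datetime import datetime

def most_recent(results):
    """Pick the newest memory from a list of search results.

    Because the ADD-only algorithm preserves superseded facts,
    both the old and the new location can come back; the
    created_at timestamp tells them apart.
    """
    return max(
        results,
        key=lambda r: datetime.fromisoformat(r["created_at"].replace("Z", "+00:00")),
    )

results = [
    {"memory": "User lives in New York", "created_at": "2025-06-01T09:00:00Z"},
    {"memory": "User moved to San Francisco", "created_at": "2026-01-15T10:30:00Z"},
]
print(most_recent(results)["memory"])  # the San Francisco fact
```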
Previously, when an agent said something like "I've booked your flight for March 3rd," the system would often ignore it and only store what the user explicitly stated. The new algorithm treats agent-generated facts as first-class memories. If your application involves agents that confirm actions, provide recommendations, or share information, you'll see significantly better recall on those interactions.
Search now uses hybrid retrieval, which improves ranking quality — especially for queries involving exact keywords, proper nouns, or entities that appear across multiple memories. The response shape is unchanged:
```json
{
  "results": [
    {
      "id": "mem-uuid",
      "memory": "User moved to San Francisco in January 2026",
      "score": 0.82,
      "metadata": {},
      "categories": ["location"]
    }
  ]
}
```
The top-level score remains a [0, 1] value. Relative ranking between results stays comparable to v2, but absolute numbers shift since the scoring method changed — retune any hard thresholds in your app against representative queries.
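One way to retune a hard threshold is to replay representative queries and inspect the new score distribution before reusing a v2-era cutoff. A minimal sketch with illustrative numbers (the filtering helper is hypothetical; in practice the scores would come from real `client.search` calls):

```python
def apply_threshold(results, threshold):
    """Keep only results whose score meets the cutoff."""
    return [r for r in results if r["score"] >= threshold]

# Scores replayed from a representative query (illustrative values).
v3_results = [
    {"memory": "...", "score": 0.82},
    {"memory": "...", "score": 0.41},
    {"memory": "...", "score": 0.07},
]

# A v2-era cutoff like 0.5 may now drop relevant results;
# compare result counts across candidate thresholds first.
print(len(apply_threshold(v3_results, 0.5)))  # 1
print(len(apply_threshold(v3_results, 0.1)))  # 2
```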
The new algorithm is available through the V3 API. The endpoints split into per-operation paths:
| Operation | SDK method | Endpoint |
|---|---|---|
| Add memories | client.add() | POST /v3/memories/add/ |
| Search memories | client.search() | POST /v3/memories/search/ |
| Get all memories (paginated) | client.get_all() | POST /v3/memories/ |
```python
from mem0 import MemoryClient

client = MemoryClient(api_key="your-api-key")

result = client.add(
    messages=[
        {"role": "user", "content": "I just moved to San Francisco from New York"},
        {"role": "assistant", "content": "That's exciting! I'll update your location preferences."},
    ],
    user_id="alice",
)

results = client.search(
    query="where does the user live?",
    filters={"user_id": "alice"},
)

page = client.get_all(filters={"user_id": "alice"}, page=1, page_size=50)
```
```bash cURL
# Add memories
curl -X POST https://api.mem0.ai/v3/memories/add/ \
-H "Authorization: Token your-api-key" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "user", "content": "I just moved to San Francisco from New York"},
{"role": "assistant", "content": "That'\''s exciting! I'\''ll update your location preferences."}
],
"user_id": "alice"
}'
# Search memories
curl -X POST https://api.mem0.ai/v3/memories/search/ \
-H "Authorization: Token your-api-key" \
-H "Content-Type: application/json" \
-d '{
"query": "where does the user live?",
"filters": {"user_id": "alice"}
}'
# List memories (paginated)
curl -X POST 'https://api.mem0.ai/v3/memories/?page=1&page_size=50' \
-H "Authorization: Token your-api-key" \
-H "Content-Type: application/json" \
  -d '{"filters": {"user_id": "alice"}}'
```
| Parameter | V1/V2 | V3 | Notes |
|---|---|---|---|
| `top_k` | Supported | Supported (1-1000, default 10) | No change |
| `threshold` | Default: none | Default: 0.1 | Pass 0.0 to disable |
| `rerank` | Default: true | Default: false | Pass true to enable (adds latency) |
| Entity IDs in search / get_all | Top-level | Inside `filters` dict | Top-level raises 400 |
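The table above doubles as a request-payload checklist. A small helper that assembles a v3 search body, keeping entity IDs inside `filters` (the helper itself is illustrative, not part of the SDK):

```python
def build_search_payload(query, user_id, top_k=10, threshold=0.1, rerank=False):
    """Assemble a v3 /memories/search/ request body.

    Entity IDs such as user_id must live inside `filters`;
    passing them top-level returns a 400 in V3.
    """
    return {
        "query": query,
        "filters": {"user_id": user_id},
        "top_k": top_k,
        "threshold": threshold,
        "rerank": rerank,
    }

payload = build_search_payload("where does the user live?", "alice", top_k=20)
assert "user_id" not in payload                  # not top-level
assert payload["filters"]["user_id"] == "alice"  # inside filters
```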
Add response — asynchronous, returns an event_id for polling:
```json
{
  "message": "Memory processing has been queued for background execution",
  "status": "PENDING",
  "event_id": "evt-uuid"
}
```
Poll status via GET /v1/event/{event_id}/ — status will be SUCCEEDED or FAILED.
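A polling loop for the add flow might look like the sketch below. The status fetcher is injected so the loop stays testable; in real code it would issue `GET /v1/event/{event_id}/` with your API key and read the `status` field.

```python
import time

def wait_for_event(event_id, fetch_status, interval=1.0, max_attempts=30):
    """Poll an event until it leaves PENDING, returning the final status.

    fetch_status(event_id) -> str is expected to return
    "PENDING", "SUCCEEDED", or "FAILED".
    """
    for _ in range(max_attempts):
        status = fetch_status(event_id)
        if status in ("SUCCEEDED", "FAILED"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"event {event_id} still pending after {max_attempts} polls")

# Fake fetcher for illustration: succeeds on the third poll.
states = iter(["PENDING", "PENDING", "SUCCEEDED"])
print(wait_for_event("evt-uuid", lambda _eid: next(states), interval=0))  # SUCCEEDED
```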
Search response — combined multi-signal score per result:
```json
{
  "results": [
    {
      "id": "mem-uuid",
      "memory": "User moved to San Francisco from New York in January 2026",
      "score": 0.82,
      "metadata": {},
      "categories": ["location"],
      "created_at": "2026-01-15T10:30:00Z",
      "updated_at": "2026-01-15T10:30:00Z"
    }
  ]
}
```
List response — paginated envelope (new in V3):
```json
{
  "count": 123,
  "next": "https://api.mem0.ai/v3/memories/?page=2&page_size=50",
  "previous": null,
  "results": [
    {
      "id": "mem-uuid",
      "memory": "...",
      "metadata": {},
      "categories": [],
      "created_at": "2026-01-15T10:30:00Z",
      "updated_at": "2026-01-15T10:30:00Z"
    }
  ]
}
```
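Iterating all pages of the new envelope means following `next` until it is null. A sketch with an injected page fetcher (hypothetical; in practice wrap `client.get_all`). It also tolerates the old bare `{"results": [...]}` shape, since `next` is simply absent there:

```python
def iter_memories(fetch_page):
    """Yield memories across all pages of the v3 list envelope.

    fetch_page(page) -> dict returns the {count, next, previous, results}
    envelope; a bare {"results": [...]} response yields one page and stops.
    """
    page = 1
    while True:
        envelope = fetch_page(page)
        yield from envelope.get("results", [])
        if not envelope.get("next"):
            break
        page += 1

# Fake two-page response for illustration.
pages = {
    1: {"count": 3, "next": "?page=2", "previous": None,
        "results": [{"id": "m1"}, {"id": "m2"}]},
    2: {"count": 3, "next": None, "previous": "?page=1",
        "results": [{"id": "m3"}]},
}
ids = [m["id"] for m in iter_memories(lambda p: pages[p])]
print(ids)  # ['m1', 'm2', 'm3']
```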
Alongside the algorithm update, the Python and TypeScript client SDKs have been cleaned up. These changes affect how you initialize and call the client.
```python
from mem0 import MemoryClient

# Before
client = MemoryClient(
    api_key="...",
    org_id="org-1",      # [REMOVED]
    project_id="proj-1"  # [REMOVED]
)
client.add(messages, user_id="alice", async_mode=True, output_format="v1.1")

# After
client = MemoryClient(api_key="...")
client.add(messages, user_id="alice")
# async_mode and output_format removed (async by default, v1.1 always)
```
Removed parameters: `org_id`, `project_id`, `api_version`, `output_format`, `async_mode`, `enable_graph`, `immutable`, `expiration_date`, `filter_memories`, `batch_size`, `force_add_only`, `includes`, `excludes`, `keyword_search`, `org_name`, `project_name`
All parameters now use camelCase (the SDK handles conversion to/from the API automatically):
```typescript
// Before
const client = new MemoryClient({
  apiKey: "...",
  organizationId: "org-1", // [REMOVED]
  projectId: "proj-1"      // [REMOVED]
});
await client.search("query", {
  user_id: "alice",   // [REMOVED] snake_case
  top_k: 20,          // [REMOVED] snake_case
  enable_graph: true  // [REMOVED]
});

// After
const client = new MemoryClient({ apiKey: "..." });
await client.search("query", {
  filters: { userId: "alice" }, // [OK] inside filters
  topK: 20                      // [OK] camelCase
});
```
Removed: OutputFormat enum, API_VERSION enum, organizationId, projectId, organizationName, projectName, enableGraph, asyncMode, outputFormat, immutable, expirationDate, filterMemories, batchSize, forceAddOnly, includes, excludes, keywordSearch
Graph memory has been replaced by built-in entity linking. The changes:
- The `enable_graph` project setting is removed: the toggle is gone from the dashboard, and the API parameter is ignored.
- Entity linking contributes to the single `score` returned on each result; there is no separate graph signal.

No migration work is required. Entity linking activates automatically for all projects on the new algorithm. Existing memories are not re-processed, but any new memories you add will be indexed for entity-based retrieval going forward.
<Note> If your application previously read graph relations from the API response (`relations` field on search results), note that this field is no longer populated. Entity relationships are now consumed indirectly through retrieval ranking, not exposed as a separate graph structure. </Note>

The top-level `score` and the `results[]` array are unchanged; existing code that reads `score` continues to work. What changed is the scoring method behind the number (multi-signal fusion instead of pure cosine), so absolute values shift even when ranking stays comparable.

`get_all` now returns a paginated envelope (`{count, next, previous, results}`) instead of a bare `{results: [...]}`. Update code that reads `response["results"]` to continue working, or switch to the client SDKs, which handle both shapes.

| Metric | Previous Algorithm | New Algorithm |
|---|---|---|
| LoCoMo Overall | 71.4 | 91.6 (+20.2) |
| LongMemEval Overall | 67.8 | 93.4 (+25.6) |
| Extraction latency (p50) | ~2.0s | ~1.0s |
| Mean tokens per query | — | 6.8-7.0K (top200) |
All benchmarks were run on a production-representative stack, deliberately avoiding frontier models so the numbers reflect real production workloads.
If you run into issues during migration or have questions about the new algorithm: