docs/api.md
REST V1 is mounted under /v1; legacy worker routes remain under /api.
Available beta endpoints:
GET /healthzGET /v1/infoGET /v1/projectsPOST /v1/projectsGET /v1/projects/:idPOST /v1/sessions/startPOST /v1/sessions/:id/endGET /v1/sessions/:idPOST /v1/eventsPOST /v1/events/batchGET /v1/events/:idPOST /v1/memoriesGET /v1/memories/:idPATCH /v1/memories/:idPOST /v1/searchPOST /v1/contextALL /v1/mcp (remote MCP recall — see below)POST /v1/keysGET /v1/connectGET /v1/usageDELETE /v1/memories/:idDELETE /v1/projects/:projectId/memoryGET /v1/audit?projectId=<id>When CLAUDE_MEM_AUTH_MODE=api-key, send Authorization: Bearer <key>. Read endpoints require memories:read; write endpoints require memories:write.
These paid-readiness guards run after auth and are opt-in via env — unset (the default) means no rate limit, no quota, and no metering, so behavior is unchanged.
CLAUDE_MEM_RATE_LIMIT_PER_MIN — max requests per API key per minute. Over the
limit returns 429 with Retry-After (and X-RateLimit-* headers). Fail-open.CLAUDE_MEM_MONTHLY_REQUEST_CAP — max requests per team per calendar month
(UTC). At the cap, returns 402 quota_exceeded. Fail-open.CLAUDE_MEM_MONTHLY_TOKEN_CAP — max provider tokens per team per month. Gates
writes only (ingestion drives generation = token spend); reads stay
available so a team over budget can still recall. 402 at the cap. Fail-open.CLAUDE_MEM_USAGE_METERING=1 — record one request usage event per
authenticated call (fire-and-forget). Token/observation metering writes to the
same usage_events table from the generation worker.GET /v1/usage returns the caller team's per-kind totals for the current month:
{ "since": "2026-06-01T00:00:00.000Z", "usage": { "request": 1280, "observation": 44 } }
POST /v1/keys (write scope) mints a read-only API key for the caller's
team and returns the paste-ready connect command. The raw key is shown once.
Body: { "expiresInDays"?: number }. Minting requires write scope so a read key
can't escalate into more keys.
{
"id": "...", "apiKey": "cm_...", "scopes": ["memories:read"], "expiresAt": null,
"mcpUrl": "https://<host>/v1/mcp",
"connectCommand": "claude mcp add --transport http claude-mem https://<host>/v1/mcp --header \"Authorization: Bearer cm_...\""
}
GET /v1/connect (read scope) returns the same command with a <YOUR_API_KEY>
placeholder (a GET never mints). mcpUrl is built from CLAUDE_MEM_PUBLIC_URL
(recommended behind a proxy) or the request host.
Cold-start note: minting the team's first key still needs a session-gated path (web dashboard). better-auth's
apiKey()plugin exists but writes to a separate store than the Postgresapi_keysthese routes authenticate against — wiring the better-auth org → Server Beta team mapping is the remaining piece.
POST /v1/events accepts two query flags that control observation generation:
generate=false — write the event but do not enqueue a generation job.wait=true — return the generationJob descriptor in the response, so
callers can poll GET /v1/jobs/:id for completion.Without wait=true, the response includes the new event row and a best-
effort generationJob field. With wait=true, the generationJob field is
always populated (or null only when generation was explicitly disabled).
The actual provider call happens in a separate BullMQ worker process
(claude-mem server worker start); the HTTP path never blocks on a
provider response.
/v1/mcp is a streamable-HTTP MCP server —
the secure, authenticated link a user pastes into Claude Code (or any MCP
client) to recall their cloud memory. It is read-only and authenticated by the
same API key as the REST routes (memories:read); the key's team (and project,
if the key is project-scoped) bound every read.
Connect:
claude mcp add --transport http claude-mem <server-base>/v1/mcp \
--header "Authorization: Bearer cm_..."
Tools:
search — { projectId, query, limit? } → matching observations (FTS, same
path as POST /v1/search).context — { projectId, query, limit? } → observations plus a concatenated
context string ready for prompt injection (same path as POST /v1/context).recent — { projectId, limit? } → the newest observations for a project.The transport is stateless: one MCP server + transport per request, so it needs no session affinity behind a load balancer. Mutating tools are intentionally absent — a pasted recall link cannot write.
Right-to-erasure. Both require write scope and are scoped to the caller's team.
DELETE /v1/memories/:id — delete a single observation (its sources cascade). 404 if it doesn't exist for the team.DELETE /v1/projects/:projectId/memory — purge ALL captured content for a project (observations, agent events, sessions, generation jobs); keeps the project shell. Returns per-table counts. 404 if the project doesn't belong to the team. Both are audited (observation.deleted / project.memory_purged).