cmem-sdk: Embeddable claude-mem I/O on the Server (Postgres) Runtime + `server-beta` → `server` Rename

plans/2026-05-25-cmem-sdk-and-server-rename.md

Status: implementation plan
Date: 2026-05-25
Release target: claude-mem 13.x (claude-mem/sdk export)

Relationship to prior plans:

  • Builds on plans/2026-05-07-server-beta-independent-bullmq-observation-runtime.md and plans/2026-05-07-claude-mem-server-apache-bullmq-team-auth.md (the Postgres "server" runtime).
  • Completes the deferred SDK export slot added in commit ae454cfc ("feat: add SDK exports for consumer app integration") — that commit added exports["."] and exports["./sdk"] to package.json but never added src/index.ts, never added src/sdk/index.ts, and never added a build step that emits them.
  • Removes the "beta" tag from the server runtime because the literal string server-beta is the source of silent runtime regressions (see Phase 1).

Executive Decision

The cmem-sdk is not a new system. It is the existing in-process server runtime (src/server/* + src/storage/postgres/* + the existing src/services/sync/* Chroma engine) exposed as an importable library with the HTTP/daemon/Redis shell removed. Everything the SDK needs already exists and is already daemon-free at its core; the SDK is composition + packaging, plus one careful rename.

consumer app
  └─ import { createCmemClient } from 'claude-mem/sdk'
       ├─ Postgres (pg)         ← system of record (capture, observations, sessions, jobs)  [src/storage/postgres/*]
       ├─ in-process generation ← provider.generate() (fetch) → parseAgentXml → processGeneratedResponse  [src/server/generation/*]
       ├─ Chroma (REQUIRED)     ← semantic index over the SAME observations, via uvx chroma-mcp subprocess  [src/services/sync/*]
       └─ search                ← Chroma semantic (primary). Postgres FTS exists only as a runtime safety net when Chroma transiently fails — it is NOT a feature toggle.  [src/storage/postgres/observations.ts]

Chroma is not optional. claude-mem without semantic search is broken — observations are unsearchable in the way users actually search them. The SDK MUST initialize and verify Chroma at construction; if uvx chroma-mcp cannot start, createCmemClient(...) rejects. The Postgres FTS path is preserved only to mirror SearchManager.ts:255's runtime resilience pattern (transient Chroma death mid-session); it is logged loudly when used and is not exposed as a user-configurable mode.

What the SDK must not pull in: Express, BullMQ, ioredis/Redis, better-auth, the HTTP routes, the daemon/pidfile lifecycle, the worker's bun:sqlite storage, or the Claude Code subprocess generation path. All of these are the shell around the reusable core.

The wiring hub to study is createServerBetaService() (src/server/runtime/create-server-beta-service.ts:156). It already builds exactly the object graph the SDK wants (pool → schema bootstrap → repositories), then attaches the parts the SDK drops (HTTP service, queue manager, generation worker). The SDK reproduces the graph, not the service.

Terminology Decision (inherited, enforced)

  • The domain object is an observation, never "memory". Keep observations, observation_sources, PostgresObservationRepository, /v1/observations. /v1/memories and memory_* MCP tools are aliases only.
  • The runtime is server (this plan), never "server-beta". worker remains the legacy SQLite runtime.
  • The public client is CmemClient, constructed by createCmemClient(...), imported from claude-mem/sdk.

Phase 0: Documentation Discovery

Local sources read (with what each established)

Source | Established
--- | ---
src/server/runtime/create-server-beta-service.ts | The wiring hub. Graph = pool + bootstrapServerBetaPostgresSchema + createPostgresStorageRepositories; queue manager disabled unless CLAUDE_MEM_QUEUE_ENGINE=bullmq (:255-263); generation worker disabled unless queue active + provider configured (:195-216). Env-driven provider build at :218-247.
src/storage/postgres/index.ts | createPostgresStorageRepositories(client: PostgresQueryable) :39 returns all repos.
src/storage/postgres/pool.ts | createPostgresPool(config) :13, getSharedPostgresPool({requireDatabaseUrl}) :24, withPostgresTransaction(pool, fn) :45.
src/storage/postgres/config.ts | parsePostgresConfig() :26, reads CLAUDE_MEM_SERVER_DATABASE_URL (the only connection var) + pool/SSL tuning.
src/storage/postgres/schema.ts | bootstrapServerBetaPostgresSchema(client) :22, an idempotent in-process migration runner (no extensions, no pgvector). observations table DDL :212-227: content TEXT, content_search TSVECTOR GENERATED ALWAYS, embedding JSONB (nullable, unused), GIN index :274.
src/storage/postgres/observations.ts | create(...) :72, search({projectId, teamId, query, limit}) :153 (FTS via websearch_to_tsquery + ts_rank), getByIdForScope :120, listByProject :133. No vector search. embedding is written-only-if-passed, never read.
src/storage/postgres/agent-events.ts | PostgresAgentEventsRepository.create(input) :63, CreatePostgresAgentEventInput :31.
src/storage/postgres/generation-jobs.ts | create(...) :101, transitionStatus({status}) :164 (legal transitions only: queued→processing→completed; queued→completed is illegal, :390).
src/storage/postgres/server-sessions.ts / projects.ts / teams.ts | Session/project/team repos. Projects have create({teamId,name}) :27 + getByIdForTeam(id,teamId) :46; no lookup-by-name (tenancy bootstrap implication, see Phase 3).
src/server/services/IngestEventsService.ts | ingestOne(input,{generate}) :96 writes agent_event + generation_job outbox in one tx; resolveEventQueue: () => null (IngestEventsServiceOptions :64) makes BullMQ enqueue a no-op (enqueueState='queued_only').
src/server/generation/providers/shared/types.ts | ServerGenerationProvider.generate(context, signal?) :30; ServerGenerationContext :9 = {job, events, project:{projectId,teamId,serverSessionId,projectName}}; ServerGenerationResult :23 = {rawText, modelId?, providerLabel, tokensUsed?}.
src/server/generation/providers/ClaudeObservationProvider.ts | constructor({apiKey, model?, maxOutputTokens?, fetchImpl?}) :32; plain fetch to https://api.anthropic.com/v1/messages :69. No @anthropic-ai/claude-agent-sdk, no subprocess. Default model claude-3-5-sonnet-latest :17. (Gemini/OpenRouter siblings: same shape.)
src/server/generation/processGeneratedResponse.ts | processGeneratedResponse({pool, job, rawText, modelId, providerLabel, ...}) :62 and processSessionSummaryResponse :332; both wrap withPostgresTransaction, parse via parseAgentXml, write observations + observation_sources, complete the job. Never writes embedding.
src/server/generation/ProviderObservationGenerator.ts | The reusable inline core is :196-254 (load events → load project → provider.generate → processGeneratedResponse); :72-194 is BullMQ ceremony to skip.
src/sdk/parser.ts | parseAgentXml(raw, correlationId?) :41. Calls ModeManager.getInstance().getActiveMode() at :105 with no fallback, so the SDK must initialize a mode.
src/sdk/prompts.ts | buildObservationPrompt/buildSummaryPrompt/buildInitPrompt/buildContinuationPrompt (mode-driven).
src/server/generation/providers/shared/prompt-builder.ts | buildServerGenerationPrompt(context) :42; has loadActiveModeOrFallback() :46 (graceful), unlike parser.ts.
src/services/sync/ChromaSync.ts | constructor(project) :69, collection cm__<project> :74; document layer addDocuments(ChromaDocument[]) :234 (id/document/metadata → chroma_add_documents) is storage-agnostic; syncObservation(observationId:number, …, obs:ParsedObservation, …) :306 is SQLite-shaped (integer id, StoredObservation); queryChroma(...) :855; close() :1096.
src/services/sync/ChromaMcpManager.ts | Singleton getInstance() :56; spawns uvx chroma-mcp subprocess; callTool(name, args). Local all-MiniLM embeddings, no API key.
src/services/worker/DatabaseManager.ts | Reference composition: new ChromaSync('claude-mem') :26. The worker reads a CLAUDE_MEM_CHROMA_ENABLED !== 'false' env gate; the SDK deliberately does NOT honor that gate because Chroma is required (see Executive Decision). The gate is the worker's footgun and should not propagate.
src/services/worker/SearchManager.ts | Reference search semantics: search() :140 does Chroma semantic with FTS fallback on Chroma failure (:255). The SDK mirrors this branch logic against Postgres.
src/services/hooks/runtime-selector.ts | Regression source. selectRuntime() :35 requires CLAUDE_MEM_RUNTIME === 'server-beta' exactly; else silent worker fallback (:71-78). Settings keys CLAUDE_MEM_SERVER_BETA_{URL,API_KEY,PROJECT_ID} :41-43.
src/shared/SettingsDefaultsManager.ts | Keys CLAUDE_MEM_SERVER_BETA_* :76-78, defaults :151-154; CLAUDE_MEM_RUNTIME default 'worker' :151.
src/services/worker-service.ts | Dispatch runServerBetaServiceCli :850, looks for server-beta-service.cjs :851; server <cmd> subcommand :1040.
scripts/build-hooks.js | Build target name:'server-beta-service' :16; emits dist/npx-cli + dist/opencode-plugin only and never emits dist/index.js or dist/sdk/. Has a bun:sqlite import guard precedent at :262.
package.json | exports["."] → dist/index.js and exports["./sdk"] → dist/sdk/index.js (both currently resolve to nonexistent files). pg is a prod dep and is pure-Node.

Allowed APIs (verbatim signatures — do NOT invent or extend)

Connection / boot

  • parsePostgresConfig(options?): PostgresConfig | null  [config.ts:26]
  • createPostgresPool(config: PostgresConfig): PostgresPool  [pool.ts:13]
  • getSharedPostgresPool(options?: { requireDatabaseUrl?: boolean }): PostgresPool  [pool.ts:24]
  • bootstrapServerBetaPostgresSchema(client: PostgresQueryable): Promise<void>  [schema.ts:22] (renamed in Phase 1 → bootstrapServerPostgresSchema)
  • createPostgresStorageRepositories(client: PostgresQueryable): PostgresStorageRepositories  [index.ts:39]
  • withPostgresTransaction<T>(pool, fn): Promise<T>  [pool.ts:45]
  • PostgresQueryable = { query(text, values?) }  [utils.ts:9] (a pg.Pool or pg.PoolClient satisfies it)

Capture

  • new IngestEventsService({ pool, resolveEventQueue: () => null }) then ingestOne(input, { generate })  [IngestEventsService.ts:93,96]
  • PostgresAgentEventsRepository.create(input: CreatePostgresAgentEventInput)  [agent-events.ts:63]

Generation (inline, no BullMQ)

  • new ClaudeObservationProvider({ apiKey, model? }) (or Gemini/OpenRouter)  [ClaudeObservationProvider.ts:32]
  • PostgresObservationGenerationJobRepository.transitionStatus({ id, projectId, teamId, status:'processing', lockedBy })  [generation-jobs.ts:164]
  • provider.generate({ job, events, project }, signal?)  [providers/shared/types.ts:30]
  • processGeneratedResponse({ pool, job, rawText, modelId, providerLabel, ... })  [processGeneratedResponse.ts:62]
  • processSessionSummaryResponse({ ... })  [processGeneratedResponse.ts:332]
  • parseAgentXml(raw, correlationId?)  [parser.ts:41] (requires an active ModeManager mode)

Search

  • PostgresObservationRepository.search({ projectId, teamId, query, limit? })  [observations.ts:153]
  • PostgresObservationRepository.getByIdForScope({ id, projectId, teamId }) / listByProject(...)  [observations.ts:120,133]

Chroma (REQUIRED semantic; reuse, don't fork)

  • ChromaMcpManager.getInstance() + callTool('chroma_add_documents' | 'chroma_query_documents' | 'chroma_create_collection' | 'chroma_delete_documents', args)  [ChromaMcpManager.ts:56]
  • new ChromaSync(project) + queryChroma(...) :855 + close() :1096; the ChromaDocument { id, document, metadata } + addDocuments :234 layer is the reusable, storage-agnostic seam.

Anti-patterns to guard against (this plan exists because these already bit us)

  1. Do not "build a hybrid", "adapt", "migrate", or "fork" anything. Every engine exists. The SDK is glue + packaging. If a task description contains "new system" or "reimplement", it is wrong.
  2. Do not add pgvector / a vector column / an embeddings API call. Postgres semantic search does not exist and is out of scope — semantic search is delivered by the existing Chroma engine (per explicit direction). FTS is the Postgres-side search.
  3. Do not pull Express, BullMQ, ioredis, better-auth, React, or bun:sqlite into the SDK bundle. The server generation providers use plain fetch (no @anthropic-ai/claude-agent-sdk). Enforce with a build-time import guard (Phase 2/9).
  4. Do not call transitionStatus(queued → completed) — it throws (generation-jobs.ts:390). You must transition queued → processing first (mirror lockOutbox).
  5. Do not call parseAgentXml without an active mode; parser.ts:105 throws otherwise. Initialize ModeManager (or use the prompt-builder fallback semantics) in Phase 5.
  6. Do not blind string-replace the rename. Persisted values (DB table server_beta_schema_migrations, job_type/source_type enum strings, users' settings.json keys, the CLAUDE_MEM_RUNTIME value) need backward-compat. Only code identifiers rename freely.
  7. Do not re-run grep-only subagents and synthesize across fragments. Read the wiring hub and the composition root as wholes.

Phase 1: Rename server-beta → server (foundation + regression fix)

Why first: the SDK is built on the server runtime; ship it with clean naming. Independently shippable — fixes the silent-fallback regressions on its own.

1a. Fix the regression (highest-value, smallest change)

What to implement:

  • src/services/hooks/runtime-selector.ts:34-37: accept 'server' as the canonical runtime value, and also still accept 'server-beta' for back-compat. Update SelectedRuntime/ServerBetaRuntimeContext types (:17,19) to 'server'.
  • src/server/runtime/create-server-beta-service.ts:94-98,148: validateServerBetaEnv must accept 'server' (and 'server-beta') and stop emitting/refusing on the old literal.
  • src/shared/SettingsDefaultsManager.ts:76-78,151-154: add CLAUDE_MEM_SERVER_{URL,API_KEY,PROJECT_ID} keys; read new-key-then-old-key so existing settings.json files keep working. runtime-selector.ts:41-43 reads new keys with old-key fallback.
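
The dual-accept behavior described in these three bullets can be sketched as two small pure functions. This is a minimal illustration, not the repo's code: `normalizeRuntime` and `readServerSetting` are hypothetical names, and the flat string-map settings shape is an assumption.

```typescript
// Sketch only: function names and the settings shape are assumptions.
type Runtime = "server" | "worker";

/** Treat the legacy 'server-beta' literal as an alias of the canonical 'server'. */
function normalizeRuntime(raw: string | undefined): Runtime {
  // Anything else keeps today's behavior and resolves to the worker runtime.
  return raw === "server" || raw === "server-beta" ? "server" : "worker";
}

/** Read the new settings key first, then fall back to the legacy key. */
function readServerSetting(
  settings: Record<string, string | undefined>,
  suffix: "URL" | "API_KEY" | "PROJECT_ID",
): string | undefined {
  const current = settings[`CLAUDE_MEM_SERVER_${suffix}`];
  // An empty default for the new key must not shadow a populated legacy key.
  if (current) return current;
  return settings[`CLAUDE_MEM_SERVER_BETA_${suffix}`] || undefined;
}
```

Treating an empty string as "unset" matters if the defaults manager seeds the new keys with empty values; otherwise an existing settings.json that only carries the old keys would stop resolving.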

Verification:

  • With CLAUDE_MEM_RUNTIME=server + CLAUDE_MEM_SERVER_DATABASE_URL set, hooks resolve to the server runtime (not worker). Add a unit test asserting selectRuntime()==='server' for both 'server' and 'server-beta'.
  • rg -n "=== 'server-beta'" returns no equality checks that exclude 'server'.

Anti-pattern guard: do not drop 'server-beta' acceptance — that would re-break currently-working installs.

1b. Rename code identifiers (safe, mechanical)

What to implement: rename the ~80 ServerBeta* / serverBeta* / SERVER_BETA_* code symbols (classes, types, interfaces, vars, non-persisted constants) → Server* / server* / SERVER_*. Examples: ServerBetaService→ServerService, createServerBetaService→createServerService, ServerBetaClient→ServerClient, ActiveServerBetaQueueManager→ActiveServerQueueManager, ServerBetaServiceGraph→ServerServiceGraph, bootstrapServerBetaPostgresSchema→bootstrapServerPostgresSchema, SERVER_BETA_POSTGRES_SCHEMA_VERSION→SERVER_POSTGRES_SCHEMA_VERSION. Use the enumerated list from rg -io 'server[_-]?beta[a-z0-9_]*' (saved during discovery; ~80 distinct forms).

Doc references: the full surface is ~40 files; top density ServerBetaService.ts, create-server-beta-service.ts, server-beta-client.ts, mcp-server.ts, runtime/types.ts.

Verification: npm run typecheck passes; rg -i 'serverbeta' returns 0 in code identifiers.

Anti-pattern guard: exclude persisted literals (1d) from this pass.

1c. Rename files + build/dispatch target

What to implement:

  • Rename source files: create-server-beta-service.ts→create-server-service.ts, ServerBetaService.ts→ServerService.ts, server-beta-client.ts→server-client.ts, server-beta-bootstrap.ts→server-bootstrap.ts, ActiveServerBeta*.ts→ActiveServer*.ts, scripts/e2e-server-beta-docker.sh→scripts/e2e-server-docker.sh, docs docs/server-beta-*.md→docs/server-*.md. Update all imports.
  • scripts/build-hooks.js:16: build target server-beta-service → server-service (emits plugin/scripts/server-service.cjs). Update log lines :207,448.
  • src/services/worker-service.ts:850-854,1040: runServerBetaServiceCli → runServerServiceCli, look for server-service.cjs. Keep the server <cmd> subcommand name (already correct).

Verification: npm run build succeeds and emits plugin/scripts/server-service.cjs; claude-mem server status dispatches correctly.

Anti-pattern guard: keep a fallback that still finds server-beta-service.cjs if present in an already-installed plugin cache, to avoid breaking mid-upgrade installs (or document a forced rebuild).
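
One way to express that fallback is an ordered lookup that prefers the new script name and only then tries the legacy one. A minimal sketch, assuming a hypothetical `resolveServiceScript` helper; the injectable `exists` parameter is there only to make the lookup testable.

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";

// Preferred name first; the legacy name covers already-installed plugin caches.
const SERVICE_SCRIPTS = ["server-service.cjs", "server-beta-service.cjs"];

/** Hypothetical helper: returns the first service script found, or null. */
function resolveServiceScript(
  scriptsDir: string,
  exists: (path: string) => boolean = existsSync,
): string | null {
  for (const name of SERVICE_SCRIPTS) {
    const candidate = join(scriptsDir, name);
    if (exists(candidate)) return candidate;
  }
  return null;
}
```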

1d. Persisted values — backward-compat (the hazard)

What to implement (decide per item; recommended defaults below):

  • Schema migrations table server_beta_schema_migrations (schema.ts, referenced create-server-service.ts:274): add an idempotent, guarded ALTER TABLE IF EXISTS server_beta_schema_migrations RENAME TO server_schema_migrations; at the top of bootstrapServerPostgresSchema, then create server_schema_migrations IF NOT EXISTS. Update the SELECT ... FROM server_schema_migrations read. (Zero-risk alternative: keep the physical table name, rename only the TS constant.)
  • Job job_type/source_type enum strings (server_beta_generate_event, server_beta_generate_summary, server_beta_generate_event_batch, server_beta_reindex, server_beta_observation_request): on write emit server_*; on read/match accept both server_* and legacy server_beta_*. Add a tiny normalize helper. (Zero-risk alternative: keep the persisted literals, rename only the TS constant names that hold them.)
  • Settings keys / runtime value: handled in 1a (read new-then-old). Installer writes new keys + CLAUDE_MEM_RUNTIME=server going forward.
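
The "tiny normalize helper" for job_type/source_type strings could look like the following. The helper names are hypothetical; the only facts taken from this plan are the server_beta_ and server_ prefixes.

```typescript
// Sketch only: helper names are assumptions, prefixes come from the plan.
const LEGACY_PREFIX = "server_beta_";
const CURRENT_PREFIX = "server_";

/** Map a persisted legacy literal onto its post-rename form; others pass through. */
function normalizeJobType(persisted: string): string {
  return persisted.startsWith(LEGACY_PREFIX)
    ? CURRENT_PREFIX + persisted.slice(LEGACY_PREFIX.length)
    : persisted;
}

/** Dual-accept comparison for read/match paths. */
function isJobType(persisted: string, expected: string): boolean {
  return normalizeJobType(persisted) === normalizeJobType(expected);
}
```

Writers would emit the normalized form, while readers compare through `isJobType` so that rows queued before the rename still match.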

Verification: open an existing pre-rename Postgres DB → bootstrap runs clean, the migrations row is preserved/renamed, no duplicate tables; an existing settings.json with old keys still resolves the server runtime; a queued legacy server_beta_generate_event job still processes.

Anti-pattern guard: never DROP or recreate a populated table; never rename a column that holds historical enum values without dual-accept.
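
The guarded rename of the migrations table from 1d can be sketched as a check-then-rename step. This is an illustration under assumptions: `Queryable` is a simplified stand-in for PostgresQueryable, `renameLegacyMigrationsTable` is a hypothetical name, and the sketch uses PostgreSQL's to_regclass to skip the rename when the new table already exists.

```typescript
// Simplified stand-in for PostgresQueryable (assumption, not the repo type).
interface Queryable {
  query(
    text: string,
    values?: unknown[],
  ): Promise<{ rows: Array<Record<string, unknown>> }>;
}

/** Rename the legacy table only when it exists and the new name is still free. */
async function renameLegacyMigrationsTable(
  client: Queryable,
): Promise<"renamed" | "skipped"> {
  const { rows } = await client.query(
    "SELECT to_regclass('server_beta_schema_migrations') AS legacy_table, " +
      "to_regclass('server_schema_migrations') AS current_table",
  );
  const row = rows[0] ?? {};
  const legacyExists = row["legacy_table"] != null;
  const currentExists = row["current_table"] != null;
  if (legacyExists && !currentExists) {
    await client.query(
      "ALTER TABLE server_beta_schema_migrations RENAME TO server_schema_migrations",
    );
    return "renamed";
  }
  // Fresh database, already renamed, or both present: leave everything alone.
  return "skipped";
}
```

When both tables exist, the sketch deliberately does nothing rather than merging or dropping, which matches the guard above.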


Phase 2: SDK package skeleton + build + export wiring

What to implement:

  • Create src/sdk/index.ts as the public entry (re-exports createCmemClient, CmemClient, and the public types). Leave existing src/sdk/parser.ts/prompts.ts in place (reused internally).
  • Create src/index.ts (the . export) re-exporting the SDK surface (so both claude-mem and claude-mem/sdk resolve). Keep . minimal.
  • Add a real build that emits both JS and .d.ts for the SDK targets, since npm run build does not today:
    • tsconfig.sdk.json (extends root, rootDir: src, outDir: dist, declaration: true, emitDeclarationOnly: false, types: ["node"] — drop "bun"), include only the SDK's transitive sources; or add tsup (devDep) with entries src/index.ts + src/sdk/index.ts, format: esm, dts: true, platform: node.
    • Add "build:sdk" script; chain it into build and prepublishOnly.
  • Confirm package.json exports map already matches (. → dist/index.js, ./sdk → dist/sdk/index.js); ensure files ships dist.

Doc references: broken-export evidence — package.json exports vs missing src/index.ts/src/sdk/index.ts; scripts/build-hooks.js emits only dist/npx-cli + dist/opencode-plugin; bun:sqlite guard precedent build-hooks.js:262.

Verification checklist:

  • npm run build produces dist/sdk/index.js and dist/sdk/index.d.ts.
  • From a scratch node project: import { createCmemClient } from 'claude-mem/sdk' resolves and types load.
  • Import guard: a build/test step greps the SDK bundle (or its resolved import graph) and fails if it references express, bullmq, ioredis, better-auth, react, or bun:sqlite.
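
The import guard in the last checklist item could be approximated by scanning the emitted bundle text for forbidden module specifiers. A rough sketch: `findForbiddenImports` is a hypothetical helper, and regex scanning is a heuristic that a real guard may want to replace with the bundler's resolved import graph.

```typescript
// Module list taken from this plan's forbidden-dependency lists.
const FORBIDDEN_MODULES = [
  "express",
  "bullmq",
  "ioredis",
  "better-auth",
  "react",
  "bun:sqlite",
  "@anthropic-ai/claude-agent-sdk",
];

/** Return the forbidden packages referenced by import/require specifiers. */
function findForbiddenImports(bundleSource: string): string[] {
  // Matches: from "x", import "x", import("x"), require("x")
  const specifier =
    /(?:from\s*|import\s*\(?\s*|require\s*\(\s*)["']([^"']+)["']/g;
  const hits: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = specifier.exec(bundleSource)) !== null) {
    const name = match[1] ?? "";
    // Match the package itself or any of its subpaths (e.g. "bullmq/queue").
    const found = FORBIDDEN_MODULES.find(
      (pkg) => name === pkg || name.startsWith(pkg + "/"),
    );
    if (found !== undefined && hits.indexOf(found) === -1) hits.push(found);
  }
  return hits.sort();
}
```

A build step would read dist/sdk/index.js, call this function, and fail when the returned list is non-empty.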

Anti-pattern guards: do not tsc-emit the whole repo (drags in worker/bun:sqlite); scope the SDK build to its own entrypoints. Do not add @anthropic-ai/claude-agent-sdk as an SDK dep — the server providers use fetch.

Phase 3: SDK core — connection, schema bootstrap, repositories, tenancy

What to implement (copy the graph from create-server-service.ts:156-186, minus the service/queue/worker):

  • createCmemClient(options) where options = { databaseUrl?, pool?, teamId?, projectId?, provider?, chroma?: ChromaOptions }. chroma is for tuning Chroma (collection prefix, MCP path, etc.), not for disabling it. There is no enabled: false toggle.
    • Pool: options.pool ?? createPostgresPool(parsePostgresConfig({ env: { CLAUDE_MEM_SERVER_DATABASE_URL: options.databaseUrl ?? process.env... } })!) (or getSharedPostgresPool).
    • await bootstrapServerPostgresSchema(pool) (idempotent).
    • repos = createPostgresStorageRepositories(pool).
    • Chroma required: chromaSync = new ChromaSync(projectId); await chromaSync.ensureReady() (or first addDocuments/queryChroma call). If the uvx chroma-mcp subprocess fails to start, createCmemClient REJECTS with a clear error — the SDK does not return a half-working client.
  • Tenancy bootstrap: Postgres requires teamId + projectId on every call, and ProjectsRepository has no lookup-by-name (projects.ts:46 is getByIdForTeam). So:
    • If options.teamId/projectId provided → use them.
    • Else → ensureDefaults(): create a default team (teams.create({name:'default'})) + project (projects.create({teamId, name: options.projectName ?? 'default'})) once, and persist the IDs to the SDK's local state file (e.g. $CLAUDE_MEM_DATA_DIR/sdk-tenant.json) so subsequent runs reuse them. Document that production consumers should pass explicit IDs.
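
The tenancy bootstrap can be sketched as "read the state file if it exists, otherwise create once and persist". The repository interfaces below are simplified stand-ins; in particular, the `{ id }` return shape of `create` is an assumption, as is the JSON layout of the state file.

```typescript
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { dirname } from "node:path";

interface Tenant { teamId: string; projectId: string }
// Simplified stand-ins for the team/project repositories (assumed shapes).
interface TeamsRepo { create(input: { name: string }): Promise<{ id: string }> }
interface ProjectsRepo {
  create(input: { teamId: string; name: string }): Promise<{ id: string }>;
}

/** Create the default team and project once, then reuse the persisted IDs. */
async function ensureDefaults(
  repos: { teams: TeamsRepo; projects: ProjectsRepo },
  stateFile: string,
  projectName: string = "default",
): Promise<Tenant> {
  if (existsSync(stateFile)) {
    return JSON.parse(readFileSync(stateFile, "utf8")) as Tenant;
  }
  const team = await repos.teams.create({ name: "default" });
  const project = await repos.projects.create({ teamId: team.id, name: projectName });
  const tenant: Tenant = { teamId: team.id, projectId: project.id };
  mkdirSync(dirname(stateFile), { recursive: true });
  writeFileSync(stateFile, JSON.stringify(tenant, null, 2));
  return tenant;
}
```

Because there is no lookup-by-name, the state file is the only thing preventing a second run from creating a second default project, which is why production consumers are better served by explicit IDs.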

Doc references: create-server-service.ts:162-186; pool.ts:13,24; config.ts:26; index.ts:39; teams.ts:45; projects.ts:27,46.

Verification: createCmemClient({ databaseUrl }) connects, bootstraps schema idempotently (run twice → no error), exposes client.repos, and resolves a stable {teamId, projectId}.

Anti-pattern guard: do not require Redis/bullmq env (validateServerEnv Docker checks are for the HTTP container, not the SDK — the SDK never calls it). Do not invent a getProjectByName on the repo; persist IDs instead.

Phase 4: SDK capture API

What to implement:

  • client.capture(event) / client.captureBatch(events) wrapping new IngestEventsService({ pool, resolveEventQueue: () => null }).ingestOne(input, { generate: false }) — writes the agent_event + a queued generation-job outbox row, no Redis.
  • Map the SDK's friendly event shape → CreatePostgresAgentEventInput (agent-events.ts:31): { projectId, teamId, serverSessionId?, sourceAdapter, sourceEventId?, eventType, payload, occurredAt }.
  • Optionally expose client.startSession()/endSession() via PostgresServerSessionsRepository for grouping.
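
The event mapping in the second bullet might look like this. The `AgentEventInput` field names follow the shape quoted above, but their types (for example `occurredAt` as a Date) are assumptions, and `SdkEvent` is a hypothetical "friendly" shape invented for illustration.

```typescript
// Field names mirror the shape listed in this plan; field types are assumed.
interface AgentEventInput {
  projectId: string;
  teamId: string;
  serverSessionId?: string;
  sourceAdapter: string;
  sourceEventId?: string;
  eventType: string;
  payload: unknown;
  occurredAt: Date;
}

// Hypothetical consumer-facing event shape.
interface SdkEvent {
  type: string;
  payload: unknown;
  sessionId?: string;
  id?: string;
  at?: Date;
}

/** Fill in tenancy and defaults so callers only supply what they know. */
function toAgentEventInput(
  event: SdkEvent,
  tenant: { teamId: string; projectId: string },
  now: () => Date = () => new Date(),
): AgentEventInput {
  return {
    projectId: tenant.projectId,
    teamId: tenant.teamId,
    sourceAdapter: "sdk",
    eventType: event.type,
    payload: event.payload,
    occurredAt: event.at ?? now(),
    // Optional fields are omitted entirely rather than set to undefined.
    ...(event.sessionId !== undefined ? { serverSessionId: event.sessionId } : {}),
    ...(event.id !== undefined ? { sourceEventId: event.id } : {}),
  };
}
```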

Doc references: IngestEventsService.ts:96 (ingestOne), :64 (resolveEventQueue returning null → queued_only); agent-events.ts:31,63; server-sessions.ts.

Verification: after capture(...), exactly one agent_events row and one observation_generation_jobs row (status queued) exist for the tenant; no Redis connection attempted.

Anti-pattern guard: do not enqueue to BullMQ; resolveEventQueue must return null.

Phase 5: SDK generation/compression API (inline, no worker)

What to implement (reproduce ProviderObservationGenerator.ts:196-254; skip :72-194):

  • Provider: options.provider → instantiate ClaudeObservationProvider({apiKey, model?}) (or Gemini/OpenRouter), or reuse the env-driven buildServerGenerationProviderFromEnv() logic (create-server-service.ts:218-247).
  • Ensure an active mode for parseAgentXml (parser.ts:105): initialize ModeManager with the default mode at client construction (or wrap parse with the prompt-builder.ts:46 fallback).
  • client.generate(jobOrEventId):
    1. job = transitionStatus({ id, projectId, teamId, status:'processing', lockedBy:'sdk' }) (mandatory queued→processing).
    2. load events (agentEvents.getByIdForScope/listByProject) + project (projects.getByIdForTeam).
    3. result = await provider.generate({ job, events, project:{ projectId, teamId, serverSessionId, projectName } }).
    4. await processGeneratedResponse({ pool, job, rawText: result.rawText, modelId: result.modelId, providerLabel: result.providerLabel, sourceAdapter:'sdk' }).
  • Convenience: client.captureAndGenerate(event) = Phase 4 + 5 in sequence.
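
The four generate steps can be sketched with injected dependencies so the ordering is explicit. Everything here is a simplified stand-in: `GenerationDeps`, `generateInline`, and `canTransition` are hypothetical names, the types are reduced to the fields the steps need, and the transition table lists only the transitions named in this plan.

```typescript
type JobStatus = "queued" | "processing" | "completed";

// Only the transitions this plan names; the real repository may allow more.
const LEGAL_TRANSITIONS: Record<JobStatus, JobStatus[]> = {
  queued: ["processing"],
  processing: ["completed"],
  completed: [],
};

function canTransition(from: JobStatus, to: JobStatus): boolean {
  return LEGAL_TRANSITIONS[from].indexOf(to) !== -1;
}

interface Job { id: string; status: JobStatus }
interface GenerationResult { rawText: string; modelId?: string; providerLabel: string }

// Stand-ins for the repositories, provider, and processGeneratedResponse.
interface GenerationDeps {
  transitionStatus(input: { id: string; status: JobStatus; lockedBy: string }): Promise<Job>;
  loadEvents(job: Job): Promise<unknown[]>;
  generate(context: { job: Job; events: unknown[] }): Promise<GenerationResult>;
  persist(input: { job: Job } & GenerationResult): Promise<void>;
}

/** Lock the job, load its inputs, call the provider, then persist the output. */
async function generateInline(jobId: string, deps: GenerationDeps): Promise<void> {
  // Step 1: the mandatory queued → processing transition.
  const job = await deps.transitionStatus({ id: jobId, status: "processing", lockedBy: "sdk" });
  // Step 2: load the source events for this job.
  const events = await deps.loadEvents(job);
  // Step 3: plain provider call, no queue or subprocess involved.
  const result = await deps.generate({ job, events });
  // Step 4: persisting is also what completes the job in the plan's design.
  await deps.persist({ job, ...result });
}
```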

Doc references: ProviderObservationGenerator.ts:196-254; providers/shared/types.ts:9,23,30; ClaudeObservationProvider.ts:32,69; processGeneratedResponse.ts:62,332; generation-jobs.ts:164,390; parser.ts:105; prompt-builder.ts:46.

Verification: captureAndGenerate(...) yields one observations row whose metadata carries {title,subtitle,facts,narrative,concepts,files_*}, the job ends completed, and observation_sources links it to the source agent_event.

Anti-pattern guards: no @anthropic-ai/claude-agent-sdk, no subprocess, no queued→completed, no BullMQ payload validation/locking ceremony (:85,109-156).

Phase 6: SDK search — Chroma semantic (primary) + FTS runtime safety net + context

Chroma is required (not optional). See "Executive Decision" above. The plain-FTS branch below exists only to mirror SearchManager.ts:255's catch-and-degrade-once behavior for transient Chroma death — it is NOT a feature toggle, NOT a config-disabled path, and emits a loud logger.error so the broken state is visible.

What to implement:

  • client.search({ query, limit }) mirroring SearchManager.search's branch logic (SearchManager.ts:140,255) against Postgres:
    • Default path → queryChroma(query, limit, whereFilter) → ranked observation UUIDs → hydrate via observations.getByIdForScope / batch.
    • Empty-query path → PostgresObservationRepository.listByProject(...) (no semantic intent to express).
    • On Chroma runtime failure (and ONLY runtime failure — not config): fall back to PostgresObservationRepository.search({projectId, teamId, query, limit}) (FTS), log logger.error('CHROMA', 'semantic search failed; returning degraded FTS results — investigate uvx chroma-mcp', err), and surface { degraded: true } in the response so callers can decide whether to retry or fail their own request.
  • client.context({ query, limit }) = run search, then results.map(o => o.content).join('\n\n') (copy ServerV1PostgresRoutes.ts:886-895).
  • Chroma↔Postgres glue (the only genuinely-new code, kept minimal): reuse the storage-agnostic document layer, do not reuse SQLite-shaped syncObservation:
    • On observation persist (Phase 5), index it: build a ChromaDocument { id: observation.id /*UUID string*/, document: observation.content, metadata: { projectId, teamId, kind, serverSessionId } } and call the existing chroma_add_documents path (via ChromaMcpManager.callTool or a thin ChromaSync method that takes pre-built ChromaDocuments — refactor addDocuments from private to a reusable seam if needed, ChromaSync.ts:234).
    • Use a per-tenant collection name (e.g. cm__<projectId>), reusing ChromaSync's cm__ convention (:74).
    • Sync-on-write means no SQLite backfill/watermark path is involved (ChromaSyncState/integer IDs stay SQLite-only).
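
The glue described in these bullets reduces to a pure mapping from a persisted observation to a Chroma document plus a collection-name helper. A minimal sketch: the `StoredObservation` fields and the string-only metadata type are assumptions, and both function names are hypothetical.

```typescript
// Document shape from this plan: { id, document, metadata }.
interface ChromaDocument {
  id: string;
  document: string;
  metadata: Record<string, string>;
}

// Assumed subset of a persisted Postgres observation.
interface StoredObservation {
  id: string; // UUID string, not a SQLite integer id
  content: string;
  projectId: string;
  teamId: string;
  kind: string;
  serverSessionId?: string;
}

/** Per-tenant collection name, reusing the cm__ prefix convention. */
function collectionNameFor(projectId: string): string {
  return `cm__${projectId}`;
}

/** Build the storage-agnostic document for the chroma_add_documents path. */
function toChromaDocument(observation: StoredObservation): ChromaDocument {
  const metadata: Record<string, string> = {
    projectId: observation.projectId,
    teamId: observation.teamId,
    kind: observation.kind,
  };
  // Leave the key out when there is no session instead of writing an empty value.
  if (observation.serverSessionId !== undefined) {
    metadata["serverSessionId"] = observation.serverSessionId;
  }
  return { id: observation.id, document: observation.content, metadata };
}
```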

Doc references: observations.ts:153,120,133; ServerV1PostgresRoutes.ts:886-895; ChromaSync.ts:69,74,234,855; ChromaMcpManager.ts:56; SearchManager.ts:140,255; DatabaseManager.ts:26 (enable gate CLAUDE_MEM_CHROMA_ENABLED).

Verification:

  • Chroma is required at construction: with uvx/chroma-mcp deliberately unavailable, createCmemClient(...) REJECTS. (No silent-FTS-only mode.)
  • Chroma happy path: createCmemClient + captureAndGenerate + search('semantic query') returns hydrated Postgres observations ranked by semantic distance.
  • Chroma runtime-failure path: kill chroma-mcp after a successful search, run another search; results return with { degraded: true }, a logger.error('CHROMA', …) is emitted, and a subsequent createCmemClient (cold start) REJECTS.
  • context(...) returns { observations, context } with \n\n-joined content, and surfaces { degraded: true } if its underlying search degraded.
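
The branch logic these checks exercise can be sketched with injected dependencies. `SearchDeps`, `searchObservations`, and `buildContext` are hypothetical names; the dependency methods stand in for queryChroma, the hydrate step, the FTS search, listByProject, and logger.error. Returning `degraded: false` on the healthy paths is a choice made for this sketch.

```typescript
interface Observation { id: string; content: string }
interface SearchResult { observations: Observation[]; degraded: boolean }

interface SearchDeps {
  queryChroma(query: string, limit: number): Promise<string[]>; // ranked UUIDs
  hydrate(ids: string[]): Promise<Observation[]>;
  ftsSearch(query: string, limit: number): Promise<Observation[]>;
  listRecent(limit: number): Promise<Observation[]>;
  logError(message: string, err: unknown): void;
}

async function searchObservations(
  deps: SearchDeps,
  query: string,
  limit: number = 10,
): Promise<SearchResult> {
  // Empty query: no semantic intent, so list by project.
  if (query.trim() === "") {
    return { observations: await deps.listRecent(limit), degraded: false };
  }
  let ids: string[];
  try {
    ids = await deps.queryChroma(query, limit);
  } catch (err) {
    // Only a Chroma runtime failure lands here; the fallback is logged loudly.
    deps.logError("semantic search failed; returning degraded FTS results", err);
    return { observations: await deps.ftsSearch(query, limit), degraded: true };
  }
  // Postgres hydration errors are not swallowed; they propagate to the caller.
  return { observations: await deps.hydrate(ids), degraded: false };
}

/** context() = search, then join the observation contents with blank lines. */
async function buildContext(deps: SearchDeps, query: string, limit: number = 10) {
  const result = await searchObservations(deps, query, limit);
  const context = result.observations.map((o) => o.content).join("\n\n");
  return { ...result, context };
}
```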

Anti-pattern guards: do not add pgvector; do not reuse syncObservation(observationId:number, …) (SQLite-shaped) for Postgres UUIDs; do not require an embeddings API key (Chroma embeds locally); do not add a chroma.enabled = false option (would re-introduce the silently-broken state the user explicitly rejected).

Phase 7: SDK public facade + types

What to implement:

  • CmemClient ties it together: capture, captureBatch, generate, captureAndGenerate, search, context, startSession, endSession, close() (closes pool + Chroma).
  • Public types: re-export PostgresObservation, the capture input type, search result/context types, and the relevant src/core/schemas Zod types. Keep the surface small and stable.
  • close() must await chromaSync?.close() (ChromaSync.ts:1096) and close/clean the pool (closePostgresPool, pool.ts:63) only if the SDK created it.
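
The ownership rule for close() can be sketched as a small factory. `createCloser` and its parameters are hypothetical; `closePool` stands in for whatever the SDK uses to end the pool (closePostgresPool in this plan), and making repeated calls a no-op is an assumption added for safety.

```typescript
interface Closable { close(): Promise<void> }

/** Build a close() that shuts down Chroma and ends the pool only if SDK-owned. */
function createCloser(options: {
  chroma?: Closable;
  closePool: () => Promise<void>;
  ownsPool: boolean; // false when the consumer passed options.pool
}): () => Promise<void> {
  let closed = false;
  return async function close(): Promise<void> {
    if (closed) return; // repeated calls are harmless
    closed = true;
    if (options.chroma !== undefined) await options.chroma.close();
    // A caller-supplied pool belongs to the caller and is left open.
    if (options.ownsPool) await options.closePool();
  };
}
```

At construction, `ownsPool` would be derived from whether `options.pool` was provided, so a consumer sharing one pool across its app never has it ended by the SDK.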

Verification: a single end-to-end test exercises createCmemClient → captureAndGenerate → search → context → close against a Postgres test DB.

Anti-pattern guard: no HTTP server, no pidfile, no process.exit, no daemon.

Phase 8: Tests + worker-free example app + docs

What to implement:

  • Unit/integration tests against a Postgres test DB (reuse the docker harness from the renamed scripts/e2e-server-docker.sh). Cover: schema bootstrap idempotency, capture, inline generation, Chroma semantic search, the degraded FTS fallback on Chroma runtime failure, rejection at construction when Chroma is unavailable, and tenancy bootstrap.
  • examples/sdk-node/ — a plain Node (not Bun) script that imports claude-mem/sdk, points at CLAUDE_MEM_SERVER_DATABASE_URL, and runs capture→generate→search with no worker/daemon running. This is the proof of the headline requirement.
  • Docs: docs/public/ page "Using claude-mem in your app (SDK)" + update docs.json nav.

Verification: npm test green; the example runs under node (no Bun) and prints generated observations + search hits with no worker process alive.

Anti-pattern guard: the example must not start a worker or require Redis.

Phase 9: Final verification

  1. Rename complete & safe: rg -i 'server[-_]?beta' returns only intentionally-kept persisted literals (documented in 1d) and changelog/historical plan files; npm run typecheck + npm test green; CLAUDE_MEM_RUNTIME=server reaches Postgres (regression test from 1a).
  2. No forbidden deps in SDK: automated guard confirms the claude-mem/sdk import graph excludes express, bullmq, ioredis, better-auth, react, bun:sqlite, @anthropic-ai/claude-agent-sdk.
  3. Exports real: dist/index.js, dist/index.d.ts, dist/sdk/index.js, dist/sdk/index.d.ts all exist after npm run build; resolve from an external project.
  4. No invented APIs: grep the SDK for pgvector/vector(/embedding writes (should be none); confirm generation uses fetch providers, not the agent SDK; confirm parseAgentXml is always called with an active mode.
  5. Headline requirement met: the Phase 8 example demonstrates full capture → compression → semantic+FTS search in plain Node, in-process, with no HTTP worker running.

Open questions / decisions deferred to execution

  • tsup vs tsconfig.sdk.json for the SDK build (Phase 2): pick during execution; tsup gives JS+dts in one step, tsconfig avoids a new devDep.
  • Chroma addDocuments exposure (Phase 6): refactor the private addDocuments into a reusable seam vs. call ChromaMcpManager.callTool('chroma_add_documents') directly from the SDK. Prefer the smallest change that keeps one code path for the chroma-mcp protocol.
  • Tenancy persistence (Phase 3): confirm where to store the default {teamId, projectId} (SDK state file vs. require explicit IDs in production).

Correction log

  • 2026-05-29: Plan originally framed Chroma as "optional" (lines 21, 105, Phase 3 options, Phase 6 branches, Phase 6 verification). This was wrong: claude-mem without semantic search is broken. Updated:
    • Architecture diagram + Executive Decision now mark Chroma REQUIRED.
    • createCmemClient options dropped the boolean disable; ChromaOptions is for tuning only.
    • Phase 6 default path is Chroma; FTS is a runtime safety net for transient failure that surfaces { degraded: true } and logger.error, not a feature toggle.
    • Phase 6 verification adds: createCmemClient MUST REJECT when Chroma is unavailable at construction.
    • Phase 6 anti-patterns add: no chroma.enabled = false option.