doc/plans/2026-03-17-memory-service-surface-api.md
Define a Paperclip memory service and surface API that can sit above multiple memory backends while preserving Paperclip's control-plane requirements.
This plan is based on the external landscape summarized in doc/memory-landscape.md, the AWS AgentCore comparison captured in PAP-1274, and the current Paperclip architecture in:
- doc/SPEC-implementation.md
- doc/plugins/PLUGIN_SPEC.md
- doc/plugins/PLUGIN_AUTHORING_GUIDE.md
- packages/plugins/sdk/src/types.ts

Paperclip should add a company-scoped memory control plane with company default plus agent override resolution, shared hook delivery, and full operation attribution, while leaving extraction and storage semantics to built-ins and plugins.
Every memory binding belongs to exactly one company.
Resolution order in V1: an agent-level binding wins when present; otherwise the company default binding applies.
There is no per-project override in V1.
Project context can still appear in scope and provenance so providers can use it for retrieval and partitioning, but projects do not participate in binding selection.
No cross-company memory sharing in the initial design.
Each configured memory provider gets a stable key inside a company, for example:
- `default`
- `mem0-prod`
- `local-markdown`
- `research-kb`

Agents, tools, and background hooks resolve the active provider by key, not by hard-coded vendor logic.
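A minimal sketch of the key-based resolution described above, assuming simplified row shapes (the real `memory_binding_targets` schema may differ):

```ts
// Hypothetical row shape for illustration; field names are assumptions.
interface BindingTargetRow {
  bindingKey: string;               // e.g. "default", "mem0-prod"
  targetKind: "company" | "agent";  // V1 targets only
  targetId: string;                 // companyId or agentId
}

// V1 resolution: an agent-level target wins; otherwise fall back to the
// company-level target. Projects never participate in selection.
function resolveBindingKey(
  targets: BindingTargetRow[],
  companyId: string,
  agentId?: string,
): string | undefined {
  if (agentId) {
    const agentTarget = targets.find(
      (t) => t.targetKind === "agent" && t.targetId === agentId,
    );
    if (agentTarget) return agentTarget.bindingKey;
  }
  const companyTarget = targets.find(
    (t) => t.targetKind === "company" && t.targetId === companyId,
  );
  return companyTarget?.bindingKey;
}
```

Because projects are excluded from selection, the function never needs a `projectId` argument even though project context may still appear in scope.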
Built-ins are useful for a zero-config local path, but most providers should arrive through the existing Paperclip plugin runtime.
That keeps the core small and matches the broader Paperclip direction that specialized knowledge systems live at the edges.
Providers should not decide how Paperclip entities map to governance.
Paperclip core should own:

- binding resolution and scope normalization
- provenance and operation attribution
- the audit trail for every memory operation
Paperclip should emit a common set of memory hooks that built-ins, third-party adapters, and plugins can all use.
Those hooks should pass structured Paperclip source objects plus normalized metadata. The provider then decides how to extract from those objects.
Paperclip should not force one extraction pipeline or one canonical "memory text" transform before the provider sees the input.
Automatic capture is useful, but broad silent capture is dangerous.
Initial built-in automatic hooks should be:

- `pre_run_hydrate`
- `post_run_capture`
- `issue_comment_capture`
- `issue_document_capture`
The hook registry itself should be general enough that other providers can subscribe to the same events without core changes.
For the open-source version, changing memory bindings should not require approvals.
Paperclip should still log those changes in activity and preserve full auditability. Approval-gated memory governance can remain an enterprise or future policy layer.
Memory provider (adapter): a built-in or plugin-supplied implementation that stores and retrieves memory. Examples: mem0, Memori, MemOS, or a local markdown store.

Memory binding: a company-scoped configuration record that points to a provider and carries provider-specific config. This is the object selected by key.

Binding target: a mapping from a Paperclip target to a binding.
V1 targets:
- `company`
- `agent`

Memory scope: the normalized Paperclip scope passed into a provider request.
At minimum:
- `companyId`
- `agentId`
- `projectId`
- `issueId`
- `runId`
- `subjectId` for external or user identity
- `sessionKey` for providers that organize memory around sessions
- `namespace` for providers that need an explicit partition hint

Memory source ref: the provenance handle that explains where a memory came from.
Supported source kinds should include:
- `issue_comment`
- `issue_document`
- `issue`
- `run`
- `activity`
- `manual_note`
- `external_document`

Memory hook: a normalized trigger emitted by Paperclip when something memory-relevant happens.
Initial hook kinds:
- `pre_run_hydrate`
- `post_run_capture`
- `issue_comment_capture`
- `issue_document_capture`
- `manual_capture`

Memory operation: a normalized capture, record-write, query, browse, get, correction, or delete action performed through Paperclip.
Paperclip should log every memory operation whether the provider is local, plugin-backed, or external.
The required core should be small enough to fit memsearch, mem0, Memori, MemOS, or OpenViking, but strong enough to satisfy Paperclip's attribution and inspectability requirements.
```ts
export interface MemoryAdapterCapabilities {
  profile?: boolean;
  correction?: boolean;
  multimodal?: boolean;
  providerManagedExtraction?: boolean;
  asyncExtraction?: boolean;
  providerNativeBrowse?: boolean;
}

export interface MemoryScope {
  companyId: string;
  agentId?: string;
  projectId?: string;
  issueId?: string;
  runId?: string;
  subjectId?: string;
  sessionKey?: string;
  namespace?: string;
}

export interface MemorySourceRef {
  kind:
    | "issue_comment"
    | "issue_document"
    | "issue"
    | "run"
    | "activity"
    | "manual_note"
    | "external_document";
  companyId: string;
  issueId?: string;
  commentId?: string;
  documentKey?: string;
  runId?: string;
  activityId?: string;
  externalRef?: string;
}

export interface MemoryHookContext {
  hookKind:
    | "pre_run_hydrate"
    | "post_run_capture"
    | "issue_comment_capture"
    | "issue_document_capture"
    | "manual_capture";
  hookId: string;
  triggeredAt: string;
  actorAgentId?: string;
  heartbeatRunId?: string;
}

export interface MemorySourcePayload {
  text?: string;
  mimeType?: string;
  metadata?: Record<string, unknown>;
  object?: Record<string, unknown>;
}

export interface MemoryUsage {
  provider: string;
  biller?: string;
  model?: string;
  billingType?: "metered_api" | "subscription_included" | "subscription_overage" | "unknown";
  attributionMode?: "billed_directly" | "included_in_run" | "external_invoice" | "untracked";
  inputTokens?: number;
  cachedInputTokens?: number;
  outputTokens?: number;
  embeddingTokens?: number;
  costCents?: number;
  latencyMs?: number;
  details?: Record<string, unknown>;
}

export interface MemoryRecordHandle {
  providerKey: string;
  providerRecordId: string;
}

export interface MemoryCaptureRequest {
  bindingKey: string;
  scope: MemoryScope;
  source: MemorySourceRef;
  payload: MemorySourcePayload;
  hook?: MemoryHookContext;
  mode?: "capture_residue" | "capture_record";
  metadata?: Record<string, unknown>;
}

export interface MemoryRecordWriteRequest {
  bindingKey: string;
  scope: MemoryScope;
  source?: MemorySourceRef;
  records: Array<{
    text: string;
    summary?: string;
    metadata?: Record<string, unknown>;
  }>;
}

export interface MemoryQueryRequest {
  bindingKey: string;
  scope: MemoryScope;
  query: string;
  topK?: number;
  intent?: "agent_preamble" | "answer" | "browse";
  metadataFilter?: Record<string, unknown>;
}

export interface MemoryListRequest {
  bindingKey: string;
  scope: MemoryScope;
  cursor?: string;
  limit?: number;
  metadataFilter?: Record<string, unknown>;
}

export interface MemorySnippet {
  handle: MemoryRecordHandle;
  text: string;
  score?: number;
  summary?: string;
  source?: MemorySourceRef;
  metadata?: Record<string, unknown>;
}

export interface MemoryContextBundle {
  snippets: MemorySnippet[];
  profileSummary?: string;
  usage?: MemoryUsage[];
}

export interface MemoryListPage {
  items: MemorySnippet[];
  nextCursor?: string;
  usage?: MemoryUsage[];
}

export interface MemoryExtractionJob {
  providerJobId: string;
  status: "queued" | "running" | "succeeded" | "failed" | "cancelled";
  hookKind?: MemoryHookContext["hookKind"];
  source?: MemorySourceRef;
  error?: string;
  submittedAt?: string;
  startedAt?: string;
  finishedAt?: string;
}

export interface MemoryAdapter {
  key: string;
  capabilities: MemoryAdapterCapabilities;
  capture(req: MemoryCaptureRequest): Promise<{
    records?: MemoryRecordHandle[];
    jobs?: MemoryExtractionJob[];
    usage?: MemoryUsage[];
  }>;
  upsertRecords(req: MemoryRecordWriteRequest): Promise<{
    records?: MemoryRecordHandle[];
    usage?: MemoryUsage[];
  }>;
  query(req: MemoryQueryRequest): Promise<MemoryContextBundle>;
  list(req: MemoryListRequest): Promise<MemoryListPage>;
  get(handle: MemoryRecordHandle, scope: MemoryScope): Promise<MemorySnippet | null>;
  forget(handles: MemoryRecordHandle[], scope: MemoryScope): Promise<{ usage?: MemoryUsage[] }>;
}
```
This contract intentionally does not force a provider to expose its internal graph, file tree, or ontology. It does require enough structure for Paperclip to browse, attribute, and audit what happened.
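To ground the contract, here is a deliberately tiny in-memory provider covering part of the required surface (writes, query, get, forget). It uses simplified local types and synchronous methods rather than the async `MemoryAdapter` signatures above, and naive keyword scoring stands in for real retrieval; it is a sketch, not a reference implementation.

```ts
interface Handle { providerKey: string; providerRecordId: string }
interface StoredRecord { handle: Handle; text: string; companyId: string }

class InMemoryMemoryAdapter {
  readonly key = "local-memory";
  private records = new Map<string, StoredRecord>();
  private nextId = 1;

  // Curated durable facts; each text becomes one addressable record.
  upsertRecords(companyId: string, texts: string[]): Handle[] {
    return texts.map((text) => {
      const handle: Handle = { providerKey: this.key, providerRecordId: String(this.nextId++) };
      this.records.set(handle.providerRecordId, { handle, text, companyId });
      return handle;
    });
  }

  // Company-scoped retrieval with naive term-overlap scoring.
  query(companyId: string, query: string, topK = 3): StoredRecord[] {
    const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
    return [...this.records.values()]
      .filter((r) => r.companyId === companyId)
      .map((r) => ({ r, score: terms.filter((t) => r.text.toLowerCase().includes(t)).length }))
      .filter((s) => s.score > 0)
      .sort((a, b) => b.score - a.score)
      .slice(0, topK)
      .map((s) => s.r);
  }

  get(handle: Handle): StoredRecord | null {
    return this.records.get(handle.providerRecordId) ?? null;
  }

  forget(handles: Handle[]): void {
    for (const h of handles) this.records.delete(h.providerRecordId);
  }
}
```

Even this toy version shows why handles matter: `get` and `forget` address records by provider-issued IDs, which is exactly what Paperclip needs to mirror for browse, attribution, and deletion.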
These should be capability-gated, not required:
- `correct(handle, patch)` for natural-language correction flows
- `profile(scope)` when the provider can synthesize stable preferences or summaries
- `listExtractionJobs(scope, cursor)` when async extraction needs richer operator visibility
- `retryExtractionJob(jobId)` when a provider supports re-drive
- `explain(queryResult)` for providers that can expose retrieval traces

AWS AgentCore Memory is a useful check on whether this plan is too abstract or missing important operational surfaces.
The broad direction still looks right:
AWS separates a control plane (CreateMemory, UpdateMemory, ListMemories) from a data plane (CreateEvent, RetrieveMemoryRecords, GetMemoryRecord, ListMemoryRecords). That lines up with the Paperclip plan at a high level: provider configuration, scoped writes, scoped retrieval, provider-managed extraction as a capability, and a browse and inspect surface.
The concrete changes Paperclip should take from AWS are:
The rollout should preserve a clean separation between the control plane (binding and configuration management) and the data plane (capture, query, and browse traffic).
This keeps governance changes distinct from high-volume memory traffic.
AWS does not flatten everything into one write primitive. It distinguishes captured events from durable memory records.
Paperclip should do the same:
- `capture(...)` for raw run, comment, document, or activity residue
- `upsertRecords(...)` for curated durable facts and notes

That is a better fit for provider-managed extraction and for manual curation flows.
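The split can be illustrated with two request payloads shaped like `MemoryCaptureRequest` and `MemoryRecordWriteRequest` above; all the concrete IDs and texts here are hypothetical:

```ts
// Raw residue from a finished run: goes through capture(), where the
// provider may extract asynchronously.
const runCapture = {
  bindingKey: "default",
  scope: { companyId: "c1", agentId: "a1", runId: "r42" },
  source: { kind: "run", companyId: "c1", runId: "r42" },
  payload: {
    text: "Run transcript excerpt ...",
    metadata: { outcome: "succeeded" },
  },
  mode: "capture_residue",
};

// A curated durable fact: goes through upsertRecords() and is stored as-is.
const curatedNote = {
  bindingKey: "default",
  scope: { companyId: "c1", agentId: "a1" },
  records: [{ text: "Deploys go out on Tuesdays.", summary: "Deploy cadence" }],
};
```

The capture path carries provenance and a hook context; the record-write path carries already-distilled text, so no extraction job is implied.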
AWS exposes list and retrieve surfaces directly. Paperclip should not make browse optional at the portable layer.
The minimum portable surface should include:
- `query`
- `list`
- `get`

Provider-native graph or file browsing can remain optional beyond that.
AWS consistently uses pagination on browse-heavy APIs.
Paperclip should add cursor-based pagination to:

- `list` on the portable adapter surface
- `listExtractionJobs` where the capability exists
Prompt hydration can continue to use topK, but operator surfaces need cursors.
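A cursor-driven browse loop over a `list()`-style surface can be sketched as follows; the helper is shown synchronously for brevity (the real adapter methods are async), and the page shapes follow `MemoryListPage` above:

```ts
interface Page<T> { items: T[]; nextCursor?: string }

// Walk every page by feeding each nextCursor back into the fetcher until
// the provider stops returning one.
function pageAll<T>(fetchPage: (cursor?: string) => Page<T>): T[] {
  const all: T[] = [];
  let cursor: string | undefined;
  do {
    const page = fetchPage(cursor);
    all.push(...page.items);
    cursor = page.nextCursor;
  } while (cursor !== undefined);
  return all;
}
```

Operator UIs would normally render one page at a time rather than draining the whole corpus; the loop just shows the cursor contract.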
AWS uses actorId, sessionId, namespace, and memoryStrategyId heavily.
Paperclip should keep its own control-plane-centric model, but the adapter contract needs obvious places to map those concepts:
- `sessionKey`
- `namespace`

The provider adapter can map them to AWS or other vendor-specific identifiers without leaking those identifiers into core.
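A sketch of that mapping, assuming AgentCore-style target fields (`actorId`, `sessionId`, `namespace`); the fallback rules here are illustrative choices, not a specified behavior:

```ts
interface PortableScope {
  companyId: string;
  agentId?: string;
  subjectId?: string;
  sessionKey?: string;
  namespace?: string;
}

interface VendorIds {
  actorId: string;
  sessionId: string;
  namespace: string;
}

function toVendorIds(scope: PortableScope): VendorIds {
  return {
    // Prefer an explicit external subject; fall back to the acting agent,
    // then to a company-level synthetic actor.
    actorId: scope.subjectId ?? scope.agentId ?? "company:" + scope.companyId,
    sessionId: scope.sessionKey ?? "default-session",
    // An explicit namespace hint wins; otherwise derive a stable partition.
    namespace: scope.namespace ?? `/company/${scope.companyId}`,
  };
}
```

The point is directional: the portable scope stays the source of truth, and vendor identifiers are derived at the adapter boundary, never stored in core.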
AWS exposes extraction jobs explicitly. Paperclip should too.
Operators should be able to see:

- job status (queued, running, succeeded, failed, cancelled)
- the hook and source that triggered each job
- errors and submission, start, and finish times
Paperclip should continue to center:
- `companyId`
- `agentId`
- `projectId`
- `issueId`
- `runId`

The lesson from AWS is to support clean mapping into provider-specific models, not to let provider identifiers take over the core product model.
Paperclip should not mirror the full provider memory corpus into Postgres unless the provider is a Paperclip-managed local provider.
Paperclip core should persist:

- memory bindings and binding targets
- normalized memory operation records
- extraction job status
- the provider record handles needed for browse, attribution, and deletion
For external providers, the actual memory payload can remain in the provider.
Paperclip should expose one shared hook system for memory.
That same system must be available to:

- built-in providers
- third-party adapters
- plugin-supplied providers
Each hook delivery should include:
- the `MemoryScope`
- the `MemorySourceRef`

The payload should include structured objects where possible so the provider can decide how to extract and chunk.
These should be low-risk and easy to reason about:
pre_run_hydrate
Before an agent run starts, Paperclip may call query(... intent = "agent_preamble") using the active binding.
post_run_capture
After a run finishes, Paperclip may call capture(...) with structured run output, excerpts, and provenance.
issue_comment_capture
When enabled on the binding, Paperclip may call capture(...) for selected issue comments.
issue_document_capture
When enabled on the binding, Paperclip may call capture(...) for selected issue documents.
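As one concrete picture of the `pre_run_hydrate` side, here is a sketch of turning query results into a short run preamble; the function name, the snippet shape, and the character budget are all illustrative assumptions:

```ts
interface Snippet { text: string; score?: number }

// Fold the top-ranked snippets into a bounded preamble for the agent prompt.
function buildPreamble(snippets: Snippet[], maxChars = 500): string {
  const lines: string[] = [];
  let used = 0;
  for (const s of snippets) {
    const line = "- " + s.text;
    if (used + line.length > maxChars) break; // stay within the prompt budget
    lines.push(line);
    used += line.length;
  }
  return lines.length > 0 ? "Relevant memory:\n" + lines.join("\n") : "";
}
```

Returning an empty string when nothing relevant was retrieved keeps hydration a no-op rather than injecting boilerplate into every run.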
These should be tool-driven or UI-driven first:
- `memory.search`
- `memory.note`
- `memory.forget`
- `memory.correct`

Paperclip should give agents both automatic recall and explicit tools, with simple guidance:
- `memory.search` when the task depends on prior decisions, people, projects, or long-running context that is not in the current issue thread
- `memory.note` when a durable fact, preference, or decision should survive this run
- `memory.correct` when the user explicitly says prior context is wrong

This keeps memory available without forcing every agent prompt to become a memory-management protocol.
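The tools map cleanly onto the adapter contract; a sketch of that mapping (the function itself is hypothetical wiring, but the tool names and operations come from this plan):

```ts
type MemoryTool = "memory.search" | "memory.note" | "memory.forget" | "memory.correct";

// Route each agent-facing tool to the underlying adapter operation.
function adapterOperationFor(tool: MemoryTool): "query" | "upsertRecords" | "forget" | "correct" {
  switch (tool) {
    case "memory.search":
      return "query";         // retrieval over the active binding
    case "memory.note":
      return "upsertRecords"; // curated durable facts
    case "memory.forget":
      return "forget";
    case "memory.correct":
      return "correct";       // capability-gated optional surface
  }
}
```

Because `correct` is capability-gated, real wiring would check `capabilities.correction` before exposing `memory.correct` to an agent.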
Paperclip needs a first-class UI for memory, otherwise providers become black boxes.
The initial browse surface should support:

- listing memory records by company, agent, and binding
- inspecting a record with its source provenance
- forgetting or correcting a record from the UI
When a provider supports richer browsing, the plugin can add deeper views through the existing plugin UI surfaces.
Paperclip should treat memory accounting as two related but distinct concerns:
`memory_operations` is the authoritative audit trail. Every memory action should create a normalized operation record that captures:

- the operation kind (capture, record_upsert, query, list, get, forget, correct)
- the binding key, scope, and source provenance
- the triggering hook or actor
- usage and outcome
This is where operators answer "what memory work happened and why?"
`cost_events` remains the canonical spend ledger for billable metered usage. The current `cost_events` model already covers token and model spend, and `agent_runtime_state` plus `heartbeat_runs.usageJson` already roll up and summarize run usage.
The recommendation is:
- when memory work is billed through a run's model usage, record a `cost_event` with `attributionMode = "included_in_run"` and link it to the related `heartbeatRunId`
- when a provider meters memory API usage directly, record a `cost_event` for that usage
- each `cost_event` should still link back to the memory operation, agent, company, and issue or run context when possible
- `finance_events` should carry flat subscription or invoice-style costs

If a memory service incurs:

- a flat monthly subscription fee
- invoice-style charges not tied to individual operations
those should be represented as finance_events, not as synthetic per-query memory operations.
That keeps usage telemetry separate from accounting entries like invoices and flat fees.
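A sketch of that routing decision, keyed off the `attributionMode` field from `MemoryUsage`; the `LedgerEntry` shapes are illustrative, not the `cost_events` or `finance_events` schemas:

```ts
interface MemoryUsageLike {
  attributionMode?: "billed_directly" | "included_in_run" | "external_invoice" | "untracked";
  costCents?: number;
}

type LedgerEntry =
  | { ledger: "cost_events"; costCents: number; heartbeatRunId?: string }
  | { ledger: "finance_events"; costCents: number }
  | { ledger: "none" };

function routeUsage(usage: MemoryUsageLike, heartbeatRunId?: string): LedgerEntry {
  switch (usage.attributionMode) {
    case "included_in_run":
      // Metered usage burned inside a run: a cost_event linked to the run.
      return { ledger: "cost_events", costCents: usage.costCents ?? 0, heartbeatRunId };
    case "billed_directly":
      // Provider-metered usage Paperclip pays for directly: still a cost_event.
      return { ledger: "cost_events", costCents: usage.costCents ?? 0 };
    case "external_invoice":
      // Flat subscription or invoice-style spend belongs in finance_events.
      return { ledger: "finance_events", costCents: usage.costCents ?? 0 };
    default:
      // Untracked usage still produces a memory_operations row, just no ledger entry.
      return { ledger: "none" };
  }
}
```

The audit row in `memory_operations` is written regardless of which branch fires; only the accounting destination varies.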
Paperclip should record evaluation-oriented metrics where possible:

- retrieval latency and hit rate per binding
- extraction job success and failure rates
- per-operation token and cost usage
This is important because a memory system that "works" but silently burns budget or silently fails extraction is not acceptable in Paperclip.
At the control-plane level, the likely new core tables are:
- `memory_bindings`
- `memory_binding_targets` (targets: company, agent)
- `memory_operations` (operation kinds: capture, record_upsert, query, list, get, forget, correct)
- `memory_extraction_jobs`
Provider-specific long-form state should stay in plugin state or the provider itself unless a built-in local provider needs its own schema.
The best zero-config built-in is a local markdown-first provider with optional semantic indexing.
Why:
The design should still treat that built-in as just another provider behind the same control-plane contract.
An open question is how much payload should be mirrored into `memory_operations` for debugging without duplicating large transcripts.

The right abstraction is a company-scoped control plane with a small required adapter contract, capability-gated extensions, shared hook delivery, and full operation attribution, leaving extraction and storage semantics to providers.
That gives Paperclip a stable memory service without locking the product to one memory philosophy or one vendor, and it integrates the AWS lessons without importing AWS's model into core.