# Observational Memory
Added in: @mastra/[email protected]
Observational Memory (OM) is Mastra's memory system for long-context agentic memory. Two background agents — an Observer that watches conversations and creates observations, and a Reflector that restructures observations by combining related items, reflecting on overarching patterns, and condensing where possible — maintain an observation log that replaces raw message history as it grows.
```typescript
import { Memory } from '@mastra/memory'
import { Agent } from '@mastra/core/agent'

export const agent = new Agent({
  name: 'my-agent',
  instructions: 'You are a helpful assistant.',
  model: 'openai/gpt-5-mini',
  memory: new Memory({
    options: {
      observationalMemory: true,
    },
  }),
})
```
The observationalMemory option accepts true, a configuration object, or false. Setting true enables OM with google/gemini-2.5-flash as the default model. When passing a config object, a model must be explicitly set — either at the top level, or on observation.model and/or reflection.model.
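If you want the config-object form while keeping the default model, a minimal sketch (using only the options documented in the table below) is to set model to 'default':

```typescript
import { Memory } from '@mastra/memory'

// Config objects require an explicit model; 'default' opts into
// the default google/gemini-2.5-flash.
const memory = new Memory({
  options: {
    observationalMemory: {
      model: 'default',
    },
  },
})
```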
Observer input is multimodal-aware. OM keeps text placeholders like [Image #1: screenshot.png] in the transcript it builds for the Observer, and also sends the underlying image parts when possible. This applies to both single-thread observation and batched multi-thread observation. Non-image files appear as placeholders only.
OM performs thresholding with fast local token estimation. Text uses tokenx, and image-like inputs use provider-aware heuristics plus deterministic fallbacks when metadata is incomplete.
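As a rough illustration of the thresholding idea (not OM's internal code), a local estimate can be summed across unobserved text and compared against observation.messageTokens. This sketch assumes tokenx exposes an approximateTokenSize helper:

```typescript
import { approximateTokenSize } from 'tokenx'

// Illustrative only: OM also folds in provider-aware image heuristics
// and deterministic fallbacks before comparing against the threshold.
const MESSAGE_TOKENS = 30_000 // observation.messageTokens default

function shouldObserve(unobservedTexts: string[]): boolean {
  const estimated = unobservedTexts.reduce(
    (total, text) => total + approximateTokenSize(text),
    0,
  )
  return estimated > MESSAGE_TOKENS
}
```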
<PropertiesTable
content={[
{
name: 'enabled',
type: 'boolean',
description:
'Enable or disable Observational Memory. When omitted from a config object, defaults to true. Only enabled: false explicitly disables it.',
isOptional: true,
defaultValue: 'true',
},
{
name: 'model',
type: 'string | LanguageModel | DynamicModel | ModelWithRetries[]',
description:
'Model for both the Observer and Reflector agents. Sets the model for both at once. Cannot be used together with observation.model or reflection.model — an error will be thrown if both are set. When using observationalMemory: true, defaults to google/gemini-2.5-flash. When passing a config object, this or observation.model/reflection.model must be set. Use "default" to explicitly use the default model (google/gemini-2.5-flash).',
isOptional: true,
defaultValue: "'google/gemini-2.5-flash' (when using observationalMemory: true)",
},
{
name: 'scope',
type: "'resource' | 'thread'",
description:
"Memory scope for observations. 'thread' keeps observations per-thread. 'resource' (experimental) shares observations across all threads for a resource, enabling cross-conversation memory.",
isOptional: true,
defaultValue: "'thread'",
},
{
name: 'shareTokenBudget',
type: 'boolean',
description:
'Share the token budget between messages and observations. When enabled, the total budget is observation.messageTokens + reflection.observationTokens. Messages can use more space when observations are small, and vice versa. This maximizes context usage through flexible allocation. shareTokenBudget is not yet compatible with async buffering. You must set observation: { bufferTokens: false } when using this option (this is a temporary limitation).',
isOptional: true,
defaultValue: 'false',
},
{
name: 'retrieval',
type: 'boolean',
description:
"Experimental. Enable retrieval-mode observation groups as durable pointers to raw message history. Retrieval mode is only active when scope is 'thread'. If you set retrieval: true with scope: 'resource', OM keeps resource-scoped memory behavior but skips retrieval-mode context and does not register the recall tool.",
isOptional: true,
defaultValue: 'false',
},
{
name: 'observation',
type: 'ObservationalMemoryObservationConfig',
description: 'Configuration for the observation step. Controls when the Observer agent runs and how it behaves.',
isOptional: true,
properties: [
{
type: 'ObservationalMemoryObservationConfig',
parameters: [
{
name: 'model',
type: 'string | LanguageModel | DynamicModel | ModelWithRetries[]',
description:
'Model for the Observer agent. Cannot be set if a top-level model is also provided. If neither this nor the top-level model is set, falls back to reflection.model.',
isOptional: true,
},
{
name: 'instruction',
type: 'string',
description:
"Custom instruction appended to the Observer's system prompt. Use this to customize what the Observer focuses on, such as domain-specific preferences or priorities.",
isOptional: true,
},
{
name: 'threadTitle',
type: 'boolean',
description:
'When true, the Observer suggests short thread titles and updates the thread title when the conversation topic meaningfully changes. This is opt-in and defaults to disabled.',
isOptional: true,
defaultValue: 'false',
},
{
name: 'messageTokens',
type: 'number',
description:
'Token count of unobserved messages that triggers observation. When unobserved message tokens exceed this threshold, the Observer agent is called. Text is estimated locally with tokenx. Image parts are included with model-aware heuristics when possible, with deterministic fallbacks when image metadata is incomplete. Image-like file parts are counted the same way when uploads are normalized as files.',
isOptional: true,
defaultValue: '30000',
},
{
name: 'maxTokensPerBatch',
type: 'number',
description:
'Maximum tokens per batch when observing multiple threads in resource scope. Threads are chunked into batches of this size and processed in parallel. Lower values mean more parallelism but more API calls.',
isOptional: true,
defaultValue: '10000',
},
{
name: 'modelSettings',
type: 'ObservationalMemoryModelSettings',
description: 'Model settings for the Observer agent.',
isOptional: true,
defaultValue: '{ temperature: 0.3, maxOutputTokens: 100_000 }',
properties: [
{
type: 'ObservationalMemoryModelSettings',
parameters: [
{
name: 'temperature',
type: 'number',
description: 'Temperature for generation. Lower values produce more consistent output.',
isOptional: true,
defaultValue: '0.3',
},
{
name: 'maxOutputTokens',
type: 'number',
description: 'Maximum output tokens. Set high to prevent truncation of observations.',
isOptional: true,
defaultValue: '100000',
},
],
},
],
},
{
name: 'bufferTokens',
type: 'number | false',
description:
'Token interval for async background observation buffering. Can be an absolute token count (e.g. 5000) or a fraction of messageTokens (e.g. 0.25 = buffer every 25% of threshold). When set, observations run in the background at this interval, storing results in a buffer. When the main messageTokens threshold is reached, buffered observations activate instantly without a blocking LLM call. Must resolve to less than messageTokens. Set to false to explicitly disable all async buffering (both observation and reflection).',
isOptional: true,
defaultValue: '0.2',
},
{
name: 'bufferActivation',
type: 'number',
description:
'Controls how much of the message window to retain after activation. Accepts a ratio (0-1) or an absolute token count (≥ 1000). With a ratio, higher values remove more message history per activation: 0.8 means activate enough buffers to remove 80% of messageTokens, leaving 20% as active message history. With an absolute token count, higher values keep more message history: 4000 targets keeping ~4k message tokens remaining after activation.',
isOptional: true,
defaultValue: '0.8',
},
{
name: 'blockAfter',
type: 'number',
description:
'Token threshold above which synchronous (blocking) observation is forced. Between messageTokens and blockAfter, only async buffering/activation is used. Above blockAfter, a synchronous observation runs as a last resort, while buffered activation still preserves a minimum remaining context (min(1000, retention floor)). Accepts a multiplier (1 < value < 2, multiplied by messageTokens) or an absolute token count (≥ 2, must be greater than messageTokens). Only relevant when bufferTokens is set. Defaults to 1.2 when async buffering is enabled.',
isOptional: true,
defaultValue: '1.2 (when bufferTokens is set)',
},
{
name: 'previousObserverTokens',
type: 'number | false',
description:
"Optional token budget for the observer's previous-observations context. When set to a number, the observations passed to the Observer agent are tail-truncated to fit within this budget while keeping the newest observations and preserving highlighted 🔴 items when possible. When a buffered reflection is pending, the already-reflected observation lines are automatically replaced with the reflection summary before truncation. Set to 0 to omit previous observations entirely, or false to disable truncation explicitly.",
isOptional: true,
defaultValue: '2000',
},
],
},
],
},
{
name: 'reflection',
type: 'ObservationalMemoryReflectionConfig',
description: 'Configuration for the reflection step. Controls when the Reflector agent runs and how it behaves.',
isOptional: true,
properties: [
{
type: 'ObservationalMemoryReflectionConfig',
parameters: [
{
name: 'model',
type: 'string | LanguageModel | DynamicModel | ModelWithRetries[]',
description:
'Model for the Reflector agent. Cannot be set if a top-level model is also provided. If neither this nor the top-level model is set, falls back to observation.model.',
isOptional: true,
},
{
name: 'instruction',
type: 'string',
description:
"Custom instruction appended to the Reflector's system prompt. Use this to customize how the Reflector consolidates observations, such as prioritizing certain types of information.",
isOptional: true,
},
{
name: 'observationTokens',
type: 'number',
description:
'Token count of observations that triggers reflection. When observation tokens exceed this threshold, the Reflector agent is called to condense them.',
isOptional: true,
defaultValue: '40000',
},
{
name: 'modelSettings',
type: 'ObservationalMemoryModelSettings',
description: 'Model settings for the Reflector agent.',
isOptional: true,
defaultValue: '{ temperature: 0, maxOutputTokens: 100_000 }',
properties: [
{
type: 'ObservationalMemoryModelSettings',
parameters: [
{
name: 'temperature',
type: 'number',
description: 'Temperature for generation. Lower values produce more consistent output.',
isOptional: true,
defaultValue: '0',
},
{
name: 'maxOutputTokens',
type: 'number',
description: 'Maximum output tokens. Set high to prevent truncation of observations.',
isOptional: true,
defaultValue: '100000',
},
],
},
],
},
{
name: 'bufferActivation',
type: 'number',
description:
'Ratio (0-1) controlling when async reflection buffering starts. When observation tokens reach observationTokens * bufferActivation, reflection runs in the background. On activation at the full threshold, the buffered reflection replaces the observations it covers, preserving any new observations appended after that range.',
isOptional: true,
defaultValue: '0.5',
},
{
name: 'blockAfter',
type: 'number',
description:
'Token threshold above which synchronous (blocking) reflection is forced. Between observationTokens and blockAfter, only async buffering/activation is used. Above blockAfter, a synchronous reflection runs as a last resort. Accepts a multiplier (1 < value < 2, multiplied by observationTokens) or an absolute token count (≥ 2, must be greater than observationTokens). Only relevant when bufferActivation is set. Defaults to 1.2 when async reflection is enabled.',
isOptional: true,
defaultValue: '1.2 (when bufferActivation is set)',
},
],
},
],
},
]}
/>
OM persists token payload estimates (under part.providerMetadata.mastra) so repeated counting can reuse prior token estimation work. data-* and reasoning parts are skipped and don't receive cache entries.

```typescript
import { Memory } from '@mastra/memory'
import { Agent } from '@mastra/core/agent'

export const agent = new Agent({
  name: 'my-agent',
  instructions: 'You are a helpful assistant.',
  model: 'openai/gpt-5-mini',
  memory: new Memory({
    options: {
      observationalMemory: {
        model: 'google/gemini-2.5-flash',
        scope: 'resource',
        observation: {
          messageTokens: 20_000,
        },
        reflection: {
          observationTokens: 60_000,
        },
      },
    },
  }),
})
```
When shareTokenBudget is enabled, the total budget is observation.messageTokens + reflection.observationTokens (100k in the example below). If observations only use 30k tokens, messages can expand to use up to 70k. If messages are short, observations have more room before triggering reflection.
```typescript
import { Memory } from '@mastra/memory'
import { Agent } from '@mastra/core/agent'

export const agent = new Agent({
  name: 'my-agent',
  instructions: 'You are a helpful assistant.',
  model: 'openai/gpt-5-mini',
  memory: new Memory({
    options: {
      observationalMemory: {
        shareTokenBudget: true,
        observation: {
          messageTokens: 20_000,
          bufferTokens: false, // required when using shareTokenBudget (temporary limitation)
        },
        reflection: {
          observationTokens: 80_000,
        },
      },
    },
  }),
})
```
By passing a model in the config, you can use any model from Mastra's model router.
```typescript
import { Memory } from '@mastra/memory'
import { Agent } from '@mastra/core/agent'

export const agent = new Agent({
  name: 'my-agent',
  instructions: 'You are a helpful assistant.',
  model: 'openai/gpt-5.4',
  memory: new Memory({
    options: {
      observationalMemory: {
        // highlight-next-line
        model: 'openai/gpt-5-mini',
      },
    },
  }),
})
```
```typescript
import { Memory } from '@mastra/memory'
import { Agent } from '@mastra/core/agent'

export const agent = new Agent({
  name: 'my-agent',
  instructions: 'You are a helpful assistant.',
  model: 'openai/gpt-5.4',
  memory: new Memory({
    options: {
      observationalMemory: {
        // highlight-start
        observation: {
          model: 'google/gemini-2.5-flash',
        },
        reflection: {
          model: 'openai/gpt-5-mini',
        },
        // highlight-end
      },
    },
  }),
})
```
Customize what the Observer and Reflector focus on by providing custom instructions:
```typescript
import { Memory } from '@mastra/memory'
import { Agent } from '@mastra/core/agent'

export const agent = new Agent({
  name: 'health-assistant',
  instructions: 'You are a health and wellness assistant.',
  model: 'openai/gpt-5.4',
  memory: new Memory({
    options: {
      observationalMemory: {
        model: 'google/gemini-2.5-flash',
        observation: {
          // Focus observations on health-related preferences and goals
          instruction:
            'Prioritize capturing user health goals, dietary restrictions, exercise preferences, and medical considerations. Avoid capturing general chit-chat.',
        },
        reflection: {
          // Guide reflection to consolidate health patterns
          instruction:
            'When consolidating, group related health information together. Preserve specific metrics, dates, and medical details.',
        },
      },
    },
  }),
})
```
Async buffering is enabled by default. It pre-computes observations in the background as the conversation grows — when the messageTokens threshold is reached, buffered observations activate instantly with no blocking LLM call.
The lifecycle is: buffer → activate → remove messages → repeat. Background Observer calls run at bufferTokens intervals, each producing a chunk of observations. At threshold, chunks activate: observations move into the log, raw messages are removed from context. The blockAfter threshold forces a synchronous fallback if buffering can't keep up.
Default settings:
- observation.bufferTokens: 0.2 — buffer every 20% of messageTokens (e.g. every ~6k tokens with a 30k threshold)
- observation.bufferActivation: 0.8 — on activation, remove enough messages to keep only 20% of the threshold remaining
- Buffered observations include working-state fields (suggestedResponse, currentTask) that survive activation to maintain conversational continuity
- reflection.bufferActivation: 0.5 — start background reflection at 50% of the observation threshold

To customize:
```typescript
import { Memory } from '@mastra/memory'
import { Agent } from '@mastra/core/agent'

export const agent = new Agent({
  name: 'my-agent',
  instructions: 'You are a helpful assistant.',
  model: 'openai/gpt-5-mini',
  memory: new Memory({
    options: {
      observationalMemory: {
        model: 'google/gemini-2.5-flash',
        observation: {
          messageTokens: 30_000,
          // Buffer every 5k tokens (runs in background)
          bufferTokens: 5_000,
          // Activate to retain 30% of threshold
          bufferActivation: 0.7,
          // Force synchronous observation at 1.5x threshold
          blockAfter: 1.5,
        },
        reflection: {
          observationTokens: 60_000,
          // Start background reflection at 50% of threshold
          bufferActivation: 0.5,
          // Force synchronous reflection at 1.2x threshold
          blockAfter: 1.2,
        },
      },
    },
  }),
})
```
To disable async buffering entirely:
```typescript
observationalMemory: {
  model: "google/gemini-2.5-flash",
  observation: {
    bufferTokens: false,
  },
}
```
Setting bufferTokens: false disables both observation and reflection async buffering. Observations and reflections will run synchronously when their thresholds are reached.
:::note
Async buffering isn't supported with scope: 'resource' and is automatically disabled in resource scope.
:::
Observational Memory emits typed data parts during agent execution that clients can use for real-time UI feedback. These are streamed alongside the agent's response.
### data-om-status

Emitted once per agent loop step, before model generation. Provides a snapshot of the current memory state, including token usage for both context windows and the state of any async buffered content.
```typescript
interface DataOmStatusPart {
  type: 'data-om-status'
  data: {
    windows: {
      active: {
        /** Unobserved message tokens and the threshold that triggers observation */
        messages: { tokens: number; threshold: number }
        /** Observation tokens and the threshold that triggers reflection */
        observations: { tokens: number; threshold: number }
      }
      buffered: {
        observations: {
          /** Number of buffered chunks staged for activation */
          chunks: number
          /** Total message tokens across all buffered chunks */
          messageTokens: number
          /** Projected message tokens that would be removed if activation happened now (based on bufferActivation ratio and chunk boundaries) */
          projectedMessageRemoval: number
          /** Observation tokens that will be added on activation */
          observationTokens: number
          /** idle: no buffering in progress. running: background observer is working. complete: chunks are ready for activation. */
          status: 'idle' | 'running' | 'complete'
        }
        reflection: {
          /** Observation tokens that were fed into the reflector (pre-compression size) */
          inputObservationTokens: number
          /** Observation tokens the reflection will produce on activation (post-compression size) */
          observationTokens: number
          /** idle: no reflection buffered. running: background reflector is working. complete: reflection is ready for activation. */
          status: 'idle' | 'running' | 'complete'
        }
      }
    }
    recordId: string
    threadId: string
    stepNumber: number
    /** Increments each time the Reflector creates a new generation */
    generationCount: number
  }
}
```
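How these parts arrive depends on your streaming setup. As a hedged sketch, assuming agent.stream() returns a result whose fullStream async iterable yields typed parts, a client could watch for status snapshots like this:

```typescript
// Sketch: surface OM token usage in a client UI.
const stream = await agent.stream('What did we decide about the schema?')

for await (const part of stream.fullStream) {
  if (part.type === 'data-om-status') {
    const { messages, observations } = part.data.windows.active
    // e.g. drive progress bars toward each threshold
    console.log(`messages: ${messages.tokens}/${messages.threshold}`)
    console.log(`observations: ${observations.tokens}/${observations.threshold}`)
  }
}
```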
buffered.reflection.inputObservationTokens is the size of the observations that were sent to the Reflector. buffered.reflection.observationTokens is the compressed result — the size of what will replace those observations when the reflection activates. A client can use these two values to show a compression ratio.
Clients can derive percentages and post-activation estimates from the raw values:
```typescript
// Message window usage %
const msgPercent = status.windows.active.messages.tokens / status.windows.active.messages.threshold

// Observation window usage %
const obsPercent =
  status.windows.active.observations.tokens / status.windows.active.observations.threshold

// Projected message tokens after buffered observations activate
// Uses projectedMessageRemoval which accounts for bufferActivation ratio and chunk boundaries
const postActivation =
  status.windows.active.messages.tokens -
  status.windows.buffered.observations.projectedMessageRemoval

// Reflection compression ratio (when buffered reflection exists)
const { inputObservationTokens, observationTokens } = status.windows.buffered.reflection
if (inputObservationTokens > 0) {
  const compressionRatio = observationTokens / inputObservationTokens
}
```
### data-om-observation-start

Emitted when the Observer or Reflector agent begins processing.
<PropertiesTable
content={[
{
name: 'cycleId',
type: 'string',
description: 'Unique ID for this cycle — shared between start/end/failed markers.',
},
{
name: 'operationType',
type: "'observation' | 'reflection'",
description: 'Whether this is an observation or reflection operation.',
},
{ name: 'startedAt', type: 'string', description: 'ISO timestamp when processing started.' },
{ name: 'tokensToObserve', type: 'number', description: 'Message tokens (input) being processed in this batch.' },
{ name: 'recordId', type: 'string', description: 'The OM record ID.' },
{ name: 'threadId', type: 'string', description: "This thread's ID." },
{ name: 'threadIds', type: 'string[]', description: 'All thread IDs in this batch (for resource-scoped).' },
{
name: 'config',
type: 'ObservationMarkerConfig',
description: 'Snapshot of messageTokens, observationTokens, and scope at observation time.',
},
]}
/>
### data-om-observation-end

Emitted when observation or reflection completes successfully.
<PropertiesTable
content={[
{ name: 'cycleId', type: 'string', description: 'Matches the corresponding start marker.' },
{ name: 'operationType', type: "'observation' | 'reflection'", description: 'Type of operation that completed.' },
{ name: 'completedAt', type: 'string', description: 'ISO timestamp when processing completed.' },
{ name: 'durationMs', type: 'number', description: 'Duration in milliseconds.' },
{ name: 'tokensObserved', type: 'number', description: 'Message tokens (input) that were processed.' },
{
name: 'observationTokens',
type: 'number',
description: 'Resulting observation tokens (output) after the Observer compressed them.',
},
{ name: 'observations', type: 'string', description: 'The generated observations text.', isOptional: true },
{ name: 'currentTask', type: 'string', description: 'Current task extracted by the Observer.', isOptional: true },
{
name: 'suggestedResponse',
type: 'string',
description: 'Suggested response extracted by the Observer.',
isOptional: true,
},
{ name: 'recordId', type: 'string', description: 'The OM record ID.' },
{ name: 'threadId', type: 'string', description: "This thread's ID." },
]}
/>
### data-om-observation-failed

Emitted when observation or reflection fails. The system falls back to synchronous processing.
<PropertiesTable
content={[
{ name: 'cycleId', type: 'string', description: 'Matches the corresponding start marker.' },
{ name: 'operationType', type: "'observation' | 'reflection'", description: 'Type of operation that failed.' },
{ name: 'failedAt', type: 'string', description: 'ISO timestamp when the failure occurred.' },
{ name: 'durationMs', type: 'number', description: 'Duration until failure in milliseconds.' },
{ name: 'tokensAttempted', type: 'number', description: 'Message tokens (input) that were attempted.' },
{ name: 'error', type: 'string', description: 'Error message.' },
{ name: 'observations', type: 'string', description: 'Any partial content available for display.', isOptional: true },
{ name: 'recordId', type: 'string', description: 'The OM record ID.' },
{ name: 'threadId', type: 'string', description: "This thread's ID." },
]}
/>
### data-om-buffering-start

Emitted when async buffering begins in the background. Buffering pre-computes observations or reflections before the main threshold is reached.
<PropertiesTable
content={[
{ name: 'cycleId', type: 'string', description: 'Unique ID for this buffering cycle.' },
{ name: 'operationType', type: "'observation' | 'reflection'", description: 'Type of operation being buffered.' },
{ name: 'startedAt', type: 'string', description: 'ISO timestamp when buffering started.' },
{ name: 'tokensToBuffer', type: 'number', description: 'Message tokens (input) being buffered in this cycle.' },
{ name: 'recordId', type: 'string', description: 'The OM record ID.' },
{ name: 'threadId', type: 'string', description: "This thread's ID." },
{ name: 'threadIds', type: 'string[]', description: 'All thread IDs being buffered (for resource-scoped).' },
{ name: 'config', type: 'ObservationMarkerConfig', description: 'Snapshot of config at buffering time.' },
]}
/>
### data-om-buffering-end

Emitted when async buffering completes. The content is stored but not yet activated in the main context.
<PropertiesTable
content={[
{ name: 'cycleId', type: 'string', description: 'Matches the corresponding buffering-start marker.' },
{ name: 'operationType', type: "'observation' | 'reflection'", description: 'Type of operation that was buffered.' },
{ name: 'completedAt', type: 'string', description: 'ISO timestamp when buffering completed.' },
{ name: 'durationMs', type: 'number', description: 'Duration in milliseconds.' },
{ name: 'tokensBuffered', type: 'number', description: 'Message tokens (input) that were buffered.' },
{
name: 'bufferedTokens',
type: 'number',
description: 'Observation tokens (output) after the Observer compressed them.',
},
{ name: 'observations', type: 'string', description: 'The buffered content.', isOptional: true },
{ name: 'recordId', type: 'string', description: 'The OM record ID.' },
{ name: 'threadId', type: 'string', description: "This thread's ID." },
]}
/>
### data-om-buffering-failed

Emitted when async buffering fails. The system falls back to synchronous processing when the threshold is reached.
<PropertiesTable
content={[
{ name: 'cycleId', type: 'string', description: 'Matches the corresponding buffering-start marker.' },
{ name: 'operationType', type: "'observation' | 'reflection'", description: 'Type of operation that failed.' },
{ name: 'failedAt', type: 'string', description: 'ISO timestamp when the failure occurred.' },
{ name: 'durationMs', type: 'number', description: 'Duration until failure in milliseconds.' },
{ name: 'tokensAttempted', type: 'number', description: 'Message tokens (input) that were attempted to buffer.' },
{ name: 'error', type: 'string', description: 'Error message.' },
{ name: 'observations', type: 'string', description: 'Any partial content.', isOptional: true },
{ name: 'recordId', type: 'string', description: 'The OM record ID.' },
{ name: 'threadId', type: 'string', description: "This thread's ID." },
]}
/>
### data-om-activation

Emitted when buffered observations or reflections are activated (moved into the active context window). This is an instant operation — no LLM call is involved.
<PropertiesTable
content={[
{ name: 'cycleId', type: 'string', description: 'Unique ID for this activation event.' },
{ name: 'operationType', type: "'observation' | 'reflection'", description: 'Type of content activated.' },
{ name: 'activatedAt', type: 'string', description: 'ISO timestamp when activation occurred.' },
{ name: 'chunksActivated', type: 'number', description: 'Number of buffered chunks activated.' },
{
name: 'tokensActivated',
type: 'number',
description:
'Message tokens (input) from activated chunks. For observation activation, these are removed from the message window. For reflection activation, this is the observation tokens that were compressed.',
},
{ name: 'observationTokens', type: 'number', description: 'Resulting observation tokens after activation.' },
{ name: 'messagesActivated', type: 'number', description: 'Number of messages that were observed via activation.' },
{ name: 'generationCount', type: 'number', description: 'Current reflection generation count.' },
{ name: 'observations', type: 'string', description: 'The activated observations text.', isOptional: true },
{ name: 'recordId', type: 'string', description: 'The OM record ID.' },
{ name: 'threadId', type: 'string', description: "This thread's ID." },
{ name: 'config', type: 'ObservationMarkerConfig', description: 'Snapshot of config at activation time.' },
]}
/>
Most users should use the Memory class above. Using ObservationalMemory directly is mainly useful for benchmarking, experimentation, or when you need to control processor ordering with other processors (like guardrails).
```typescript
import { ObservationalMemory } from '@mastra/memory/processors'
import { Agent } from '@mastra/core/agent'
import { LibSQLStore } from '@mastra/libsql'

const storage = new LibSQLStore({
  id: 'my-storage',
  url: 'file:./memory.db',
})

const om = new ObservationalMemory({
  storage: storage.stores.memory,
  model: 'google/gemini-2.5-flash',
  scope: 'resource',
  observation: {
    messageTokens: 20_000,
  },
  reflection: {
    observationTokens: 60_000,
  },
})

export const agent = new Agent({
  name: 'my-agent',
  instructions: 'You are a helpful assistant.',
  model: 'openai/gpt-5-mini',
  inputProcessors: [om],
  outputProcessors: [om],
})
```
The standalone ObservationalMemory class accepts all the same options as the observationalMemory config object above, plus the following:
<PropertiesTable
content={[
{
name: 'storage',
type: 'MemoryStorage',
description:
'Storage adapter for persisting observations. Must be a MemoryStorage instance (from MastraStorage.stores.memory).',
isOptional: false,
},
{
name: 'onDebugEvent',
type: '(event: ObservationDebugEvent) => void',
description:
'Debug callback for observation events. Called whenever observation-related events occur. Useful for debugging and understanding the observation flow.',
isOptional: true,
},
{
name: 'obscureThreadIds',
type: 'boolean',
description:
'When enabled, thread IDs are hashed before being included in observation context. This prevents the LLM from recognizing patterns in thread identifiers. Automatically enabled when using resource scope through the Memory class.',
isOptional: true,
defaultValue: 'false',
},
]}
/>
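For example, a minimal debug hook can log every event for inspection. This is a sketch: it treats the exact shape of ObservationDebugEvent as opaque and simply dumps it.

```typescript
import { ObservationalMemory } from '@mastra/memory/processors'
import { LibSQLStore } from '@mastra/libsql'

const storage = new LibSQLStore({
  id: 'debug-storage',
  url: 'file:./memory.db',
})

const om = new ObservationalMemory({
  storage: storage.stores.memory,
  model: 'google/gemini-2.5-flash',
  // Dump each observation-related event as it occurs.
  onDebugEvent: (event) => {
    console.dir(event, { depth: null })
  },
})
```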
When retrieval: true is set with scope: 'thread', OM registers a recall tool that the agent can call to page through the raw messages behind an observation group's _range. The tool is automatically added to the agent's tool list — no manual registration is needed.
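A minimal sketch of enabling retrieval, using only the options documented above:

```typescript
import { Memory } from '@mastra/memory'

// retrieval is experimental and only active with 'thread' scope;
// with scope: 'resource' the recall tool is not registered.
const memory = new Memory({
  options: {
    observationalMemory: {
      model: 'google/gemini-2.5-flash',
      retrieval: true,
      scope: 'thread',
    },
  },
})
```

The recall tool accepts the following parameters: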
<PropertiesTable
content={[
{
name: 'cursor',
type: 'string',
isOptional: false,
description:
'A message ID to anchor the recall query. Extract the start or end ID from an observation group range (e.g. from _range: startId:endId, use either startId or endId). If a range string is passed directly, the tool returns a hint explaining how to extract the correct ID.',
},
{
name: 'page',
type: 'number',
isOptional: true,
defaultValue: '1',
description:
'Pagination offset from the cursor. Positive values page forward (messages after the cursor), negative values page backward (messages before the cursor). 0 is treated as 1.',
},
{
name: 'limit',
type: 'number',
isOptional: true,
defaultValue: '20',
description: 'Maximum number of messages per page.',
},
{
name: 'detail',
type: "'low' | 'high'",
isOptional: true,
defaultValue: "'low'",
description:
"Controls how much content is shown per message part. 'low' shows truncated text and tool names with positional indices ([p0], [p1]). 'high' shows full content including tool arguments and results, clamped to one part per call with continuation hints.",
},
{
name: 'partIndex',
type: 'number',
isOptional: true,
description:
'Fetch a single message part at full detail by its positional index. Use this when a low-detail recall shows an interesting part at [p1] — call again with partIndex: 1 to see the full content without loading every part.',
},
]}
/>
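For illustration, the arguments the agent might pass when paging backward from a range's start ID could look like this (the cursor value is hypothetical):

```typescript
// Hypothetical recall call: page backward from an observation
// group's _range start ID at low detail.
const recallArgs = {
  cursor: 'msg_abc123', // extracted from a `_range: startId:endId` value
  page: -1, // the page immediately before the cursor
  limit: 10,
  detail: 'low' as const,
}
```

The tool returns: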
<PropertiesTable
content={[
{
name: 'messages',
type: 'string',
description: 'Formatted message content. Format depends on the detail level.',
},
{
name: 'count',
type: 'number',
description: 'Number of messages in this page.',
},
{
name: 'cursor',
type: 'string',
description: 'The cursor message ID used for this query.',
},
{
name: 'page',
type: 'number',
description: 'The page number returned.',
},
{
name: 'limit',
type: 'number',
description: 'The limit used for this query.',
},
{
name: 'hasNextPage',
type: 'boolean',
description: 'Whether more messages exist after this page.',
},
{
name: 'hasPrevPage',
type: 'boolean',
description: 'Whether more messages exist before this page.',
},
{
name: 'truncated',
type: 'boolean',
isOptional: true,
description:
'Present and true when the output was capped by the token budget. The agent can paginate or use partIndex to access remaining content.',
},
{
name: 'tokenOffset',
type: 'number',
isOptional: true,
description: 'Approximate number of tokens that were trimmed when truncated is true.',
},
]}
/>