OpenAI Compatibility

Mem0 mirrors the OpenAI client interface so you can plug memories into existing chat-completion code with minimal changes. Point your OpenAI-compatible client at Mem0, keep the same request shape, and gain persistent memory between calls.

<Info>
**You’ll use this when…**
- Your app already relies on OpenAI chat completions and you want Mem0 to feel familiar.
- You need to reuse existing middleware that expects OpenAI-compatible responses.
- You plan to switch between Mem0 Platform and the self-hosted client without rewriting code.
</Info>

Features

  • Drop-in client: client.chat.completions.create(...) matches OpenAI’s method signature, so existing call sites keep working.
  • Shared parameters: Mem0 accepts messages, model, and optional memory-scoping fields (user_id, agent_id, run_id).
  • Memory-aware responses: Each call saves relevant facts so future prompts automatically reflect past conversations.
  • OSS parity: Use the same API surface whether you call the hosted proxy or the OSS configuration.
<Info icon="check"> Run one request with `user_id` set. If the next call references that ID and its reply uses the stored memory, compatibility is confirmed. </Info>

Configure it

Call the managed Mem0 proxy

```python
from mem0.proxy.main import Mem0

client = Mem0(api_key="m0-xxx")

messages = [
    {"role": "user", "content": "I love Indian food but I cannot eat pizza since I'm allergic to cheese."}
]

chat_completion = client.chat.completions.create(
    messages=messages,
    model="gpt-5-mini",
    user_id="alice"
)
```
<Tip> Reuse the same identifiers your OpenAI client already sends so you can switch between providers without branching logic. </Tip>
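Because both clients expose `chat.completions.create`, a thin factory can pick the provider at startup and leave every call site untouched. A minimal sketch; the `make_client` helper and the `MEM0_API_KEY` environment variable name are illustrative, not part of Mem0's API:

```python
import os

from openai import OpenAI
from mem0.proxy.main import Mem0

def make_client():
    """Return a Mem0 proxy client when a key is configured, else plain OpenAI."""
    mem0_key = os.getenv("MEM0_API_KEY")  # illustrative env var name
    return Mem0(api_key=mem0_key) if mem0_key else OpenAI()

client = make_client()

# Downstream code is identical either way; only the Mem0 path persists
# memory when scoping fields such as user_id are passed.
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello!"}],
    model="gpt-5-mini",
)
```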

Use the OpenAI-compatible OSS client

```python
from mem0.proxy.main import Mem0

# Local OSS setup: memories are stored in a self-hosted Qdrant instance.
config = {
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "host": "localhost",
            "port": 6333
        }
    }
}

client = Mem0(config=config)

chat_completion = client.chat.completions.create(
    messages=[{"role": "user", "content": "What's the capital of France?"}],
    model="gpt-5-mini"
)
```
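Per the OSS parity note above, the same memory-scoping fields work against the self-hosted client. A short sketch continuing with the `client` configured above; the prompt and `user_id` value are illustrative:

```python
# Same request shape as the hosted proxy; memories land in the local
# Qdrant store configured above.
chat_completion = client.chat.completions.create(
    messages=[{"role": "user", "content": "I prefer vegetarian recipes."}],
    model="gpt-5-mini",
    user_id="alice"  # scopes the stored memory, just like the hosted proxy
)
print(chat_completion.choices[0].message.content)
```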

See it in action

Memory-aware restaurant recommendation

```python
from mem0.proxy.main import Mem0

client = Mem0(api_key="m0-xxx")

# Store preferences
client.chat.completions.create(
    messages=[{"role": "user", "content": "I love Indian food but I'm allergic to cheese."}],
    model="gpt-5-mini",
    user_id="alice"
)

# Later conversation reuses the memory
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Suggest dinner options in San Francisco."}],
    model="gpt-5-mini",
    user_id="alice"
)

print(response.choices[0].message.content)
```
<Info icon="check"> The second response should call out Indian restaurants and avoid cheese, proving Mem0 recalled the stored preference. </Info>

Verify the feature is working

  • Compare responses from Mem0 vs. OpenAI for identical prompts; both should return the same structure (choices, usage, etc.), as in the sketch after this list.
  • Inspect stored memories after each request to confirm the fact extraction captured the right details.
  • Test switching between hosted (Mem0(api_key=...)) and OSS configurations to ensure both respect the same request body.
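To make the first check concrete, here is a hedged sketch that sends the same prompt through both clients and asserts the shared response shape; fields like `usage` are assumed to be populated on both paths:

```python
from openai import OpenAI
from mem0.proxy.main import Mem0

prompt = [{"role": "user", "content": "Name three Indian dishes."}]

mem0_resp = Mem0(api_key="m0-xxx").chat.completions.create(
    messages=prompt, model="gpt-5-mini", user_id="alice"
)
openai_resp = OpenAI().chat.completions.create(
    messages=prompt, model="gpt-5-mini"
)

# Both objects should expose the same OpenAI-style structure.
for resp in (mem0_resp, openai_resp):
    assert resp.choices[0].message.content
    assert resp.usage is not None  # assumed populated on both paths
```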

Best practices

  1. Scope context intentionally: Pass identifiers only when you want conversations to persist; skip them for one-off calls.
  2. Log memory usage: Inspect response.metadata.memories (if enabled) to see which facts the model recalled.
  3. Reuse middleware: Point your existing OpenAI client wrappers to the Mem0 proxy URL to avoid code drift.
  4. Handle fallbacks: Keep a code path for plain OpenAI calls in case Mem0 is unavailable, then resync memory later.
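A sketch of the fallback pattern from item 4; the broad `except` and the helper name are illustrative, and re-syncing the missed turn into Mem0 afterwards is left out:

```python
from openai import OpenAI
from mem0.proxy.main import Mem0

mem0_client = Mem0(api_key="m0-xxx")
openai_client = OpenAI()

def chat_with_fallback(messages, user_id):
    """Prefer the memory-aware path; fall back to plain OpenAI if Mem0 fails."""
    try:
        return mem0_client.chat.completions.create(
            messages=messages, model="gpt-5-mini", user_id=user_id
        )
    except Exception:
        # Memory is skipped for this turn; resync it into Mem0 later.
        return openai_client.chat.completions.create(
            messages=messages, model="gpt-5-mini"
        )
```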

Parameter reference

| Parameter | Type | Purpose |
| --- | --- | --- |
| `user_id` | str | Associates the conversation with a user so memories persist. |
| `agent_id` | str | Optional agent or bot identifier for multi-agent scenarios. |
| `run_id` | str | Optional session/run identifier for short-lived flows. |
| `metadata` | dict | Stores extra fields alongside each memory entry. |
| `filters` | dict | Restricts retrieval to specific memories while responding. |
| `top_k` | int | Caps how many memories Mem0 pulls into the context (default 10). |

Other request fields mirror OpenAI’s chat completion API.
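Putting the table together, a request that exercises every scoping field might look like the sketch below; all values are illustrative, and the exact shape accepted by `filters` is an assumption:

```python
from mem0.proxy.main import Mem0

client = Mem0(api_key="m0-xxx")

response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Plan a dinner I would enjoy."}],
    model="gpt-5-mini",
    user_id="alice",                       # persist memories per user
    agent_id="meal-planner",               # optional multi-agent scoping
    run_id="session-42",                   # optional short-lived session
    metadata={"channel": "web"},           # stored alongside new memories
    filters={"agent_id": "meal-planner"},  # assumed filter shape
    top_k=5,                               # pull at most 5 memories into context
)
```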


<CardGroup cols={2}>
  <Card title="Connect Vision Models" icon="circle-dot" href="/components/llms/models/openai">
    Review LLM options that support OpenAI-compatible calls in Mem0.
  </Card>
  <Card title="Automate OpenAI Tool Calls" icon="plug" href="/cookbooks/integrations/openai-tool-calls">
    See a full workflow that layers Mem0 memories on top of tool-calling agents.
  </Card>
</CardGroup>