Chat Memory Buffer

NOTE: The ChatMemoryBuffer is deprecated in favor of the newer, more flexible Memory class. See the latest docs.

The ChatMemoryBuffer is a memory buffer that stores the most recent messages that fit within a given token limit. Every message put into the buffer is retained; the token limit is only applied when retrieving messages with get(), as the standalone example below shows.

%pip install llama-index-core llama-index-llms-openai

Setup

python
from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=40000)

Using Standalone

python
from llama_index.core.llms import ChatMessage

chat_history = [
    ChatMessage(role="user", content="Hello, how are you?"),
    ChatMessage(role="assistant", content="I'm doing well, thank you!"),
]

# put a list of messages
memory.put_messages(chat_history)

# put one message at a time
# memory.put(chat_history[0])
python
# Get the most recent messages that fit within the token limit
history = memory.get()
python
# Get all messages
all_history = memory.get_all()
python
# clear the memory
memory.reset()
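
To see the token limit in action, here is a minimal sketch using a deliberately small limit (and the default tokenizer): get() returns only the most recent messages that fit, while get_all() still returns every message stored.

python
# A deliberately small token limit to demonstrate truncation
small_memory = ChatMemoryBuffer.from_defaults(token_limit=50)

for i in range(20):
    small_memory.put(ChatMessage(role="user", content=f"Message number {i}"))

# get() keeps only the most recent messages within the limit
print(len(small_memory.get()))

# get_all() returns everything that was stored
print(len(small_memory.get_all()))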

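The buffer can also be backed by a chat store so the history survives restarts. A minimal sketch assuming the SimpleChatStore from llama-index-core; the chat_store_key "user_1" and the persist path are arbitrary examples:

python
from llama_index.core.storage.chat_store import SimpleChatStore

chat_store = SimpleChatStore()

memory = ChatMemoryBuffer.from_defaults(
    token_limit=40000,
    chat_store=chat_store,
    chat_store_key="user_1",  # arbitrary key identifying this conversation
)

# ... after chatting, write the store to disk
chat_store.persist(persist_path="chat_store.json")

# later, reload the store and rebuild the memory on top of it
loaded_store = SimpleChatStore.from_persist_path(persist_path="chat_store.json")
memory = ChatMemoryBuffer.from_defaults(
    token_limit=40000,
    chat_store=loaded_store,
    chat_store_key="user_1",
)
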
Using with Agents

You can pass a memory object to any agent's .run() method.

python
import os

os.environ["OPENAI_API_KEY"] = "sk-proj-..."
python
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.core.workflow import Context
from llama_index.llms.openai import OpenAI


memory = ChatMemoryBuffer.from_defaults(token_limit=40000)

agent = FunctionAgent(tools=[], llm=OpenAI(model="gpt-4o-mini"))

# context to hold the chat history/state
ctx = Context(agent)
python
resp = await agent.run("Hello, how are you?", ctx=ctx, memory=memory)
python
print(memory.get_all())
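
Because the same memory (and context) is passed on each call, a follow-up run can refer back to earlier turns. For example:

python
# Hypothetical follow-up question that relies on the stored history
resp = await agent.run(
    "What was the first thing I said to you?", ctx=ctx, memory=memory
)
print(resp)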