
# Storing Conversations & Memories


## Overview

Omi uses a dual-collection architecture for storing user data:

<CardGroup cols={2}> <Card title="Conversations" icon="comments"> **Primary storage** for recorded interactions - transcripts, audio, structured summaries </Card> <Card title="Memories" icon="brain"> **Secondary storage** for extracted facts/learnings FROM conversations </Card> </CardGroup>

This separation allows for efficient retrieval of both full conversation context and quick access to key facts about the user.

## Architecture Diagram

```mermaid
flowchart TD
    subgraph Recording["📱 Recording"]
        R[User Recording] --> AS[Audio Stream]
        AS --> T["Transcription<br/>Deepgram"]
    end

    T --> POST["POST /v1/conversations"]
    POST --> PC[process_conversation]

    PC --> GS["_get_structured<br/>title, overview,<br/>action_items, events"]
    PC --> EM["_extract_memories<br/>facts → memories"]
    PC --> SAI["_save_action_items<br/>standalone collection"]

    GS --> UC[upsert_conversation]
    EM --> SM[save_memories]

    UC --> FS[("Firestore:<br/>conversations/")]
    SM --> FSM[("Firestore:<br/>memories/")]
    UC --> Pine[("Pinecone:<br/>vectors")]
```

## Firestore Structure

```
users/
├── {uid}/
│   ├── conversations/                    # PRIMARY - Recorded interactions
│   │   └── {conversation_id}/
│   │       ├── id
│   │       ├── created_at
│   │       ├── started_at
│   │       ├── finished_at
│   │       ├── source
│   │       ├── language
│   │       ├── status
│   │       ├── structured
│   │       ├── transcript_segments
│   │       ├── geolocation
│   │       ├── photos/ (subcollection)
│   │       ├── audio_files
│   │       ├── apps_results
│   │       ├── discarded
│   │       ├── visibility
│   │       ├── is_locked
│   │       └── data_protection_level
│   │
│   ├── memories/                         # SECONDARY - Extracted facts
│   │   └── {memory_id}/
│   │       ├── id
│   │       ├── uid
│   │       ├── conversation_id
│   │       ├── content
│   │       ├── category
│   │       ├── tags
│   │       ├── visibility
│   │       ├── created_at
│   │       ├── updated_at
│   │       ├── reviewed
│   │       ├── user_review
│   │       ├── scoring
│   │       └── data_protection_level
│   │
│   └── action_items/                     # Standalone action items
│       └── {action_item_id}/
│           ├── description
│           ├── completed
│           ├── conversation_id
│           ├── created_at
│           ├── due_at
│           └── completed_at
```
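The collection layout above maps directly to document paths. A minimal sketch of that mapping (the helper names are illustrative, not functions from the Omi codebase):

```python
# Hypothetical helpers showing the users/{uid}/... document paths
# implied by the collection layout above. The path shapes come from
# the documented structure; the function names are illustrative only.

def conversation_path(uid: str, conversation_id: str) -> str:
    """Path to a conversation document under a user."""
    return f"users/{uid}/conversations/{conversation_id}"

def memory_path(uid: str, memory_id: str) -> str:
    """Path to a memory document under a user."""
    return f"users/{uid}/memories/{memory_id}"

def action_item_path(uid: str, action_item_id: str) -> str:
    """Path to a standalone action item document under a user."""
    return f"users/{uid}/action_items/{action_item_id}"
```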

## Part 1: Storing Conversations

### Processing Flow

<Steps> <Step title="API Request" icon="paper-plane"> The app sends a POST request to `/v1/conversations` with transcript data </Step> <Step title="Processing" icon="gear"> `process_conversation()` in `utils/conversations/process_conversation.py` handles the logic </Step> <Step title="Structure Extraction" icon="wand-magic-sparkles"> LLM extracts title, overview, action items, and events from the transcript </Step> <Step title="Storage" icon="database"> `upsert_conversation()` in `database/conversations.py` saves to Firestore </Step> <Step title="Vector Embedding" icon="magnifying-glass"> Conversation is embedded and stored in Pinecone for semantic search </Step> </Steps>
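The steps above can be sketched as a plain-Python pipeline. Every function below is a stub standing in for the real LLM and Firestore calls; only the order of operations reflects the documented flow:

```python
# Illustrative sketch of the process_conversation() flow described above.
# The stub bodies stand in for the real LLM and database calls; only the
# sequencing mirrors the documented pipeline.

def _get_structured(transcript: str) -> dict:
    # Real code: LLM extracts title, overview, action_items, events
    return {"title": "Stub", "overview": transcript[:50],
            "action_items": [], "events": []}

def _extract_memories(transcript: str) -> list:
    # Real code: LLM extracts short facts about the user
    return []

def upsert_conversation(uid: str, conversation: dict) -> None:
    pass  # Real code: write to Firestore users/{uid}/conversations/

def save_memories(uid: str, memories: list) -> None:
    pass  # Real code: write to Firestore users/{uid}/memories/

def process_conversation(uid: str, transcript: str) -> dict:
    structured = _get_structured(transcript)          # structure extraction
    conversation = {"transcript": transcript,
                    "structured": structured,
                    "status": "completed"}
    upsert_conversation(uid, conversation)            # storage
    save_memories(uid, _extract_memories(transcript)) # memory extraction
    return conversation
```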

### Conversation Model Fields

| Field | Type | Description |
|-------|------|-------------|
| `id` | string | Unique conversation identifier |
| `created_at` | datetime | When the conversation record was created |
| `started_at` | datetime | When the actual conversation started |
| `finished_at` | datetime | When the conversation ended |
| `source` | enum | Source device (omi, phone, desktop, openglass, etc.) |
| `language` | string | Language code of the conversation |
| `status` | enum | Processing status: in_progress, processing, completed, failed |
| `structured` | object | Extracted structured information (see below) |
| `transcript_segments` | array | List of transcript segments |
| `geolocation` | object | Location data (latitude, longitude, address) |
| `photos` | array | Photos captured during conversation |
| `audio_files` | array | Audio file references |
| `apps_results` | array | Results from summarization apps |
| `external_data` | object | Data from external integrations |
| `discarded` | boolean | Whether conversation was marked as low-quality |
| `visibility` | enum | private, shared, or public |
| `is_locked` | boolean | Whether conversation is locked from editing |
| `data_protection_level` | string | standard or enhanced (encrypted) |
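For illustration, a trimmed model sketch covering a subset of the fields above. The real model lives in `backend/models/conversation.py`; field names and enum values come from the table, but this dataclass is not Omi's actual definition:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum

class ConversationStatus(str, Enum):
    IN_PROGRESS = "in_progress"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"

@dataclass
class Conversation:
    """Illustrative subset of the documented conversation fields."""
    id: str
    created_at: datetime
    source: str = "omi"
    language: str = "en"
    status: ConversationStatus = ConversationStatus.IN_PROGRESS
    transcript_segments: list = field(default_factory=list)
    discarded: bool = False
    visibility: str = "private"
    data_protection_level: str = "standard"
```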

### Structured Information

The `structured` field contains LLM-extracted information:

| Field | Type | Description |
|-------|------|-------------|
| `title` | string | Short descriptive title for the conversation |
| `overview` | string | Summary of key points discussed |
| `emoji` | string | Emoji representing the conversation |
| `category` | enum | Category (personal, work, health, etc.) |
| `action_items` | array | Tasks or to-dos mentioned |
| `events` | array | Calendar events to be created |

### Transcript Segments

Each segment in `transcript_segments` includes:

| Field | Type | Description |
|-------|------|-------------|
| `text` | string | Transcribed text content |
| `speaker` | string | Speaker label (e.g., "SPEAKER_00") |
| `start` | float | Start time in seconds |
| `end` | float | End time in seconds |
| `is_user` | boolean | Whether spoken by the device owner |
| `person_id` | string | ID of identified person (if matched) |
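A segment with the fields above, plus a duration helper for illustration (hypothetical sketch; the real model is defined in `backend/models/conversation.py`):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TranscriptSegment:
    """Illustrative sketch of the documented segment fields."""
    text: str
    speaker: str               # e.g. "SPEAKER_00"
    start: float               # seconds from conversation start
    end: float                 # seconds from conversation start
    is_user: bool = False
    person_id: Optional[str] = None

    @property
    def duration(self) -> float:
        """Length of the segment in seconds."""
        return self.end - self.start

seg = TranscriptSegment(text="Let's meet Friday.", speaker="SPEAKER_00",
                        start=12.4, end=14.1, is_user=True)
```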

### Action Items

Action items are stored both inline (in `structured.action_items`) and in a standalone collection:

| Field | Type | Description |
|-------|------|-------------|
| `description` | string | The action item text |
| `completed` | boolean | Whether the item is done |
| `created_at` | datetime | When extracted |
| `due_at` | datetime | Optional due date |
| `completed_at` | datetime | When marked complete |
| `conversation_id` | string | Source conversation |

### Events

Calendar events extracted from conversations:

| Field | Type | Description |
|-------|------|-------------|
| `title` | string | Event title |
| `description` | string | Event description |
| `start` | datetime | Start date/time |
| `duration` | integer | Duration in minutes |
| `created` | boolean | Whether added to calendar |

## Part 2: Extracting & Storing Memories

Memories are facts about the user extracted from conversations. They represent learnings, preferences, habits, and other personal information.

### Memory Extraction Process

During `process_conversation()`, the system:

<Steps> <Step title="Analyze Transcript" icon="magnifying-glass-chart"> Reviews the conversation transcript for personal information </Step> <Step title="Extract Facts" icon="lightbulb"> Identifies facts worth remembering about the user (~15 words max) </Step> <Step title="Store with Link" icon="link"> Saves to `memories` collection with a link back to the source conversation </Step> </Steps>

### Memory Model Fields

| Field | Type | Description |
|-------|------|-------------|
| `id` | string | Unique memory identifier |
| `uid` | string | User ID |
| `conversation_id` | string | Source conversation (links back) |
| `content` | string | The actual fact/learning (max ~15 words) |
| `category` | enum | interesting, system, or manual |
| `tags` | array | Categorization tags |
| `visibility` | string | private or public |
| `created_at` | datetime | When memory was created |
| `updated_at` | datetime | Last modification time |
| `reviewed` | boolean | Whether user has reviewed |
| `user_review` | boolean | User's approval (true/false/null) |
| `edited` | boolean | Whether user edited the content |
| `scoring` | string | Ranking score for retrieval |
| `manually_added` | boolean | Whether user created manually |
| `is_locked` | boolean | Prevent automatic deletion |
| `app_id` | string | Source app (if from integration) |
| `data_protection_level` | string | Encryption level |

### Memory Categories

<CardGroup cols={3}> <Card title="Interesting" icon="star"> Notable facts about the user: hobbies, opinions, stories </Card> <Card title="System" icon="gear"> Preferences and patterns: work habits, sleep schedule </Card> <Card title="Manual" icon="pen"> User-created memories: explicitly added facts </Card> </CardGroup> <Note> Legacy categories (`core`, `hobbies`, `lifestyle`, `interests`, `habits`, `work`, `skills`, `learnings`, `other`) are automatically mapped to the new primary categories for backward compatibility. </Note>

### Memory Extraction Rules

The system follows these guidelines when extracting memories:

  • Maximum ~15 words per memory
  • Must pass the "shareability test" - would this be worth telling someone?
  • Maximum 2 interesting + 2 system memories per conversation
  • No duplicate or near-duplicate facts
  • Skip mundane details (eating, sleeping, commuting)
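The length, duplicate, and per-category caps above are easy to express as a post-filter. A hedged sketch follows; in the real backend these rules are applied through the extraction prompt and `_extract_memories`, not necessarily as code like this:

```python
# Illustrative post-filter enforcing the documented extraction rules:
# <=15 words per memory, no duplicates, and at most 2 "interesting"
# plus 2 "system" memories per conversation. This is a sketch of the
# constraints, not Omi's actual enforcement code.

MAX_WORDS = 15
MAX_PER_CATEGORY = {"interesting": 2, "system": 2}

def filter_memories(candidates: list) -> list:
    kept, seen, counts = [], set(), {"interesting": 0, "system": 0}
    for mem in candidates:
        content = mem["content"].strip()
        category = mem["category"]
        if len(content.split()) > MAX_WORDS:
            continue  # over the ~15-word limit
        if content.lower() in seen:
            continue  # duplicate or near-duplicate
        if counts.get(category, 0) >= MAX_PER_CATEGORY.get(category, 0):
            continue  # per-conversation cap for this category reached
        seen.add(content.lower())
        counts[category] = counts.get(category, 0) + 1
        kept.append(mem)
    return kept
```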

## Part 3: Data Protection & Encryption

Both conversations and memories support encryption for sensitive data.

<Tabs> <Tab title="Standard" icon="unlock"> ### Standard Protection Level
No encryption, stored as plaintext. This is the default for most users.

- Fastest read/write performance
- Data visible in Firestore console
- Suitable for general use
</Tab> <Tab title="Enhanced" icon="lock"> ### Enhanced Protection Level
AES encryption for sensitive fields. Provides additional security for sensitive conversations.

**Encrypted Fields:**
- **Conversations**: `transcript_segments` (the actual transcript text)
- **Memories**: `content` (the memory text)

<Warning>
Enhanced encryption adds processing overhead to read/write operations.
</Warning>
</Tab> </Tabs>

### Implementation

```python
# Conversations: database/conversations.py
def _prepare_conversation_for_write(conversation_data, data_protection_level):
    if data_protection_level == 'enhanced':
        # Encrypt transcript_segments before storage
        ...

def _prepare_conversation_for_read(conversation_data, data_protection_level):
    if data_protection_level == 'enhanced':
        # Decrypt transcript_segments after retrieval
        ...
```
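To make the write/read symmetry concrete, here is a runnable sketch that mirrors the functions above but substitutes a reversible stand-in cipher (base64) for the real AES encryption; the cipher choice is purely illustrative:

```python
import base64
import json

# Stand-in "cipher": base64 instead of AES, purely to demonstrate the
# write/read symmetry. The real backend encrypts with AES.

def _encrypt(plaintext: str) -> str:
    return base64.b64encode(plaintext.encode()).decode()

def _decrypt(ciphertext: str) -> str:
    return base64.b64decode(ciphertext.encode()).decode()

def prepare_conversation_for_write(conversation: dict, level: str) -> dict:
    data = dict(conversation)
    if level == "enhanced":
        # Serialize and encrypt the sensitive field before storage
        data["transcript_segments"] = _encrypt(
            json.dumps(data["transcript_segments"]))
    return data

def prepare_conversation_for_read(conversation: dict, level: str) -> dict:
    data = dict(conversation)
    if level == "enhanced":
        # Decrypt and deserialize after retrieval
        data["transcript_segments"] = json.loads(
            _decrypt(data["transcript_segments"]))
    return data
```

Standard-level data passes through both functions untouched, so the same code path serves both protection levels.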

## Part 4: Vector Embeddings

Conversations are also stored as vector embeddings in Pinecone for semantic search.

### What Gets Embedded

| Data | Embedded? | Stored in Metadata? |
|------|-----------|---------------------|
| Title | Yes | No |
| Overview | Yes | No |
| Action Items | Yes | No |
| Full Transcript | No (too large) | No |
| People Mentioned | No | Yes |
| Topics | No | Yes |
| Entities | No | Yes |
| created_at | No | Yes |
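Per the table, the embedding input is built from the structured summary only, while filterable facts travel as metadata. A sketch of that split (the exact string format and function names are assumptions, not Omi's actual code):

```python
# Illustrative assembly of the embedding input and Pinecone metadata,
# following the table above: structured fields are embedded; people,
# topics, entities, and created_at go to metadata for filtering.

def build_embedding_text(structured: dict) -> str:
    """Concatenate the embedded fields into one text for the embedder."""
    parts = [structured.get("title", ""), structured.get("overview", "")]
    parts += structured.get("action_items", [])
    parts += [e["title"] for e in structured.get("events", [])]
    return "\n".join(p for p in parts if p)

def build_metadata(people, topics, entities, created_at) -> dict:
    """Facts stored alongside the vector as filterable metadata."""
    return {"people": people, "topics": topics,
            "entities": entities, "created_at": created_at}

text = build_embedding_text({
    "title": "Team sync",
    "overview": "Discussed the Q3 roadmap.",
    "action_items": ["Send notes to the team"],
    "events": [{"title": "Roadmap review"}],
})
```

Note that the full transcript never enters either function, matching the "No (too large)" row above.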

### Vector Creation

Vectors are created in a background thread after conversation processing:

```python
# utils/conversations/process_conversation.py
threading.Thread(
    target=save_structured_vector,
    args=(uid, conversation)
).start()
```

The `save_structured_vector()` function:

  1. Generates embedding from conversation.structured (title + overview + action_items + events)
  2. Extracts metadata via LLM (people, topics, entities, dates)
  3. Upserts to Pinecone with metadata filters
<Warning> Vectors are created ONCE during initial processing. Reprocessed conversations do NOT update their vectors. </Warning>

## Key Code Locations

| Component | File Path |
|-----------|-----------|
| Conversation Model | `backend/models/conversation.py` |
| Memory Model | `backend/models/memories.py` |
| Process Conversation | `backend/utils/conversations/process_conversation.py` |
| Database - Conversations | `backend/database/conversations.py` |
| Database - Memories | `backend/database/memories.py` |
| Router - Conversations | `backend/routers/conversations.py` |
| Router - Memories | `backend/routers/memories.py` |
| Vector Database | `backend/database/vector_db.py` |

## API Endpoints

### Conversations

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/v1/conversations` | Process and store a new conversation |
| GET | `/v1/conversations` | List user's conversations |
| GET | `/v1/conversations/{id}` | Get specific conversation |
| PATCH | `/v1/conversations/{id}/title` | Update conversation title |
| DELETE | `/v1/conversations/{id}` | Delete a conversation |

### Memories

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/v3/memories` | Create a manual memory |
| GET | `/v3/memories` | List user's memories |
| PATCH | `/v3/memories/{id}` | Edit a memory |
| DELETE | `/v3/memories/{id}` | Delete a memory |
| PATCH | `/v3/memories/{id}/visibility` | Change memory visibility |

<CardGroup cols={2}> <Card title="Chat System Architecture" icon="comments" href="/doc/developer/backend/chat_system"> How conversations are retrieved for chat using LangGraph </Card> <Card title="Real-time Transcription" icon="microphone" href="/doc/developer/backend/transcription"> WebSocket-based real-time speech-to-text </Card> <Card title="Backend Deep Dive" icon="server" href="/doc/developer/backend/backend_deepdive"> General backend architecture overview </Card> <Card title="Backend Setup" icon="gear" href="/doc/developer/backend/Backend_Setup"> Environment setup and configuration </Card> </CardGroup>