docs/doc/developer/apps/Integrations.mdx
Integration apps allow Omi to interact with external services by sending data to your webhook endpoints. Unlike prompt-based apps, these require you to host a server.
<CardGroup cols={3}>
  <Card title="Memory Triggers" icon="bell">
    Run code when a memory is created
  </Card>
  <Card title="Real-Time Transcript" icon="bolt">
    Process live transcripts as they happen
  </Card>
  <Card title="Audio Streaming" icon="microphone">
    Receive raw audio bytes for custom processing
  </Card>
</CardGroup>

```mermaid
flowchart LR
  subgraph Omi["Omi Backend"]
    M[Memory Created]
    T[Live Transcript]
    A[Audio Stream]
  end
  subgraph Your["Your Server"]
    W[Webhook Endpoint]
    P[Process Data]
    E[External Services]
  end
  M -->|POST| W
  T -->|POST| W
  A -->|POST| W
  W --> P
  P --> E
```
## Memory Triggers

These apps are activated when Omi creates a new memory, allowing you to process or store the data externally.
<AccordionGroup>
  <Accordion title="How It Works" icon="gear">
    1. User completes a conversation
    2. Omi processes and creates a memory
    3. Your webhook receives the complete memory object
    4. Your server processes and responds

    The webhook receives the full conversation data including transcript, structured summary, action items, and metadata.
  </Accordion>
</AccordionGroup>
**Running FastAPI locally (no cloud deployment):**
<iframe
width="560"
height="315"
src="https://www.youtube.com/embed/bMU6fTLysRY?si=3cvXEsWAUwKEnjHn"
title="Running FastAPI Locally"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
referrerpolicy="strict-origin-when-cross-origin"
allowfullscreen
></iframe>
Your endpoint receives a POST request with the memory object:
```http
POST /your-endpoint?uid=user123
```

```json
{
  "id": "memory_abc123",
  "created_at": "2024-07-22T23:59:45.910559+00:00",
  "started_at": "2024-07-21T22:34:43.384323+00:00",
  "finished_at": "2024-07-21T22:35:43.384323+00:00",
  "transcript_segments": [
    {
      "text": "Let's discuss the project timeline.",
      "speaker": "SPEAKER_00",
      "speakerId": 0,
      "speaker_name": "John",
      "is_user": false,
      "start": 10.0,
      "end": 15.0
    }
  ],
  "structured": {
    "title": "Project Timeline Discussion",
    "overview": "Brief overview of the conversation...",
    "emoji": "📅",
    "category": "work",
    "action_items": [
      {
        "description": "Send project proposal by Friday",
        "completed": false
      }
    ],
    "events": []
  },
  "apps_response": [],
  "discarded": false
}
```
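As a sketch of what processing this payload might look like, the handler below skips discarded memories and collects the open action items. The endpoint path and the `print` call are purely illustrative; any real integration would forward the data to its own service.

```python
from fastapi import FastAPI, Request

app = FastAPI()

@app.post("/memory-webhook")
async def on_memory_created(request: Request, uid: str):
    memory = await request.json()

    # Skip conversations Omi has marked as discarded
    if memory.get("discarded"):
        return {"status": "ignored"}

    structured = memory.get("structured", {})
    # Collect the still-open action items from this memory
    todos = [
        item["description"]
        for item in structured.get("action_items", [])
        if not item.get("completed")
    ]
    print(f"[{uid}] {structured.get('title')}: {len(todos)} open action item(s)")
    return {"status": "ok"}
```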
## Real-Time Transcript

Process conversation transcripts as they occur, enabling real-time analysis and actions.
<AccordionGroup>
  <Accordion title="How It Works" icon="gear">
    1. User starts speaking
    2. Omi transcribes in real-time
    3. Your webhook receives transcript segments as they're created
    4. Your server processes and can trigger immediate actions

    Segments arrive in multiple calls as the conversation unfolds, allowing for live reactions.
  </Accordion>
</AccordionGroup>
Your endpoint receives transcript segments with session context:
```http
POST /your-endpoint?session_id=abc123&uid=user123
```

```json
[
  {
    "text": "I think we should prioritize the mobile app.",
    "speaker": "SPEAKER_00",
    "speakerId": 0,
    "is_user": false,
    "start": 10.0,
    "end": 15.0
  },
  {
    "text": "Agreed, let's start with iOS.",
    "speaker": "SPEAKER_01",
    "speakerId": 1,
    "is_user": true,
    "start": 16.0,
    "end": 18.0
  }
]
```
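Because segments for one conversation arrive across many calls, handlers typically key their state on `session_id`. A minimal sketch, assuming an in-memory buffer and an illustrative keyword trigger:

```python
from collections import defaultdict

from fastapi import FastAPI, Request

app = FastAPI()

# In-memory transcript buffers keyed by session; use real storage in production
sessions: dict[str, list[str]] = defaultdict(list)

@app.post("/transcript-webhook")
async def on_segments(request: Request, session_id: str, uid: str):
    segments = await request.json()
    for seg in segments:
        sessions[session_id].append(seg["text"])
        # Illustrative live trigger: react the moment a keyword is spoken
        if "urgent" in seg["text"].lower():
            print(f"[{uid}] urgent mention in session {session_id}")
    return {"status": "ok"}
```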
## Audio Streaming

Stream raw audio bytes from Omi directly to your endpoint for custom audio processing.
<AccordionGroup>
  <Accordion title="How It Works" icon="gear">
    1. User speaks into Omi device
    2. Raw PCM audio is streamed to your endpoint
    3. Your server processes the audio bytes directly
    4. Handle as needed (custom STT, VAD, feature extraction, etc.)

    Unlike transcript processors, you receive the actual audio data, not text.
  </Accordion>
</AccordionGroup>
| Setting | Value |
|---|---|
| Trigger Type | `audio_bytes` |
| HTTP Method | `POST` |
| Content-Type | `application/octet-stream` |
| Audio Format | PCM16 (16-bit little-endian) |
| Bytes per Sample | 2 |
Request format:

```http
POST /your-endpoint?sample_rate=16000&uid=user123
Content-Type: application/octet-stream
```

The body contains the raw PCM16 audio bytes.
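A minimal receiving sketch, again using FastAPI; `await request.body()` reads the raw chunk, and the duration math follows from 2 bytes per sample. The endpoint path is illustrative:

```python
from fastapi import FastAPI, Request

app = FastAPI()

@app.post("/audio-webhook")
async def on_audio(request: Request, sample_rate: int, uid: str):
    chunk = await request.body()  # raw PCM16 bytes for this interval
    # PCM16 is 2 bytes per sample, so duration = bytes / (2 * sample_rate)
    seconds = len(chunk) / (2 * sample_rate)
    print(f"[{uid}] {len(chunk)} bytes (~{seconds:.1f}s at {sample_rate} Hz)")
    return {"status": "ok"}
```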
<Note> To produce a playable WAV file, prepend a WAV header and concatenate the received chunks. </Note>

You can control how often audio is sent via the Omi app Developer Settings, using the format:

```
url,seconds
```

For example, `https://your-endpoint.com/audio,5` sends audio every 5 seconds.
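As the note above says, the chunks only become a playable file once a WAV header is prepended. A sketch using Python's standard `wave` module, assuming mono audio at the sample rate given in the query string:

```python
import wave

def pcm_chunks_to_wav(chunks: list[bytes], path: str, sample_rate: int = 16000) -> None:
    """Concatenate raw PCM16 chunks and wrap them in a WAV container."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # assuming mono audio
        wav.setsampwidth(2)            # PCM16 = 2 bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(b"".join(chunks))

# e.g. pcm_chunks_to_wav(received_chunks, "session.wav", sample_rate=16000)
```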
Regardless of trigger type, your endpoint should:
- Accept POST requests
- Parse JSON body (or binary for audio)
- Read `uid` from query parameters
- Return 200 OK quickly
**Example (Python/FastAPI):**
```python
from fastapi import FastAPI, Request

app = FastAPI()


async def send_to_slack(message: str) -> None:
    # Placeholder: replace with your own notification logic.
    ...


@app.post("/webhook")
async def handle_memory(request: Request, uid: str):
    memory = await request.json()
    # Process the memory data, e.g. forward the title to Slack
    await send_to_slack(memory["structured"]["title"])
    return {"status": "ok"}
```
When submitting your integration app:
| Field | Required | Description |
|---|---|---|
| Webhook URL | Yes | Your POST endpoint for receiving data |
| Setup Completed URL | No | GET endpoint returning `{"is_setup_completed": boolean}` |
| Auth URL | No | URL for user authentication (uid appended automatically) |
| Setup Instructions | No | Text or link explaining how to configure your app |
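For instance, a Setup Completed URL can be a simple GET endpoint. This sketch assumes Omi passes the user's `uid` as a query parameter and uses an in-memory set as a stand-in for real per-user storage:

```python
from fastapi import FastAPI

app = FastAPI()

# Stand-in for wherever you persist per-user setup state
configured_users: set[str] = set()

@app.get("/setup-status")
async def setup_status(uid: str):
    return {"is_setup_completed": uid in configured_users}
```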