import { Callout, Tabs } from 'nextra/components';
DocsGPT exposes `/v1/chat/completions` following the standard chat completions protocol. Point any compatible client — opencode, Aider, LibreChat, or the OpenAI SDKs — at your DocsGPT Agent by changing only the base URL and API key.
<Tabs items={['Python', 'cURL']}>
<Tabs.Tab>

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:7091/v1",  # or https://gptcloud.arc53.com/v1
    api_key="your_agent_api_key",
)

response = client.chat.completions.create(
    model="docsgpt-agent",
    messages=[{"role": "user", "content": "Summarize our refund policy"}],
)

print(response.choices[0].message.content)
```
</Tabs.Tab>
<Tabs.Tab>
```bash
curl -X POST http://localhost:7091/v1/chat/completions \
  -H "Authorization: Bearer your_agent_api_key" \
  -H "Content-Type: application/json" \
  -d '{"model":"docsgpt-agent","messages":[{"role":"user","content":"Summarize our refund policy"}]}'
```
</Tabs.Tab>
</Tabs>
The `model` field is accepted but ignored — the agent bound to your API key determines the model. The agent's prompt, sources, tools, and default model are loaded automatically.
| Environment | Base URL |
|---|---|
| Local | http://localhost:7091/v1 |
| Cloud | https://gptcloud.arc53.com/v1 |
Authenticate with `Authorization: Bearer <agent_api_key>`.
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions | Chat request (streaming or non-streaming) |
| GET | /v1/models | List agents available to your key |
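A quick way to check which agents a key can reach is the models endpoint. A minimal sketch of parsing its response, assuming it follows the standard OpenAI list shape (`{"object": "list", "data": [...]}`); the sample payload below is illustrative:

```python
import json

# Illustrative /v1/models payload; the exact fields are an assumption
# based on the standard OpenAI list response shape.
sample = json.loads("""
{
  "object": "list",
  "data": [
    {"id": "docsgpt-agent", "object": "model", "owned_by": "docsgpt"}
  ]
}
""")

def agent_ids(payload: dict) -> list[str]:
    """Extract the agent ids available to this key from a list response."""
    return [m["id"] for m in payload.get("data", [])]

print(agent_ids(sample))
```

With the OpenAI SDK, `client.models.list()` against the same base URL returns the equivalent objects.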
Set `"stream": true`. You'll receive SSE chunks with `choices[0].delta.content`. DocsGPT-specific events (sources, tool calls) arrive as extra frames with a `docsgpt` key — standard clients ignore them.
```python
stream = client.chat.completions.create(
    model="docsgpt-agent",
    stream=True,
    messages=[{"role": "user", "content": "Explain vector search"}],
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```
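If you consume the raw SSE feed yourself, the extra frames can be picked out by key. A sketch of routing one `data:` payload, assuming DocsGPT frames carry a top-level `docsgpt` key as described above (the inner field names in the sample frames are illustrative):

```python
import json

def route_chunk(data_line: str):
    """Classify one SSE `data:` payload from /v1/chat/completions.

    Returns a (kind, value) pair: ("done", None) for the stream terminator,
    ("docsgpt", dict) for DocsGPT extra frames (sources, tool calls),
    or ("content", str) for standard delta text.
    """
    payload = data_line.removeprefix("data:").strip()
    if payload == "[DONE]":
        return ("done", None)
    obj = json.loads(payload)
    if "docsgpt" in obj:  # extra frame; standard clients skip these
        return ("docsgpt", obj["docsgpt"])
    delta = obj["choices"][0]["delta"]
    return ("content", delta.get("content") or "")

# Illustrative frames; the docsgpt frame's inner fields are assumptions.
print(route_chunk('data: {"choices":[{"delta":{"content":"Hi"}}]}'))
print(route_chunk('data: {"docsgpt":{"sources":[{"title":"Refunds"}]}}'))
print(route_chunk("data: [DONE]"))
```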
System messages are dropped by default — the agent's configured prompt is used. To allow callers to override it, enable Allow prompt override in the agent's Advanced settings.
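When the override is enabled, a system message travels like any other message. A sketch of such a request body (the prompt text is illustrative):

```python
# This system message only takes effect when "Allow prompt override" is
# enabled in the agent's Advanced settings; otherwise it is dropped
# server-side and the agent's configured prompt is used instead.
override_request = {
    "model": "docsgpt-agent",
    "messages": [
        {"role": "system", "content": "You are a terse support bot. Answer in one sentence."},
        {"role": "user", "content": "Summarize our refund policy"},
    ],
}
roles = [m["role"] for m in override_request["messages"]]
print(roles)
```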
<Callout type="warning">
When an override is active, the agent's prompt template is replaced wholesale — template variables like `{summaries}` are not substituted.
</Callout>

Conversations are not persisted by default (stateless, as most OpenAI clients expect). Opt in per request:
```json
{ "docsgpt": { "save_conversation": true } }
```
The response will include `docsgpt.conversation_id`.
Use `/api/answer` or `/stream` if you need server-side attachments, passthrough template variables, explicit `conversation_id` reuse, or persistence by default.