docs/gateway/openai-http-api.md
OpenClaw's Gateway can serve a small OpenAI-compatible Chat Completions endpoint.
This endpoint is disabled by default. Enable it in config first.
- `POST /v1/chat/completions`
- `http://<gateway-host>:<port>/v1/chat/completions`

When the Gateway's OpenAI-compatible HTTP surface is enabled, it also serves:

- `GET /v1/models`
- `GET /v1/models/{id}`
- `POST /v1/embeddings`
- `POST /v1/responses`

Under the hood, requests are executed as a normal Gateway agent run (same codepath as `openclaw agent`), so routing/permissions/config match your Gateway.
Uses the Gateway auth configuration.
Common HTTP auth paths:
- `gateway.auth.mode="token"` or `"password"`: `Authorization: Bearer <token-or-password>`
- `gateway.auth.mode="trusted-proxy"`: route through the configured identity-aware proxy and let it inject the required identity headers
- `gateway.auth.mode="none"`: no auth header required

Notes:
- When `gateway.auth.mode="token"`, use `gateway.auth.token` (or `OPENCLAW_GATEWAY_TOKEN`).
- When `gateway.auth.mode="password"`, use `gateway.auth.password` (or `OPENCLAW_GATEWAY_PASSWORD`).
- When `gateway.auth.mode="trusted-proxy"`, the HTTP request must come from a configured trusted proxy source; same-host loopback proxies require explicit `gateway.auth.trustedProxy.allowLoopback = true`.
- In trusted-proxy mode, direct local requests can use `gateway.auth.password` / `OPENCLAW_GATEWAY_PASSWORD` as a local direct fallback. Any `Forwarded`, `X-Forwarded-*`, or `X-Real-IP` header evidence keeps the request on the trusted-proxy path instead.
- If `gateway.auth.rateLimit` is configured and too many auth failures occur, the endpoint returns `429` with `Retry-After`.

Treat this endpoint as a full operator-access surface for the gateway instance.
- In the `token` and `password` auth modes, the endpoint restores the normal full operator defaults even if the caller sends a narrower `x-openclaw-scopes` header.
- Other modes (for example `gateway.auth.mode="none"`) honor `x-openclaw-scopes` when present and otherwise fall back to the normal operator default scope set.

Auth matrix:
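- `gateway.auth.mode="token"` or `"password"` + `Authorization: Bearer ...`
  - ignores a narrower `x-openclaw-scopes` header
  - restores the full default operator scope set: `operator.admin`, `operator.approvals`, `operator.pairing`, `operator.read`, `operator.talk.secrets`, `operator.write`
- `gateway.auth.mode="none"` on private ingress
  - honors `x-openclaw-scopes` when the header is present
  - falls back to the normal operator default scope set (which includes `operator.admin`) when the header is absent

See Security and Remote access.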
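The auth paths and the `429` behavior described above can be sketched from the client side. This is a minimal illustration, not the Gateway's implementation: the function names `auth_headers` and `retry_delay_seconds` are hypothetical, and the sketch assumes `Retry-After` carries a number of seconds.

```python
from typing import Mapping, Optional


def auth_headers(mode: str, secret: Optional[str] = None) -> dict[str, str]:
    """Return the request headers a client would send for a gateway.auth.mode value."""
    if mode in ("token", "password"):
        if not secret:
            raise ValueError(f"auth mode {mode!r} needs a token or password")
        return {"Authorization": f"Bearer {secret}"}
    if mode in ("trusted-proxy", "none"):
        # trusted-proxy: the proxy injects identity headers; none: no auth header.
        return {}
    raise ValueError(f"unknown auth mode: {mode!r}")


def retry_delay_seconds(
    status: int, headers: Mapping[str, str], default: float = 1.0
) -> Optional[float]:
    """Return how long to wait after a 429, or None when no retry is called for."""
    if status != 429:
        return None
    try:
        return max(0.0, float(headers.get("Retry-After", "")))
    except ValueError:
        # Assumption: a missing or non-numeric value (for example an HTTP-date)
        # falls back to a fixed default in this sketch.
        return default
```

A caller would sleep for the returned delay before retrying, rather than hammering the endpoint while the rate limit is active.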
Use /v1/chat/completions when you are integrating tooling or a trusted app-side backend with an existing gateway and can safely hold gateway operator credentials.
OpenClaw treats the OpenAI model field as an agent target, not a raw provider model id.
- `model: "openclaw"` routes to the configured default agent.
- `model: "openclaw/default"` also routes to the configured default agent.
- `model: "openclaw/<agentId>"` routes to a specific agent.

Optional request headers:

- `x-openclaw-model: <provider/model-or-bare-id>` overrides the backend model for the selected agent.
- `x-openclaw-agent-id: <agentId>` remains supported as a compatibility override.
- `x-openclaw-session-key: <sessionKey>` fully controls session routing.
- `x-openclaw-message-channel: <channel>` sets the synthetic ingress channel context for channel-aware prompts and policies.

Compatibility aliases still accepted:

- `model: "openclaw:<agentId>"`
- `model: "agent:<agentId>"`

Set `gateway.http.endpoints.chatCompletions.enabled` to `true`:
{
  gateway: {
    http: {
      endpoints: {
        chatCompletions: { enabled: true },
      },
    },
  },
}
Set gateway.http.endpoints.chatCompletions.enabled to false:
{
  gateway: {
    http: {
      endpoints: {
        chatCompletions: { enabled: false },
      },
    },
  },
}
By default the endpoint is stateless per request (a new session key is generated each call).
If the request includes an OpenAI user string, the Gateway derives a stable session key from it, so repeated calls can share an agent session.
For custom apps, the safest default is to reuse the same user value per conversation thread. Avoid account-level identifiers unless you explicitly want multiple conversations or devices to share one OpenClaw session. Use x-openclaw-session-key when you need explicit routing control across multiple clients or threads.
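As a rough sketch of the per-conversation pattern, the helper below builds (but does not send) a chat request whose `user` value is derived from a conversation id. The helper name `build_chat_request`, the `conv:` prefix convention, and the local base URL are assumptions for illustration; only the `user` field and the `x-openclaw-session-key` header come from the text above.

```python
import json
import urllib.request
from typing import Optional

# Assumption: a local gateway on the port used in this page's curl examples.
GATEWAY_BASE_URL = "http://127.0.0.1:18789/v1"


def build_chat_request(
    token: str,
    conversation_id: str,
    text: str,
    session_key: Optional[str] = None,
) -> urllib.request.Request:
    """Build a POST request that reuses one `user` value per conversation thread."""
    body = {
        "model": "openclaw/default",
        # Same conversation id -> same user string -> same derived session key.
        "user": f"conv:{conversation_id}",
        "messages": [{"role": "user", "content": text}],
    }
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    if session_key is not None:
        # Explicit routing control across multiple clients or threads.
        headers["x-openclaw-session-key"] = session_key
    return urllib.request.Request(
        f"{GATEWAY_BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; two calls with the same `conversation_id` carry the same `user` string and so should land in the same agent session.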
This is the highest-leverage compatibility set for self-hosted frontends and tooling:
- `/v1/models`
- `/v1/embeddings`
- `/v1/chat/completions`
- `/v1/responses`

The ids returned by `/v1/models` are `openclaw`, `openclaw/default`, and `openclaw/<agentId>` entries.
Use them directly as OpenAI `model` values.
Sub-agents remain internal execution topology. They do not appear as pseudo-models.
`openclaw/default` is a stable alias for the configured default agent, so clients can keep using one predictable id even if the real default agent id changes between environments.
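The agent-target contract for the `model` field can be illustrated with a small parser. This is a sketch of the documented mapping only, not the Gateway's actual routing code; the function name `resolve_agent_target` and the choice to return `None` for the default agent are assumptions.

```python
from typing import Optional


def resolve_agent_target(model: str) -> Optional[str]:
    """Map an OpenAI `model` value to an agent id, or None for the default agent."""
    if model in ("openclaw", "openclaw/default"):
        return None  # the configured default agent
    # Canonical form first, then the compatibility aliases.
    for prefix in ("openclaw/", "openclaw:", "agent:"):
        if model.startswith(prefix):
            agent_id = model[len(prefix):]
            if agent_id:
                return agent_id
    raise ValueError(f"not an OpenClaw agent target: {model!r}")
```

For example, `resolve_agent_target("openclaw/research")` yields `"research"`, while a raw provider id such as `"gpt-4"` is rejected because the `model` field names an agent, not a provider model.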
Examples of overriding the backend model with `x-openclaw-model`:

- `x-openclaw-model: openai/gpt-5.4`
- `x-openclaw-model: gpt-5.5`

If you omit the header, the selected agent runs with its normal configured model choice.
For `/v1/embeddings`, use `model: "openclaw/default"` or `model: "openclaw/<agentId>"`.
When you need a specific embedding model, send it in `x-openclaw-model`.
Without that header, the request passes through to the selected agent's normal embedding setup.
Set stream: true to receive Server-Sent Events (SSE):
- `Content-Type: text/event-stream`
- Each event line is `data: <json>`
- The stream ends with `data: [DONE]`

`/v1/chat/completions` supports a function-tool subset compatible with common OpenAI Chat clients.

- `tools`: array of `{ "type": "function", "function": { ... } }`
- `tool_choice`: `"auto"`, `"none"`
- `messages[*].role: "tool"` follow-up turns
- `messages[*].tool_call_id` for binding tool results back to a prior tool call
- `max_completion_tokens`: number; per-call cap for total completion tokens (reasoning tokens included). Current OpenAI Chat Completions field name; preferred when both `max_completion_tokens` and `max_tokens` are sent.
- `max_tokens`: number; legacy alias accepted for backwards compatibility. Ignored when `max_completion_tokens` is also present.
- `temperature`: number; best-effort sampling temperature forwarded to the upstream provider via the agent stream-param channel.
- `top_p`: number; best-effort nucleus sampling forwarded to the upstream provider via the agent stream-param channel.

When either token-cap field is set, the value is forwarded to the upstream provider via the agent stream-param channel. The actual wire field name sent to the upstream provider is chosen by the provider transport: `max_completion_tokens` for OpenAI-family endpoints, and `max_tokens` for providers that only accept the legacy name (such as Mistral and Chutes). Sampling fields (`temperature`, `top_p`) follow the same stream-param channel; the ChatGPT-based Codex Responses backend strips them server-side since it uses fixed sampling.
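The precedence between the two token-cap fields, and the choice of wire field name, can be sketched as follows. The names `effective_token_cap`, `upstream_cap_field`, and `LEGACY_ONLY_PROVIDERS` are hypothetical, and treating every provider outside that set as accepting `max_completion_tokens` is an assumption of this sketch rather than documented behavior.

```python
from typing import Optional

# Assumption: illustrative provider names that only accept the legacy field.
LEGACY_ONLY_PROVIDERS = {"mistral", "chutes"}


def effective_token_cap(body: dict) -> Optional[int]:
    """Pick the per-call cap: max_completion_tokens wins over max_tokens."""
    if body.get("max_completion_tokens") is not None:
        return body["max_completion_tokens"]
    return body.get("max_tokens")


def upstream_cap_field(provider: str) -> str:
    """Choose the wire field name a provider transport would send upstream."""
    if provider.lower() in LEGACY_ONLY_PROVIDERS:
        return "max_tokens"
    return "max_completion_tokens"
```

With both fields present, for example `{"max_completion_tokens": 256, "max_tokens": 1024}`, the effective cap is 256 and the legacy value is ignored.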
The endpoint returns 400 invalid_request_error for unsupported tool variants, including:
- `tools`
- `tool.function.name`
- `tool_choice` variants such as `allowed_tools` and `custom`
- `tool_choice: "required"` (not yet enforced at runtime; will be supported once hard enforcement is implemented)
- `tool_choice: { "type": "function", "function": { "name": "..." } }` (same rationale as `required`)
- `tool_choice.function.name` values that do not match provided tools

When the agent decides to call tools, the response uses:

- `choices[0].finish_reason = "tool_calls"`
- `choices[0].message.tool_calls[]` entries with:
  - `id`
  - `type: "function"`
  - `function.name`
  - `function.arguments` (JSON string)

Assistant commentary before the tool call is returned in `choices[0].message.content` (possibly empty).
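A client might read that response shape along these lines. The helper name `extract_tool_calls` is hypothetical; the sketch relies only on the fields listed above and decodes `function.arguments` from its JSON string form.

```python
import json


def extract_tool_calls(response: dict) -> list[dict]:
    """Return decoded tool calls from a non-streaming chat completion response."""
    choice = response["choices"][0]
    if choice.get("finish_reason") != "tool_calls":
        return []  # plain assistant answer, nothing to execute
    calls = []
    for call in choice["message"].get("tool_calls") or []:
        calls.append({
            "id": call["id"],
            "name": call["function"]["name"],
            # function.arguments arrives as a JSON string, not an object.
            "arguments": json.loads(call["function"]["arguments"] or "{}"),
        })
    return calls
```

An empty list means the response is a final answer and `choices[0].message.content` can be shown directly.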
When stream: true, tool calls are emitted as incremental SSE chunks:
- `delta.tool_calls` chunks carrying tool identity and argument fragments
- `finish_reason: "tool_calls"`
- `data: [DONE]`

If `stream_options.include_usage=true`, a trailing usage chunk is emitted before `[DONE]`.
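Reassembling streamed tool calls on the client could look like the sketch below. It assumes the chunks follow the usual OpenAI streaming shape, where each `delta.tool_calls` fragment carries an `index` and argument text arrives in pieces; the function name `accumulate_tool_calls` is hypothetical.

```python
import json


def accumulate_tool_calls(sse_lines):
    """Merge streamed delta.tool_calls fragments into complete tool calls."""
    calls = {}  # fragment index -> {"id", "name", "arguments"}
    finish_reason = None
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        # A trailing usage chunk may carry no choices; the loop then does nothing.
        for choice in chunk.get("choices") or []:
            finish_reason = choice.get("finish_reason") or finish_reason
            for frag in (choice.get("delta") or {}).get("tool_calls") or []:
                call = calls.setdefault(
                    frag.get("index", 0), {"id": None, "name": None, "arguments": ""}
                )
                call["id"] = frag.get("id") or call["id"]
                fn = frag.get("function") or {}
                call["name"] = fn.get("name") or call["name"]
                call["arguments"] += fn.get("arguments") or ""
    return [calls[i] for i in sorted(calls)], finish_reason
```

Once the stream ends, each accumulated `arguments` string can be parsed with `json.loads` before the function is executed.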
After receiving tool_calls, the client should execute the requested function(s) and send a follow-up request that includes:
- `role: "tool"` messages with matching `tool_call_id`

This allows the gateway agent run to continue the same reasoning loop and produce the final assistant answer.
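The follow-up turn can be sketched as below. The helper `build_tool_followup` and the `handlers` registry are hypothetical, and replaying the assistant tool-call message ahead of the tool results follows the usual OpenAI Chat convention, which is an assumption here rather than something this page states.

```python
import json


def build_tool_followup(messages, assistant_message, handlers):
    """Run requested functions locally and build the follow-up `messages` list."""
    # Assumption: the prior assistant tool-call message is replayed first.
    followup = list(messages) + [assistant_message]
    for call in assistant_message.get("tool_calls") or []:
        fn = call["function"]
        args = json.loads(fn["arguments"] or "{}")
        result = handlers[fn["name"]](**args)
        followup.append({
            "role": "tool",
            "tool_call_id": call["id"],  # binds the result to the original call
            "content": json.dumps(result),
        })
    return followup
```

The returned list goes into the `messages` field of the next `/v1/chat/completions` request, alongside the same `tools` array.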
For a basic Open WebUI connection:
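- Base URL: `http://127.0.0.1:18789/v1`
- Base URL when Open WebUI runs in Docker: `http://host.docker.internal:18789/v1`
- Model: `openclaw/default`

Expected behavior:

- `GET /v1/models` should list `openclaw/default`
- Open WebUI should use `openclaw/default` as the chat model id
- To pick a specific backend model, send `x-openclaw-model`

Quick smoke test: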
curl -sS http://127.0.0.1:18789/v1/models \
  -H 'Authorization: Bearer YOUR_TOKEN'
If that returns openclaw/default, most Open WebUI setups can connect with the same base URL and token.
Stable session for one app conversation:
curl -sS http://127.0.0.1:18789/v1/chat/completions \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "openclaw/default",
    "user": "conv:YOUR_CONVERSATION_ID",
    "messages": [{"role":"user","content":"Summarize my tasks for today"}]
  }'
Reuse the same user value on later calls for that conversation to continue the same agent session.
Non-streaming:
curl -sS http://127.0.0.1:18789/v1/chat/completions \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "openclaw/default",
    "messages": [{"role":"user","content":"hi"}]
  }'
Streaming:
curl -N http://127.0.0.1:18789/v1/chat/completions \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -H 'x-openclaw-model: openai/gpt-5.4' \
  -d '{
    "model": "openclaw/research",
    "stream": true,
    "messages": [{"role":"user","content":"hi"}]
  }'
List models:
curl -sS http://127.0.0.1:18789/v1/models \
  -H 'Authorization: Bearer YOUR_TOKEN'
Fetch one model:
curl -sS http://127.0.0.1:18789/v1/models/openclaw%2Fdefault \
  -H 'Authorization: Bearer YOUR_TOKEN'
Create embeddings:
curl -sS http://127.0.0.1:18789/v1/embeddings \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -H 'x-openclaw-model: openai/text-embedding-3-small' \
  -d '{
    "model": "openclaw/default",
    "input": ["alpha", "beta"]
  }'
Notes:
- `/v1/models` returns OpenClaw agent targets, not raw provider catalogs.
- `openclaw/default` is always present so one stable id works across environments.
- Backend model overrides go in `x-openclaw-model`, not the OpenAI `model` field.
- `/v1/embeddings` supports `input` as a string or array of strings.