docs/gateway/openai-http-api.md
OpenClaw's Gateway can serve a small OpenAI-compatible Chat Completions endpoint.
This endpoint is disabled by default. Enable it in config first.
- Endpoint: `POST /v1/chat/completions`
- URL: `http://<gateway-host>:<port>/v1/chat/completions`

When the Gateway's OpenAI-compatible HTTP surface is enabled, it also serves:

- `GET /v1/models`
- `GET /v1/models/{id}`
- `POST /v1/embeddings`
- `POST /v1/responses`

Under the hood, requests are executed as a normal Gateway agent run (same codepath as `openclaw agent`), so routing, permissions, and config match your Gateway.
Uses the Gateway auth configuration.
Common HTTP auth paths:
- Token or password (`gateway.auth.mode="token"` or `"password"`):
  send `Authorization: Bearer <token-or-password>`.
- Trusted proxy (`gateway.auth.mode="trusted-proxy"`):
  route through the configured identity-aware proxy and let it inject the
  required identity headers.
- No auth (`gateway.auth.mode="none"`):
  no auth header required.

Notes:
- With `gateway.auth.mode="token"`, use `gateway.auth.token` (or `OPENCLAW_GATEWAY_TOKEN`).
- With `gateway.auth.mode="password"`, use `gateway.auth.password` (or `OPENCLAW_GATEWAY_PASSWORD`).
- With `gateway.auth.mode="trusted-proxy"`, the HTTP request must come from a
  configured trusted proxy source; same-host loopback proxies require explicit
  `gateway.auth.trustedProxy.allowLoopback = true`.
- If `gateway.auth.rateLimit` is configured and too many auth failures occur, the endpoint returns `429` with a `Retry-After` header.

Treat this endpoint as a full operator-access surface for the gateway instance.

- Authenticated modes (`token` and `password`): the endpoint applies the normal full operator defaults even if the caller sends a narrower `x-openclaw-scopes` header.
- Unauthenticated mode (`gateway.auth.mode="none"`): honors `x-openclaw-scopes` when present and otherwise falls back to the normal operator default scope set.

Auth matrix:
| Mode | `x-openclaw-scopes` header | Granted scopes |
| --- | --- | --- |
| `gateway.auth.mode="token"` or `"password"` + `Authorization: Bearer ...` | ignored | `operator.admin`, `operator.approvals`, `operator.pairing`, `operator.read`, `operator.talk.secrets`, `operator.write` |
| Unauthenticated (`gateway.auth.mode="none"` on private ingress) | honored when the header is present | `operator.admin` by default |

See Security and Remote access.
OpenClaw treats the OpenAI model field as an agent target, not a raw provider model id.
- `model: "openclaw"` routes to the configured default agent.
- `model: "openclaw/default"` also routes to the configured default agent.
- `model: "openclaw/<agentId>"` routes to a specific agent.

Optional request headers:

- `x-openclaw-model: <provider/model-or-bare-id>` overrides the backend model for the selected agent.
- `x-openclaw-agent-id: <agentId>` remains supported as a compatibility override.
- `x-openclaw-session-key: <sessionKey>` fully controls session routing.
- `x-openclaw-message-channel: <channel>` sets the synthetic ingress channel context for channel-aware prompts and policies.

Compatibility aliases still accepted:
- `model: "openclaw:<agentId>"`
- `model: "agent:<agentId>"`

Set `gateway.http.endpoints.chatCompletions.enabled` to `true`:

```json5
{
  gateway: {
    http: {
      endpoints: {
        chatCompletions: { enabled: true },
      },
    },
  },
}
```
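The model-routing rules above can be sketched as a small resolver. This is illustrative only: `resolve_agent_target` is not part of OpenClaw, and the real Gateway implementation may differ.

```python
def resolve_agent_target(model: str) -> str:
    """Map an OpenAI-style `model` value to an OpenClaw agent id.

    Illustrative sketch of the documented routing rules; not an
    OpenClaw API.
    """
    # Compatibility aliases: "openclaw:<agentId>" and "agent:<agentId>"
    for prefix in ("openclaw:", "agent:"):
        if model.startswith(prefix):
            return model[len(prefix):]
    # "openclaw" and "openclaw/default" -> the configured default agent
    if model in ("openclaw", "openclaw/default"):
        return "default"
    # "openclaw/<agentId>" -> a specific agent
    if model.startswith("openclaw/"):
        return model[len("openclaw/"):]
    raise ValueError(f"unknown model target: {model!r}")
```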
Set `gateway.http.endpoints.chatCompletions.enabled` to `false`:

```json5
{
  gateway: {
    http: {
      endpoints: {
        chatCompletions: { enabled: false },
      },
    },
  },
}
```
By default the endpoint is stateless per request (a new session key is generated each call).
If the request includes an OpenAI user string, the Gateway derives a stable session key from it, so repeated calls can share an agent session.
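The stable-vs-ephemeral behavior can be illustrated with a sketch. The Gateway's actual derivation is internal and undocumented here; the SHA-256 hash below is purely an assumption to show that the same `user` string yields the same session key while omitting it yields a fresh key each call.

```python
import hashlib
import uuid
from typing import Optional

def session_key_for(user: Optional[str]) -> str:
    """Illustrate stable vs. ephemeral session keys.

    Hypothetical: SHA-256 stands in for whatever derivation the
    Gateway really uses.
    """
    if user is None:
        # Stateless default: a fresh session key per request
        return uuid.uuid4().hex
    # Stable: the same user string maps to the same session key
    return hashlib.sha256(user.encode("utf-8")).hexdigest()
```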
This is the highest-leverage compatibility set for self-hosted frontends and tooling:
- `/v1/models`
- `/v1/embeddings`
- `/v1/chat/completions`
- `/v1/responses`

The returned ids are `openclaw`, `openclaw/default`, and `openclaw/<agentId>` entries.
Use them directly as OpenAI `model` values.
Sub-agents remain internal execution topology. They do not appear as pseudo-models.
That means clients can keep using one predictable id even if the real default agent id changes between environments.
Examples:

- `x-openclaw-model: openai/gpt-5.4`
- `x-openclaw-model: gpt-5.5`
If you omit it, the selected agent runs with its normal configured model choice.
Use `model: "openclaw/default"` or `model: "openclaw/<agentId>"`.
When you need a specific embedding model, send it in `x-openclaw-model`.
Without that header, the request passes through to the selected agent's normal embedding setup.
Set `stream: true` to receive Server-Sent Events (SSE):

- `Content-Type: text/event-stream`
- `data: <json>` chunks
- `data: [DONE]` terminator

`/v1/chat/completions` supports a function-tool subset compatible with common OpenAI Chat clients.

- `tools`: array of `{ "type": "function", "function": { ... } }`
- `tool_choice`: `"auto"`, `"none"`
- `messages[*].role: "tool"` follow-up turns
- `messages[*].tool_call_id` for binding tool results back to a prior tool call
- `max_completion_tokens`: number; per-call cap for total completion tokens (reasoning tokens included). Current OpenAI Chat Completions field name; preferred when both `max_completion_tokens` and `max_tokens` are sent.
- `max_tokens`: number; legacy alias accepted for backwards compatibility. Ignored when `max_completion_tokens` is also present.

When either field is set, the value is forwarded to the upstream provider via the agent stream-param channel. The actual wire field name sent upstream is chosen by the provider transport: `max_completion_tokens` for OpenAI-family endpoints, and `max_tokens` for providers that only accept the legacy name (such as Mistral and Chutes).
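The precedence between the two token-cap fields can be shown with a tiny helper (`effective_completion_cap` is hypothetical, not an OpenClaw API):

```python
from typing import Optional

def effective_completion_cap(body: dict) -> Optional[int]:
    """Pick the completion-token cap per the documented precedence.

    Hypothetical helper: `max_completion_tokens` wins when both fields
    are present; `max_tokens` is the legacy fallback.
    """
    if "max_completion_tokens" in body:
        return body["max_completion_tokens"]
    return body.get("max_tokens")
```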
The endpoint returns `400 invalid_request_error` for unsupported tool variants, including:

- `tools` entries that are not `{ "type": "function", ... }`
- `tools` entries missing `tool.function.name`
- `tool_choice` variants such as `allowed_tools` and `custom`
- `tool_choice: "required"` (not yet enforced at runtime; will be supported once hard enforcement is implemented)
- `tool_choice: { "type": "function", "function": { "name": "..." } }` (same rationale as `required`)
- `tool_choice.function.name` values that do not match provided tools

When the agent decides to call tools, the response uses:

- `choices[0].finish_reason = "tool_calls"`
- `choices[0].message.tool_calls[]` entries with:
  - `id`
  - `type: "function"`
  - `function.name`
  - `function.arguments` (JSON string)

Assistant commentary before the tool call is returned in `choices[0].message.content` (possibly empty).
When `stream: true`, tool calls are emitted as incremental SSE chunks:

- `delta.tool_calls` chunks carrying tool identity and argument fragments
- `finish_reason: "tool_calls"`
- `data: [DONE]`

If `stream_options.include_usage=true`, a trailing usage chunk is emitted before `[DONE]`.
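A client-side sketch of reassembling streamed tool calls, following the OpenAI streaming shape described above (the helper name and canned chunks are illustrative, not OpenClaw code):

```python
import json

def collect_tool_calls(sse_lines):
    """Assemble tool calls from streaming `data:` lines.

    Concatenates `function.arguments` fragments from `delta.tool_calls`
    chunks, keyed by tool-call `index`, until `data: [DONE]`.
    """
    calls = {}  # index -> {"id", "name", "arguments"}
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        delta = json.loads(payload)["choices"][0].get("delta", {})
        for tc in delta.get("tool_calls", []):
            entry = calls.setdefault(
                tc["index"], {"id": "", "name": "", "arguments": ""}
            )
            if "id" in tc:
                entry["id"] = tc["id"]
            fn = tc.get("function", {})
            entry["name"] = fn.get("name") or entry["name"]
            entry["arguments"] += fn.get("arguments", "")
    return [calls[i] for i in sorted(calls)]
```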
After receiving tool_calls, the client should execute the requested function(s) and send a follow-up request that includes:
- `role: "tool"` messages with matching `tool_call_id`

This allows the gateway agent run to continue the same reasoning loop and produce the final assistant answer.
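Building that follow-up request can be sketched as below. The helper name and result-lookup shape are illustrative assumptions; only the `role: "tool"` / `tool_call_id` binding comes from the protocol described above.

```python
import json

def tool_followup_messages(messages, assistant_msg, results):
    """Build the follow-up `messages` array after a tool_calls response.

    Echo the assistant message (with its `tool_calls`), then append one
    `role: "tool"` message per result, bound by `tool_call_id`.
    `results` maps function name -> JSON-serializable result (an
    assumption for this sketch).
    """
    followup = list(messages) + [assistant_msg]
    for call in assistant_msg["tool_calls"]:
        followup.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(results[call["function"]["name"]]),
        })
    return followup
```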
For a basic Open WebUI connection:
- Base URL: `http://127.0.0.1:18789/v1` (or `http://host.docker.internal:18789/v1` from inside Docker)
- Model id: `openclaw/default`

Expected behavior:

- `GET /v1/models` should list `openclaw/default`
- Select `openclaw/default` as the chat model id
- Override the backend model per request with `x-openclaw-model`

Quick smoke:
```bash
curl -sS http://127.0.0.1:18789/v1/models \
  -H 'Authorization: Bearer YOUR_TOKEN'
```
If that returns `openclaw/default`, most Open WebUI setups can connect with the same base URL and token.
Non-streaming:
```bash
curl -sS http://127.0.0.1:18789/v1/chat/completions \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "openclaw/default",
    "messages": [{"role":"user","content":"hi"}]
  }'
```
Streaming:
```bash
curl -N http://127.0.0.1:18789/v1/chat/completions \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -H 'x-openclaw-model: openai/gpt-5.4' \
  -d '{
    "model": "openclaw/research",
    "stream": true,
    "messages": [{"role":"user","content":"hi"}]
  }'
```
List models:
```bash
curl -sS http://127.0.0.1:18789/v1/models \
  -H 'Authorization: Bearer YOUR_TOKEN'
```
Fetch one model:
```bash
curl -sS http://127.0.0.1:18789/v1/models/openclaw%2Fdefault \
  -H 'Authorization: Bearer YOUR_TOKEN'
```
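The `%2F` in the path is the percent-encoded `/` inside the model id, which must be escaped when used as a path segment. In Python, for example:

```python
from urllib.parse import quote

# Model ids contain "/", which must be percent-encoded when placed in
# the /v1/models/{id} path segment.
model_id = "openclaw/default"
path = "/v1/models/" + quote(model_id, safe="")
# path == "/v1/models/openclaw%2Fdefault"
```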
Create embeddings:
```bash
curl -sS http://127.0.0.1:18789/v1/embeddings \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -H 'x-openclaw-model: openai/text-embedding-3-small' \
  -d '{
    "model": "openclaw/default",
    "input": ["alpha", "beta"]
  }'
```
Notes:
- `/v1/models` returns OpenClaw agent targets, not raw provider catalogs.
- `openclaw/default` is always present so one stable id works across environments.
- The backend embedding model is selected with `x-openclaw-model`, not the OpenAI `model` field.
- `/v1/embeddings` supports `input` as a string or array of strings.