Kilo Code supports a wide range of AI model providers that offer APIs compatible with the OpenAI API standard, letting you use models from providers other than OpenAI through a familiar API interface. This includes local inference servers such as vLLM as well as hosted providers that expose an OpenAI-compatible endpoint.
This document focuses on setting up providers other than the official OpenAI API (which has its own dedicated configuration page).
{% tabs %}
{% tab label="VSCode (Legacy)" %}
The key to using an OpenAI-compatible provider is to configure two main settings:

- **Base URL:** Your provider's API endpoint. Note that this is not `https://api.openai.com/v1` (that's for the official OpenAI API).
- **API Key:** The key issued by your provider.

You'll find these settings in the Kilo Code settings panel (click the {% codicon name="gear" /%} icon).
{% /tab %}

{% tab label="VSCode" %}
In the provider settings, enter:

1. **Provider Name:** Any identifier you like (e.g., `my-provider`).
2. **Base URL:** Your provider's API endpoint (e.g., `https://api.your-provider.com/v1`). Kilo auto-fetches available models when a valid URL is entered.
3. **API Key:** The key issued by your provider.

For additional model configuration (token limits, tool calling, variants), edit the `kilo.jsonc` config file directly; see the CLI tab or the Custom Models guide.
When configuring a custom OpenAI-compatible provider, Kilo Code can automatically detect available models from your provider's `/v1/models` endpoint.

Once you enter a valid Base URL and API Key, Kilo Code queries the provider and presents a searchable model picker listing all available models.

This eliminates the need to manually look up and type model IDs. If auto-detection fails (for example, if the provider doesn't support the `/v1/models` endpoint), you can still enter model IDs manually.
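If you want to verify what the model picker will see, you can query the endpoint yourself. Below is a minimal sketch in TypeScript (Node 18+), assuming a standard OpenAI-compatible `/v1/models` response; `BASE_URL` and the `MY_PROVIDER_API_KEY` environment variable are placeholders for your own values:

```typescript
// List the models an OpenAI-compatible provider advertises.
// BASE_URL and MY_PROVIDER_API_KEY are placeholders: substitute your own.
const BASE_URL = "https://api.your-provider.com/v1";
const API_KEY = process.env.MY_PROVIDER_API_KEY ?? "none";

async function listModels(): Promise<void> {
  const res = await fetch(`${BASE_URL}/models`, {
    headers: { Authorization: `Bearer ${API_KEY}` },
  });
  if (!res.ok) {
    throw new Error(`Provider returned ${res.status} ${res.statusText}`);
  }
  // Compatible servers respond with { "object": "list", "data": [{ "id": "..." }, ...] }.
  const body = (await res.json()) as { data: { id: string }[] };
  for (const model of body.data) {
    console.log(model.id);
  }
}

listModels().catch(console.error);
```

If this prints model IDs, the picker should populate; if it errors, the provider likely doesn't implement `/v1/models` and you'll need to enter model IDs manually.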
{% /tab %}

{% tab label="CLI" %}
Define a custom provider in your `kilo.json` config file (`~/.config/kilo/kilo.json` or `./kilo.json`). The provider key (e.g., `"vllm"`) is your chosen identifier; it can be any name you like.
You must define at least one model. Setting `name` and `limit` (context window and max output tokens) is recommended so the agent can manage context correctly:
```jsonc
{
  "provider": {
    "vllm": {
      "models": {
        "qwen35": {
          "name": "Qwen 3.5",
          "limit": {
            "context": 262144,
            "output": 16384,
          },
        },
      },
      "options": {
        "apiKey": "none",
        "baseURL": "http://my.url:8000/v1",
      },
    },
  },
}
```
Then set your default model using the `provider-id/model-id` format:
```jsonc
{
  "model": "vllm/qwen35",
}
```
Configuration fields:
- `models`: A map of model IDs to model definitions. Each model should include a `name` and a `limit` with `context` and `output` token counts. If `limit.context` or `limit.output` is omitted, it defaults to 0, which limits context management.
- `options.baseURL`: The base URL of your OpenAI-compatible API endpoint.
- `options.apiKey`: Your API key. Use any non-empty string (e.g., `"none"`) if the provider doesn't require authentication.

You can also set the API key via an environment variable instead of putting it in the config file. Use the `env` field to specify which variable to read:
```jsonc
{
  "provider": {
    "my-provider": {
      "env": ["MY_PROVIDER_API_KEY"],
      "models": {
        "my-model": {
          "name": "My Model",
          "limit": { "context": 128000, "output": 4096 },
        },
      },
      "options": {
        "baseURL": "https://api.my-provider.com/v1",
      },
    },
  },
}
```
{% /tab %}
{% /tabs %}
Kilo Code supports full endpoint URLs in the Base URL field, providing greater flexibility for provider configuration:
**Standard Base URL format:**

```
https://api.provider.com/v1
```

**Full endpoint URL format:**

```
https://api.provider.com/v1/chat/completions
https://custom-endpoint.provider.com/api/v2/models/chat
```
This flexibility lets you point Kilo Code at providers and gateways whose chat completions endpoint doesn't follow the standard `<base URL>/chat/completions` layout.
Note: When using full endpoint URLs, ensure the URL points to the correct chat completions endpoint for your provider.
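To confirm a URL really is a chat completions endpoint before wiring it into Kilo Code, you can send a minimal request by hand. Here's a small sketch in TypeScript (Node 18+), assuming a standard OpenAI-compatible chat completions API; `ENDPOINT`, the `MY_PROVIDER_API_KEY` variable, and the model ID are placeholders:

```typescript
// Smoke-test a chat completions endpoint (a full URL, or base URL + "/chat/completions").
// ENDPOINT, MY_PROVIDER_API_KEY, and the model ID are placeholders: substitute your own.
const ENDPOINT = "https://api.provider.com/v1/chat/completions";
const API_KEY = process.env.MY_PROVIDER_API_KEY ?? "none";

async function smokeTest(): Promise<void> {
  const res = await fetch(ENDPOINT, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${API_KEY}`,
    },
    body: JSON.stringify({
      model: "qwen35", // any model ID your provider serves
      messages: [{ role: "user", content: "Reply with OK." }],
      max_tokens: 8,
    }),
  });
  if (!res.ok) {
    throw new Error(`Endpoint rejected the request: ${res.status} ${res.statusText}`);
  }
  // A compatible endpoint returns the reply at choices[0].message.content.
  const body = await res.json();
  console.log(body.choices?.[0]?.message?.content);
}

smokeTest().catch(console.error);
```

A 404 response here usually means the path is wrong, while a 401 means the key isn't being accepted.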
By using an OpenAI-compatible provider, you can leverage the flexibility of Kilo Code with a wider range of AI models. Remember to always consult your provider's documentation for the most accurate and up-to-date information.