# Custom providers
Three ways to add a provider ZeroClaw doesn't ship with:

1. Point an `openai-compatible` entry at the endpoint. Works for ~80% of cases.
2. Use one of the first-class local adapters (`llamacpp`, `sglang`, `vllm`). Thin wrappers with sensible defaults.
3. Implement the `Provider` trait in Rust. For anything that's not OpenAI-compatible.

## OpenAI-compatible endpoint

If the service speaks OpenAI chat-completions, this is a config-only change:
```toml
[providers.models.my-endpoint]
kind = "openai-compatible"
base_url = "https://my-gateway.example.com"
model = "my-model-id"
api_key = "..."   # omit if the endpoint needs no auth
```
Then reference it:
```toml
default_model = "my-endpoint"
```
This is the same implementation used for Groq, Mistral, xAI, and every other OpenAI-compat provider in the catalog.
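For comparison, a hand-written entry for one of those catalog providers looks like this. It is illustrative only — Groq's public OpenAI-compatible base URL, a current model id, and a placeholder key; the catalog already ships this, so you wouldn't normally add it yourself:

```toml
[providers.models.groq-llama]
kind = "openai-compatible"
base_url = "https://api.groq.com/openai/v1"   # Groq's OpenAI-compatible endpoint
model = "llama-3.3-70b-versatile"
api_key = "gsk_..."
```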
## First-class local adapters

ZeroClaw ships tight adapters for three popular local-inference stacks. They're `openai-compatible` under the hood, but with defaults and quality-of-life tuning pre-applied.
### llama.cpp (llama-server)

```bash
llama-server -hf ggml-org/gpt-oss-20b-GGUF --jinja -c 133000 --host 127.0.0.1 --port 8033
```

```toml
[providers.models.llama]
kind = "llamacpp"                       # alias: "llama.cpp"
base_url = "http://127.0.0.1:8033/v1"   # omit to use the default http://localhost:8080/v1
model = "ggml-org/gpt-oss-20b-GGUF"
# api_key only required if llama-server was started with --api-key
```
### SGLang

```bash
python -m sglang.launch_server --model-path meta-llama/Llama-3.1-8B-Instruct --port 30000
```

```toml
[providers.models.sglang]
kind = "sglang"
base_url = "http://localhost:30000/v1"   # default
model = "meta-llama/Llama-3.1-8B-Instruct"
```
### vLLM

```bash
vllm serve meta-llama/Llama-3.1-8B-Instruct
```

```toml
[providers.models.vllm]
kind = "vllm"
base_url = "http://localhost:8000/v1"   # default
model = "meta-llama/Llama-3.1-8B-Instruct"
```
Regardless of which approach you chose, verify the new provider with:

```bash
zeroclaw models refresh --provider <name>   # list models the endpoint advertises
zeroclaw agent -m "hello"                   # smoke-test with a one-shot message
```
## Provider trait

If the endpoint isn't OpenAI-compatible and isn't one of the first-class local adapters, you need code.

The trait lives in `crates/zeroclaw-api/src/provider.rs`:
```rust
#[async_trait]
pub trait Provider: Send + Sync {
    fn name(&self) -> &str;
    fn supports_streaming(&self) -> bool { true }
    fn supports_streaming_tool_events(&self) -> bool { false }
    async fn chat(
        &self,
        messages: Vec<Message>,
        tools: Vec<ToolSchema>,
        options: ChatOptions,
    ) -> Pin<Box<dyn Stream<Item = Result<StreamEvent>> + Send>>;
}
```
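A minimal sketch of an implementation, assuming the types above come from `zeroclaw-api`. The `EchoProvider` name, the `Message.content` field, and the `StreamEvent::Text` variant are illustrative assumptions, not the crate's actual definitions — check the real types before copying this:

```rust
use std::pin::Pin;

use async_trait::async_trait;
use futures::stream::{self, Stream};

/// Hypothetical provider that just echoes the last message back.
/// A real implementation would call its backend's API instead.
pub struct EchoProvider;

#[async_trait]
impl Provider for EchoProvider {
    fn name(&self) -> &str {
        "echo"
    }

    // supports_streaming / supports_streaming_tool_events keep their defaults.

    async fn chat(
        &self,
        messages: Vec<Message>,
        _tools: Vec<ToolSchema>,
        _options: ChatOptions,
    ) -> Pin<Box<dyn Stream<Item = Result<StreamEvent>> + Send>> {
        // Translate `messages` into the backend's wire format, send the
        // request, and map streamed chunks into StreamEvent values.
        // Here we simply echo the last message as a single event.
        let text = messages
            .last()
            .map(|m| m.content.clone()) // assumes Message has a `content: String` field
            .unwrap_or_default();
        Box::pin(stream::iter(vec![Ok(StreamEvent::Text(text))])) // assumed variant name
    }
}
```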
Implementation pattern:
1. Create `crates/zeroclaw-providers/src/myprovider.rs`.
2. Implement `Provider` — translate `Vec<Message>` to the wire format, stream the response, and emit `StreamEvent` values.
3. Register it in `lib.rs`:

   ```rust
   factory.register("myprovider", |cfg| MyProvider::new(cfg).boxed());
   ```

4. Gate it behind a feature flag in `Cargo.toml` if the provider pulls heavy deps (a sketch follows below).
5. Add `kind = "myprovider"` to the `[providers.models.<name>]` parser in the config schema.

See `anthropic.rs` as a reference for a provider with a fully custom wire format. See `compatible.rs` for the SSE-streaming OpenAI-compat pattern.
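For the feature-flag step, a minimal sketch of what the gate in `crates/zeroclaw-providers/Cargo.toml` could look like — the feature and dependency names are placeholders, and the crate's actual feature layout may differ:

```toml
[features]
# Off by default so the heavy SDK isn't built for everyone.
myprovider = ["dep:heavy-backend-sdk"]

[dependencies]
heavy-backend-sdk = { version = "1", optional = true }
```

The `factory.register` call in `lib.rs` would then sit behind `#[cfg(feature = "myprovider")]`.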
## Troubleshooting

- Check that the API key has the right prefix for the provider you think you're talking to (`sk-`, `gsk_`, `sk-ant-`).
- Make sure `base_url` includes the scheme (`http://` / `https://`) and the `/v1` path if the endpoint expects it.
- List what the endpoint actually serves:

  ```bash
  curl -sS "$BASE_URL/models" -H "Authorization: Bearer $API_KEY" | jq
  ```

- If the model id doesn't appear in `/models`, send a direct chat request and read the error — most endpoints return the expected model family in the error body (see the example request below).
- Connectivity: `curl -I "$BASE_URL"` — does it respond at all?
- If the config fails to parse, check the entry against the `[providers.*]` schema.
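For that direct chat request, any OpenAI-compatible endpoint accepts the standard chat-completions payload; the model id and message below are placeholders:

```bash
curl -sS "$BASE_URL/chat/completions" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model-id", "messages": [{"role": "user", "content": "hello"}]}'
```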