🔌 Providers & Model Configuration

Back to README

Providers

> [!NOTE]
> Voice transcription can use a configured multimodal model via `voice.model_name`. Groq Whisper remains available as a fallback when no voice model is configured.

| Provider | Purpose | Get API Key |
|---|---|---|
| `gemini` | LLM (Gemini direct) | aistudio.google.com |
| `zhipu` | LLM (Zhipu direct) | bigmodel.cn |
| `zai-coding` | LLM (Z.AI Coding Plan) | z.ai |
| `volcengine` | LLM (Volcengine direct) | volcengine.com |
| `openrouter` | LLM (recommended, access to all models) | openrouter.ai |
| `anthropic` | LLM (Claude direct) | console.anthropic.com |
| `openai` | LLM (GPT direct) | platform.openai.com |
| `venice` | LLM (Venice AI direct) | venice.ai |
| `deepseek` | LLM (DeepSeek direct) | platform.deepseek.com |
| `qwen` | LLM (Qwen direct) | dashscope.console.aliyun.com |
| `groq` | LLM + voice transcription (Whisper) | console.groq.com |
| `cerebras` | LLM (Cerebras direct) | cerebras.ai |
| `vivgrid` | LLM (Vivgrid direct) | vivgrid.com |
| `nvidia` | LLM (NVIDIA NIM) | build.nvidia.com |
| `moonshot` | LLM (Kimi/Moonshot direct) | platform.moonshot.cn |
| `minimax` | LLM (MiniMax direct) | platform.minimaxi.com |
| `avian` | LLM (Avian direct) | avian.io |
| `mistral` | LLM (Mistral direct) | console.mistral.ai |
| `longcat` | LLM (LongCat direct) | longcat.ai |
| `modelscope` | LLM (ModelScope direct) | modelscope.cn |
| `mimo` | LLM (Xiaomi MiMo direct) | platform.xiaomimimo.com |

Model Configuration (model_list)

What's New? PicoClaw now prefers explicit provider + native model configuration (for example, `"provider": "zhipu", "model": "glm-4.7"`). The legacy single-field `provider/model` form remains supported for compatibility when `provider` is omitted.
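For instance, here is a minimal sketch of the two equivalent forms side by side (the `model_name` labels and API key are placeholders):

```json
{
  "model_list": [
    {
      "model_name": "glm-explicit",
      "provider": "zhipu",
      "model": "glm-4.7",
      "api_keys": ["your-zhipu-key"]
    },
    {
      "model_name": "glm-legacy",
      "model": "zhipu/glm-4.7",
      "api_keys": ["your-zhipu-key"]
    }
  ]
}
```

Both entries resolve to the `zhipu` provider and send `glm-4.7` upstream; see Provider / Model Resolution below for the exact rules.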

For agent dispatch and light-model routing examples, see the Routing Guide.

This design also enables multi-agent support with flexible provider selection:

- Different agents, different providers: each agent can use its own LLM provider
- Model fallbacks: configure primary and fallback models for resilience
- Load balancing: distribute requests across multiple endpoints
- Centralized configuration: manage all providers in one place

📋 All Supported Vendors

| Vendor | `provider` Value | Default API Base | Protocol | API Key |
|---|---|---|---|---|
| OpenAI | `openai` | https://api.openai.com/v1 | OpenAI | Get Key |
| Venice AI | `venice` | https://api.venice.ai/api/v1 | OpenAI | Get Key |
| Anthropic | `anthropic` | https://api.anthropic.com/v1 | Anthropic | Get Key |
| Zhipu AI (智谱, GLM) | `zhipu` | https://open.bigmodel.cn/api/paas/v4 | OpenAI | Get Key |
| Z.AI Coding Plan | `openai` | https://api.z.ai/api/coding/paas/v4 | OpenAI | Get Key |
| DeepSeek | `deepseek` | https://api.deepseek.com/v1 | OpenAI | Get Key |
| Google Gemini | `gemini` | https://generativelanguage.googleapis.com/v1beta | Gemini | Get Key |
| Groq | `groq` | https://api.groq.com/openai/v1 | OpenAI | Get Key |
| Moonshot | `moonshot` | https://api.moonshot.cn/v1 | OpenAI | Get Key |
| Qwen (通义千问) | `qwen` | https://dashscope.aliyuncs.com/compatible-mode/v1 | OpenAI | Get Key |
| NVIDIA | `nvidia` | https://integrate.api.nvidia.com/v1 | OpenAI | Get Key |
| Ollama | `ollama` | http://localhost:11434/v1 | OpenAI | Local (no key needed) |
| LM Studio | `lmstudio` | http://localhost:1234/v1 | OpenAI | Optional (local default: no key) |
| OpenRouter | `openrouter` | https://openrouter.ai/api/v1 | OpenAI | Get Key |
| LiteLLM Proxy | `litellm` | http://localhost:4000/v1 | OpenAI | Your LiteLLM proxy key |
| vLLM | `vllm` | http://localhost:8000/v1 | OpenAI | Local |
| Cerebras | `cerebras` | https://api.cerebras.ai/v1 | OpenAI | Get Key |
| VolcEngine (Doubao) | `volcengine` | https://ark.cn-beijing.volces.com/api/v3 | OpenAI | Get Key |
| Shengsuanyun (神算云) | `shengsuanyun` | https://router.shengsuanyun.com/api/v1 | OpenAI | - |
| BytePlus | `byteplus` | https://ark.ap-southeast.bytepluses.com/api/v3 | OpenAI | Get Key |
| Vivgrid | `vivgrid` | https://api.vivgrid.com/v1 | OpenAI | Get Key |
| LongCat | `longcat` | https://api.longcat.chat/openai | OpenAI | Get Key |
| ModelScope (魔搭) | `modelscope` | https://api-inference.modelscope.cn/v1 | OpenAI | Get Token |
| Xiaomi MiMo | `mimo` | https://api.xiaomimimo.com/v1 | OpenAI | Get Key |
| Azure OpenAI | `azure` | https://{resource}.openai.azure.com | Azure | Get Key |
| Antigravity | `antigravity` | Google Cloud | Custom | OAuth only |
| GitHub Copilot | `github-copilot` | localhost:4321 | gRPC | - |

Basic Configuration

```json
{
  "model_list": [
    {
      "model_name": "ark-code-latest",
      "provider": "volcengine",
      "model": "ark-code-latest",
      "api_keys": ["sk-your-api-key"]
    },
    {
      "model_name": "gpt-5.4",
      "provider": "openai",
      "model": "gpt-5.4",
      "api_keys": ["sk-your-openai-key"]
    },
    {
      "model_name": "claude-sonnet-4.6",
      "provider": "anthropic",
      "model": "claude-sonnet-4.6",
      "api_keys": ["sk-ant-your-key"]
    },
    {
      "model_name": "glm-4.7",
      "provider": "zhipu",
      "model": "glm-4.7",
      "api_keys": ["your-zhipu-key"]
    }
  ],
  "agents": {
    "defaults": {
      "model_name": "gpt-5.4"
    }
  }
}
```

model_list Entry Fields

| Field | Type | Required | Description |
|---|---|---|---|
| `model_name` | string | Yes | Unique name used to reference this model in agent config |
| `provider` | string | No | Preferred provider identifier. When present, PicoClaw sends `model` unchanged to that provider |
| `model` | string | Yes | Native model ID when `provider` is set. If `provider` is omitted, the legacy `provider/model` form is still supported |
| `api_keys` | string[] | Yes* | API key(s) for authentication. Multiple keys enable per-request rotation. *Not required for local providers (Ollama, LM Studio, vLLM) |
| `api_base` | string | No | Override the default API endpoint URL |
| `proxy` | string | No | HTTP proxy URL for this model entry |
| `user_agent` | string | No | Custom User-Agent header sent with API requests (supported by OpenAI-compatible, Gemini, Anthropic, and Azure providers) |
| `request_timeout` | int | No | Request timeout in seconds (default varies by provider) |
| `max_tokens_field` | string | No | Override the max-tokens field name in the request body (e.g., `max_completion_tokens` for o1 models) |
| `thinking_level` | string | No | Extended thinking level: `off`, `low`, `medium`, `high`, `xhigh`, or `adaptive` |
| `extra_body` | object | No | Additional fields to inject into every request body |
| `custom_headers` | object | No | Additional HTTP headers to inject into every request (e.g., `{"X-Source":"coding-plan"}`). If a key matches a built-in header (e.g., `Authorization`, `User-Agent`, `Content-Type`, `Accept`), the custom value overrides the built-in one |
| `rpm` | int | No | Per-minute request rate limit |
| `fallbacks` | string[] | No | Fallback model names for automatic failover |
| `enabled` | bool | No | Whether this model entry is active (default: `true`) |
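As an illustrative sketch only (every value below is a placeholder), a single entry combining several of the optional fields above might look like this:

```json
{
  "model_name": "tuned-gpt",
  "provider": "openai",
  "model": "gpt-5.4",
  "api_base": "https://my-gateway.example.com/v1",
  "api_keys": ["sk-key-a", "sk-key-b"],
  "request_timeout": 120,
  "thinking_level": "medium",
  "custom_headers": {"X-Source": "coding-plan"},
  "rpm": 60,
  "fallbacks": ["claude-sonnet-4.6"],
  "enabled": true
}
```

The two `api_keys` rotate per request, and `fallbacks` assumes another `model_list` entry named `claude-sonnet-4.6` exists.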

Provider / Model Resolution

PicoClaw resolves provider and the runtime model ID using these rules:

- If `provider` is set, `model` is used as-is.
- If `provider` is omitted, PicoClaw treats the first `/` segment in `model` as the provider and everything after that first `/` as the runtime model ID.

Examples:

| Config | Resolved Provider | Model Sent Upstream |
|---|---|---|
| `"provider": "openai", "model": "gpt-5.4"` | `openai` | `gpt-5.4` |
| `"model": "openai/gpt-5.4"` | `openai` | `gpt-5.4` |
| `"provider": "openrouter", "model": "openai/gpt-5.4"` | `openrouter` | `openai/gpt-5.4` |
| `"model": "openrouter/openai/gpt-5.4"` | `openrouter` | `openai/gpt-5.4` |

Voice Transcription

You can configure a dedicated model for audio transcription with voice.model_name. This lets you reuse existing multimodal providers that support audio input instead of relying only on Groq.

If voice.model_name is not configured, PicoClaw will continue to fall back to Groq transcription when a Groq API key is available.

```json
{
  "model_list": [
    {
      "model_name": "voice-gemini",
      "provider": "gemini",
      "model": "gemini-2.5-flash",
      "api_keys": ["your-gemini-key"]
    }
  ],
  "voice": {
    "model_name": "voice-gemini",
    "echo_transcription": false
  },
  "providers": {
    "groq": {
      "api_key": "gsk_xxx"
    }
  }
}
```

Vendor-Specific Examples

OpenAI

```json
{
  "model_name": "gpt-5.4",
  "provider": "openai",
  "model": "gpt-5.4",
  "api_keys": ["sk-..."]
}
```

VolcEngine (Doubao)

```json
{
  "model_name": "ark-code-latest",
  "provider": "volcengine",
  "model": "ark-code-latest",
  "api_keys": ["sk-..."]
}
```

Zhipu AI (智谱, GLM)

```json
{
  "model_name": "glm-4.7",
  "provider": "zhipu",
  "model": "glm-4.7",
  "api_keys": ["your-key"]
}
```

Z.AI Coding Plan (GLM)

Z.AI and Zhipu AI (智谱) are two brands of the same provider. For the Z.AI Coding Plan, use `"provider": "openai"` with the API base shown below, rather than the `zhipu` config:

```json
{
  "model_name": "glm-4.7",
  "provider": "openai",
  "model": "glm-4.7",
  "api_keys": ["your-z.ai-key"],
  "api_base": "https://api.z.ai/api/coding/paas/v4"
}
```

DeepSeek

```json
{
  "model_name": "deepseek-chat",
  "provider": "deepseek",
  "model": "deepseek-chat",
  "api_keys": ["sk-..."]
}
```

Anthropic (with API key)

```json
{
  "model_name": "claude-sonnet-4.6",
  "provider": "anthropic",
  "model": "claude-sonnet-4.6",
  "api_keys": ["sk-ant-your-key"]
}
```

You can also run `picoclaw auth login --provider anthropic` and paste your API token when prompted.

Anthropic Messages API (native format)

For direct Anthropic API access or custom endpoints that only support Anthropic's native message format:

```json
{
  "model_name": "claude-opus-4-6",
  "provider": "anthropic-messages",
  "model": "claude-opus-4-6",
  "api_keys": ["sk-ant-your-key"],
  "api_base": "https://api.anthropic.com"
}
```

Use the `anthropic-messages` protocol when:

- Using third-party proxies that only support Anthropic's native `/v1/messages` endpoint (not the OpenAI-compatible `/v1/chat/completions`)
- Connecting to services (such as MiniMax or Synthetic) that require Anthropic's native message format
- The existing `anthropic` protocol returns 404 errors (indicating the endpoint doesn't support the OpenAI-compatible format)

Note: The `anthropic` protocol uses the OpenAI-compatible format (`/v1/chat/completions`), while `anthropic-messages` uses Anthropic's native format (`/v1/messages`). Choose based on the format your endpoint supports.

Ollama (local)

```json
{
  "model_name": "llama3",
  "provider": "ollama",
  "model": "llama3"
}
```

LM Studio (local)

```json
{
  "model_name": "lmstudio-local",
  "provider": "lmstudio",
  "model": "openai/gpt-oss-20b"
}
```

`api_base` defaults to `http://localhost:1234/v1`. An API key is only needed if your LM Studio server enables authentication.
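If your server does enable authentication or runs on another host, a hedged sketch (host, port, and key are placeholders):

```json
{
  "model_name": "lmstudio-remote",
  "provider": "lmstudio",
  "model": "openai/gpt-oss-20b",
  "api_base": "http://192.168.1.10:1234/v1",
  "api_keys": ["your-lmstudio-key"]
}
```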

With an explicit `provider`, PicoClaw sends `openai/gpt-oss-20b` unchanged to the LM Studio server. The legacy compatibility form `"model": "lmstudio/openai/gpt-oss-20b"` still resolves to the same upstream model ID when `provider` is omitted.

Custom Proxy/API

```json
{
  "model_name": "my-custom-model",
  "provider": "openai",
  "model": "custom-model",
  "api_base": "https://my-proxy.com/v1",
  "api_keys": ["sk-..."],
  "user_agent": "MyApp/1.0",
  "request_timeout": 300
}
```

LiteLLM Proxy

```json
{
  "model_name": "lite-gpt4",
  "provider": "litellm",
  "model": "lite-gpt4",
  "api_base": "http://localhost:4000/v1",
  "api_keys": ["sk-..."]
}
```

With an explicit `provider`, PicoClaw sends `model` unchanged. That means `"provider": "litellm", "model": "lite-gpt4"` sends `lite-gpt4`, while `"provider": "litellm", "model": "openai/gpt-4o"` sends `openai/gpt-4o`. The legacy compatibility forms `litellm/lite-gpt4` and `litellm/openai/gpt-4o` still resolve the same way when `provider` is omitted.
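As a sketch of the pass-through form (the proxy key is a placeholder):

```json
{
  "model_name": "router-gpt4o",
  "provider": "litellm",
  "model": "openai/gpt-4o",
  "api_base": "http://localhost:4000/v1",
  "api_keys": ["sk-litellm-proxy-key"]
}
```

Here PicoClaw sends `openai/gpt-4o` unchanged, and the LiteLLM proxy is responsible for routing it onward.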

Z.AI Coding Plan

If the standard Zhipu endpoint (https://open.bigmodel.cn/api/paas/v4) returns 429 (code 1113: insufficient balance), try using the Z.AI Coding Plan endpoint instead:

```json
{
  "model_name": "glm-4.7",
  "provider": "openai",
  "model": "glm-4.7",
  "api_keys": ["your-zhipu-api-key"],
  "api_base": "https://api.z.ai/api/coding/paas/v4"
}
```

Note: The Z.AI Coding Plan endpoint and standard Zhipu endpoint use the same API key format but have separate billing. If you encounter 429 errors with the standard Zhipu endpoint, the Z.AI Coding Plan endpoint may have available balance.

Load Balancing

Configure multiple endpoints for the same model name, and PicoClaw will automatically round-robin between them:

```json
{
  "model_list": [
    {
      "model_name": "gpt-5.4",
      "provider": "openai",
      "model": "gpt-5.4",
      "api_base": "https://api1.example.com/v1",
      "api_keys": ["sk-key1"]
    },
    {
      "model_name": "gpt-5.4",
      "provider": "openai",
      "model": "gpt-5.4",
      "api_base": "https://api2.example.com/v1",
      "api_keys": ["sk-key2"]
    }
  ]
}
```

Automatic Model Failover (Cascade)

PicoClaw already supports automatic failover when you configure primary + fallbacks in the agent model settings. The runtime fallback chain retries the next candidate for retriable failures such as HTTP 429, quota/rate-limit errors, and timeout errors. It also applies cooldown tracking per candidate to avoid immediately retrying a recently failed target.

```json
{
  "model_list": [
    {
      "model_name": "qwen-main",
      "provider": "openai",
      "model": "qwen3.5:cloud",
      "api_base": "https://api.example.com/v1",
      "api_keys": ["sk-main"]
    },
    {
      "model_name": "deepseek-backup",
      "provider": "deepseek",
      "model": "deepseek-chat",
      "api_keys": ["sk-backup-1"]
    },
    {
      "model_name": "gemini-backup",
      "provider": "gemini",
      "model": "gemini-2.5-flash",
      "api_keys": ["sk-backup-2"]
    }
  ],
  "agents": {
    "defaults": {
      "model": {
        "primary": "qwen-main",
        "fallbacks": ["deepseek-backup", "gemini-backup"]
      }
    }
  }
}
```

If you use key-level failover for the same model, PicoClaw can chain through additional key-backed candidates before moving to cross-model backups.
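For example, a hedged sketch combining both levels, using the entry-level `fallbacks` field from the table above (all keys and names are placeholders):

```json
{
  "model_name": "qwen-main",
  "provider": "openai",
  "model": "qwen3.5:cloud",
  "api_base": "https://api.example.com/v1",
  "api_keys": ["sk-main-1", "sk-main-2"],
  "fallbacks": ["deepseek-backup"]
}
```

With this shape, PicoClaw can rotate through `sk-main-1` and `sk-main-2` for `qwen-main` before falling back to the `deepseek-backup` entry.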

Migration from Legacy providers Config

The old providers configuration is deprecated and has been removed in V2. Existing V0/V1 configs are auto-migrated.

Old Config (deprecated):

```json
{
  "providers": {
    "zhipu": {
      "api_key": "your-key",
      "api_base": "https://open.bigmodel.cn/api/paas/v4"
    }
  },
  "agents": {
    "defaults": {
      "provider": "zhipu",
      "model": "glm-4.7"
    }
  }
}
```

New Config (recommended):

```json
{
  "version": 3,
  "model_list": [
    {
      "model_name": "glm-4.7",
      "provider": "zhipu",
      "model": "glm-4.7",
      "api_keys": ["your-key"]
    }
  ],
  "agents": {
    "defaults": {
      "model_name": "glm-4.7"
    }
  }
}
```

For a detailed migration guide, see migration/model-list-migration.md.

Provider Architecture

PicoClaw routes providers by protocol family:

- OpenAI-compatible protocol: OpenRouter, OpenAI-compatible gateways, Groq, Zhipu, and vLLM-style endpoints.
- Gemini native protocol: Google Gemini via the native `models/*:generateContent` and `models/*:streamGenerateContent` endpoints.
- Anthropic protocol: Claude-native API behavior.
- Codex/OAuth path: OpenAI OAuth/token authentication route.

This keeps the runtime lightweight while making new OpenAI-compatible backends mostly a config operation (api_base + api_keys).
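In practice, that means onboarding an unlisted OpenAI-compatible backend is usually just one more `model_list` entry; a sketch with placeholder values:

```json
{
  "model_name": "my-gateway-model",
  "provider": "openai",
  "model": "some-hosted-model",
  "api_base": "https://gateway.example.com/v1",
  "api_keys": ["sk-gateway-key"]
}
```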

<details> <summary><b>Zhipu</b></summary>

1. Get API key and base URL

2. Configure

```json
{
  "agents": {
    "defaults": {
      "workspace": "~/.picoclaw/workspace",
      "model_name": "glm-4.7",
      "max_tokens": 8192,
      "temperature": 0.7,
      "max_tool_iterations": 20
    }
  },
  "providers": {
    "zhipu": {
      "api_key": "Your API Key",
      "api_base": "https://open.bigmodel.cn/api/paas/v4"
    }
  }
}
```

3. Run

```bash
picoclaw agent -m "Hello"
```

</details>

<details> <summary><b>Full config example</b></summary>

```json
{
  "agents": {
    "defaults": {
      "model_name": "claude-opus-4-5"
    }
  },
  "session": {
    "dm_scope": "per-channel-peer"
  },
  "providers": {
    "openrouter": {
      "api_key": "sk-or-v1-xxx"
    },
    "groq": {
      "api_key": "gsk_xxx"
    }
  },
  "voice": {
    "model_name": "voice-gemini",
    "echo_transcription": false
  },
  "channel_list": {
    "telegram": {
      "enabled": true,
      "type": "telegram",
      "token": "123456:ABC...",
      "allow_from": ["123456789"]
    },
    "discord": {
      "enabled": true,
      "type": "discord",
      "token": "",
      "allow_from": [""]
    },
    "whatsapp": {
      "enabled": false,
      "type": "whatsapp",
      "bridge_url": "ws://localhost:3001",
      "use_native": false,
      "session_store_path": "",
      "allow_from": []
    },
    "feishu": {
      "enabled": false,
      "type": "feishu",
      "app_id": "cli_xxx",
      "app_secret": "xxx",
      "encrypt_key": "",
      "verification_token": "",
      "allow_from": []
    },
    "qq": {
      "enabled": false,
      "type": "qq",
      "app_id": "",
      "app_secret": "",
      "allow_from": []
    }
  },
  "tools": {
    "web": {
      "brave": {
        "enabled": false,
        "api_key": "BSA...",
        "max_results": 5
      },
      "duckduckgo": {
        "enabled": true,
        "max_results": 5
      },
      "perplexity": {
        "enabled": false,
        "api_key": "",
        "max_results": 5
      },
      "searxng": {
        "enabled": false,
        "base_url": "http://localhost:8888",
        "max_results": 5
      }
    },
    "cron": {
      "exec_timeout_minutes": 5
    }
  },
  "heartbeat": {
    "enabled": true,
    "interval": 30
  }
}
```
</details>

๐Ÿ“ API Key Comparison

| Service | Pricing | Use Case |
|---|---|---|
| OpenRouter | Free: 200K tokens/month | Multiple models (Claude, GPT-4, etc.) |
| Volcengine Coding Plan | ¥9.9/first month | Best for Chinese users; multiple SOTA models (Doubao, DeepSeek, etc.) |
| Zhipu | Free: 200K tokens/month | Suitable for Chinese users |
| Brave Search | $5/1000 queries | Web search functionality |
| SearXNG | Free (self-hosted) | Privacy-focused metasearch (70+ engines) |
| Groq | Free tier available | Fast inference (Llama, Mixtral) |
| Cerebras | Free tier available | Fast inference (Llama, Qwen, etc.) |
| LongCat | Free: up to 5M tokens/day | Fast inference |
| ModelScope | Free: 2000 requests/day | Inference (Qwen, GLM, DeepSeek, etc.) |
