Zhipu GLM

Zhipu AI supports text chat, image understanding, speech-to-text (ASR), and embedding. A single zhipu_ai_api_key enables all capabilities.

<Tip> All capabilities below can be configured in one place via the "Model Management" page in the Web Console, with no need to manually edit the configuration file. </Tip>

Text Chat

```json
{
  "model": "glm-5.1",
  "zhipu_ai_api_key": "YOUR_API_KEY"
}
```

| Parameter | Description |
| --- | --- |
| `model` | Can be glm-5.1, glm-5-turbo, glm-5, glm-4.7, glm-4-plus, glm-4-flash, glm-4-air, etc. See model codes |
| `zhipu_ai_api_key` | Create one in the Zhipu AI Console |
| `zhipu_ai_api_base` | Optional, defaults to https://open.bigmodel.cn/api/paas/v4 |
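
As a rough illustration of how the optional `zhipu_ai_api_base` falls back to its default, the sketch below resolves the three parameters from a config dictionary. The function name `resolve_zhipu_settings` and the returned keys are hypothetical and are not part of the project's code; only the parameter names and the default URL come from the table above.

```python
# Minimal sketch (hypothetical helper, not the project's actual loader).
DEFAULT_ZHIPU_API_BASE = "https://open.bigmodel.cn/api/paas/v4"


def resolve_zhipu_settings(config: dict) -> dict:
    """Return the effective chat settings for a Zhipu configuration."""
    api_key = config.get("zhipu_ai_api_key")
    if not api_key or api_key == "YOUR_API_KEY":
        # The placeholder from the example must be replaced with a real key.
        raise ValueError("zhipu_ai_api_key is missing or still a placeholder")

    model = config.get("model")
    if not model:
        raise ValueError("model is required, e.g. glm-5.1")

    # zhipu_ai_api_base is optional; use the documented default when absent.
    api_base = config.get("zhipu_ai_api_base") or DEFAULT_ZHIPU_API_BASE
    return {"model": model, "api_key": api_key, "api_base": api_base.rstrip("/")}
```

Calling it with only `model` and `zhipu_ai_api_key` set yields the default base URL; supplying `zhipu_ai_api_base` overrides it.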

Image Understanding

Zhipu's chat models (glm-5.1, glm-5-turbo, etc.) do not support vision; vision calls are uniformly routed to glm-5v-turbo. Once zhipu_ai_api_key is configured, the Agent's Vision tool automatically uses this model, with no need to specify it explicitly in the configuration file.
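
The routing rule described above can be pictured with a small sketch. The function `select_zhipu_model` and its `has_image` flag are hypothetical names used only for illustration; the Agent's real Vision tool may decide this differently.

```python
# Minimal sketch of the routing rule (hypothetical helper, for illustration only).
ZHIPU_VISION_MODEL = "glm-5v-turbo"


def select_zhipu_model(configured_model: str, has_image: bool) -> str:
    """Pick the model for a request: vision calls always go to glm-5v-turbo."""
    if has_image:
        # Chat models such as glm-5.1 do not accept images, so the configured
        # model is ignored for vision calls.
        return ZHIPU_VISION_MODEL
    return configured_model
```

With this rule, a text-only request keeps the configured chat model, while any request carrying an image is sent to `glm-5v-turbo` regardless of the `model` setting.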

Speech-to-Text (ASR)

```json
{
  "voice_to_text": "zhipu",
  "voice_to_text_model": "glm-asr-2512"
}
```

| Parameter | Description |
| --- | --- |
| `voice_to_text` | Set to zhipu to enable Zhipu ASR |
| `voice_to_text_model` | Optional, defaults to glm-asr-2512 |

Credentials are automatically reused from zhipu_ai_api_key. Audio files should be smaller than 25 MB; oversized files may be rejected by the server.
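
Because oversized audio may be rejected by the server, a client-side size check before upload can save a round trip. The sketch below is one way to do that; the helper name `is_audio_size_ok` is hypothetical, and treating 25 MB as 25 × 1024 × 1024 bytes is an assumption, since the exact byte limit is not stated here.

```python
import os

# Assumption: "25 MB" is interpreted as 25 * 1024 * 1024 bytes.
MAX_AUDIO_BYTES = 25 * 1024 * 1024


def is_audio_size_ok(path: str, limit: int = MAX_AUDIO_BYTES) -> bool:
    """Return True if the file at `path` is smaller than the size limit."""
    return os.path.getsize(path) < limit
```

A caller could skip the ASR request, or split or compress the audio first, whenever this returns `False`.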

Embedding

```json
{
  "embedding_provider": "zhipu",
  "embedding_model": "embedding-3"
}
```

Available models: embedding-3, embedding-2. After changing the embedding model, run /memory rebuild-index to rebuild the index.