docs/provider-config/zai.mdx
Z AI (formerly Zhipu AI) offers the GLM model series, featuring hybrid reasoning capabilities and an agent-native design. These models excel at unified reasoning, coding, and intelligent-agent applications while remaining open source under the MIT license.
Website: https://z.ai/model-api (International) | https://open.bigmodel.cn/ (China)
Z AI serves models from region-specific endpoints; both regions share the same model lineup:
- **glm-5.1** (Default): Latest flagship model with a 200K context window, 128K maximum output, and prompt caching ($1.40/$4.40 per 1M input/output tokens; cached input $0.26 per 1M tokens)
- **glm-5**: Flagship model with a 200K context window and prompt caching ($1.00/$3.20 per 1M tokens)
- **glm-4.7**: High-performance model with 200K context and prompt caching ($0.60/$2.20 per 1M tokens)
- **glm-4.6**: Advanced model with 200K context and prompt caching ($0.60/$2.20 per 1M tokens)
- **glm-4.5**: Flagship model with 131K context, prompt caching, and hybrid reasoning
- **glm-4.5-air**: Compact, cost-effective model with 128K context and prompt caching
Note: Pricing differs between International and China regions. China region pricing is approximately 50% lower.
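To see how the per-token rates above translate into real costs, here is a minimal sketch of a cost estimator using the listed International-region glm-5.1 rates. The rate table and helper name are illustrative assumptions, not part of any official SDK:

```python
# Estimate request cost from the listed International glm-5.1 rates
# (USD per 1M tokens): input $1.40, cached input $0.26, output $4.40.
# These numbers are copied from the model list above; verify against
# Z AI's current pricing page before relying on them.
GLM_5_1_RATES = {"input": 1.40, "cached_input": 0.26, "output": 4.40}

def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0,
                  rates: dict = GLM_5_1_RATES) -> float:
    """Return estimated USD cost; cached input is billed at the cached rate."""
    uncached = input_tokens - cached_input_tokens
    cost = (uncached * rates["input"]
            + cached_input_tokens * rates["cached_input"]
            + output_tokens * rates["output"]) / 1_000_000
    return round(cost, 6)

# A 100K-token prompt (half served from cache) with a 4K-token answer:
print(estimate_cost(100_000, 4_000, cached_input_tokens=50_000))  # → 0.1006
```

The cached-input discount is why prompt caching matters for agentic workflows: repeated system prompts and file context are billed at roughly a fifth of the normal input rate.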
Z AI offers subscription plans designed specifically for coding applications. These plans provide cost-effective access to GLM-4.5 models through a prompt-quota structure rather than traditional per-token API billing.
GLM Coding Lite - $3/month
GLM Coding Pro - $15/month
Both plans offer promotional pricing for the first month: Lite drops from $6 to $3, Pro drops from $30 to $15.
To use the GLM Coding Plans with Cline:
1. **Subscribe**: Go to https://z.ai/subscribe and choose your plan.
2. **Create API Key**: After subscribing, log into your Z AI dashboard and create an API key for your coding plan.
3. **Configure in Cline**: Open Cline settings, select "Z AI" as your provider, and paste your API key into the "Z AI API Key" field.
The setup connects your subscription directly to Cline, giving you access to GLM-4.5's tool-calling capabilities optimized for coding workflows.
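For troubleshooting, it can help to verify the key outside Cline. Below is a minimal sketch of a direct chat-completion call; the `/chat/completions` path, payload shape, and `ZAI_API_KEY` variable are assumptions based on Z AI's OpenAI-compatible API, so check the official API reference before relying on them:

```python
import json
import os
import urllib.request

BASE_URL = "https://api.z.ai/api/paas/v4"  # International region

def build_request(prompt: str, model: str = "glm-4.5") -> urllib.request.Request:
    """Build an (assumed) OpenAI-compatible chat-completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('ZAI_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )

req = build_request("Write a Python hello-world script.")
# Only send when a key is actually configured:
if os.environ.get("ZAI_API_KEY"):
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

If this call succeeds, Cline should work with the same key and base URL.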
Z AI's GLM-4.5 series introduces capabilities that distinguish it from conventional language models:
GLM-4.5 operates in two distinct modes: a thinking mode for complex reasoning and tool use, and a non-thinking mode for instant, lower-latency responses.
This dual-mode architecture represents an "agent-native" design philosophy that adapts processing intensity based on query complexity.
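The mode can reportedly be selected per request. The sketch below shows the idea; the `thinking` request field and its values are assumptions modeled on Z AI's API reference, so verify the exact parameter name in the official docs:

```python
# Sketch: toggling hybrid reasoning per request. The "thinking" field
# is an assumed parameter name; confirm it against Z AI's API docs.
def chat_payload(prompt: str, think: bool, model: str = "glm-4.5") -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # "enabled"  => thinking mode (deep reasoning, tool use)
        # "disabled" => instant responses
        "thinking": {"type": "enabled" if think else "disabled"},
    }

print(chat_payload("Refactor this module", think=True)["thinking"])
# → {'type': 'enabled'}
```

Disabling thinking trades reasoning depth for latency, which suits quick edits and completions.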
GLM-4.5 achieves a comprehensive score of 63.2 across 12 benchmarks spanning agentic tasks, reasoning, and coding challenges, securing 3rd place among all proprietary and open-source models. GLM-4.5-Air maintains competitive performance with a score of 59.8 while delivering superior efficiency.
The Mixture-of-Experts (MoE) architecture optimizes performance while maintaining computational efficiency.
The 128,000-token context window enables comprehensive understanding of lengthy documents and codebases, with real-world testing confirming effective processing of nearly 2,000-line codebases while maintaining remarkable performance.
Released under MIT license, GLM-4.5 provides researchers and developers with access to state-of-the-art capabilities without proprietary restrictions, including base models, hybrid reasoning versions, and optimized FP8 variants.
- International: https://api.z.ai/api/paas/v4
- China: https://open.bigmodel.cn/api/paas/v4

The region setting determines both the API endpoint and the available models, with automatic filtering to ensure compatibility with your selected region.
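The region-to-endpoint mapping can be sketched as a small lookup, useful if you script against the API directly (the function name is illustrative):

```python
# Map a Z AI region setting to its API base URL (from the list above).
ZAI_BASE_URLS = {
    "international": "https://api.z.ai/api/paas/v4",
    "china": "https://open.bigmodel.cn/api/paas/v4",
}

def base_url(region: str) -> str:
    """Return the API base URL for a region, case-insensitively."""
    try:
        return ZAI_BASE_URLS[region.lower()]
    except KeyError:
        raise ValueError(f"Unknown Z AI region: {region!r}") from None

print(base_url("International"))  # → https://api.z.ai/api/paas/v4
```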
GLM-4.5's unified architecture makes it particularly suitable for complex intelligent agent applications requiring integrated reasoning, coding, and tool utilization capabilities.
Performance evaluation encompasses agentic tasks, reasoning, and coding benchmarks.
This comprehensive assessment demonstrates versatility across diverse AI applications.
Models support integration through multiple inference frameworks, complete with dedicated model code, tool parser, and reasoning parser implementations.
GLM-4.5 shows competitive performance in agentic coding and reasoning tasks, though Claude Sonnet 4 maintains advantages in coding success rates and autonomous multi-feature application development.
GLM-4.5 ranks competitively in reasoning and agent benchmarks, with GPT-4.5 generally leading in raw task accuracy on professional benchmarks like MMLU and AIME.