docs/provider-config/nebius.mdx
Nebius AI Studio provides inference for a wide range of open-source models including DeepSeek, Qwen, Llama, and others, with competitive pricing and fast/standard speed tiers.
Website: https://studio.nebius.com/
Cline supports the following Nebius models:
- `deepseek-ai/DeepSeek-V3` - General-purpose model ($0.50/$1.50 per 1M tokens)
- `deepseek-ai/DeepSeek-V3-0324-fast` - Fast variant ($2.00/$6.00 per 1M tokens)
- `deepseek-ai/DeepSeek-R1` - Reasoning model ($0.80/$2.40 per 1M tokens)
- `deepseek-ai/DeepSeek-R1-fast` - Fast reasoning variant ($2.00/$6.00 per 1M tokens)
- `deepseek-ai/DeepSeek-R1-0528` - Latest reasoning version (163K context, $0.80/$2.40 per 1M tokens)
- `deepseek-ai/DeepSeek-R1-0528-fast` - Fast variant of the latest reasoning version ($2.00/$6.00 per 1M tokens)
- `Qwen/Qwen3-Coder-480B-A35B-Instruct` - 480B coding model (262K context, $0.40/$1.80 per 1M tokens)
- `Qwen/Qwen3-235B-A22B` - 235B MoE model ($0.20/$0.60 per 1M tokens)
- `Qwen/Qwen3-235B-A22B-Instruct-2507` - Latest instruct version (262K context, $0.20/$0.60 per 1M tokens)
- `Qwen/Qwen3-32B` / `Qwen/Qwen3-32B-fast` - Dense 32B model
- `Qwen/Qwen3-30B-A3B` / `Qwen/Qwen3-30B-A3B-fast` - Compact MoE model
- `Qwen/Qwen3-4B-fast` - Small fast model ($0.08/$0.24 per 1M tokens)
- `Qwen/Qwen2.5-Coder-32B-Instruct-fast` - Coding-optimized ($0.10/$0.30 per 1M tokens)
- `Qwen/Qwen2.5-32B-Instruct-fast` (default) - General-purpose ($0.13/$0.40 per 1M tokens)
- `moonshotai/Kimi-K2-Instruct` - Kimi K2 with prompt caching (131K context, $0.50/$2.40 per 1M tokens)
- `openai/gpt-oss-120b` - OpenAI's 120B open-weight model ($0.15/$0.60 per 1M tokens)
- `openai/gpt-oss-20b` - OpenAI's 20B open-weight model ($0.05/$0.20 per 1M tokens)
- `zai-org/GLM-4.5` / `zai-org/GLM-4.5-Air` - Z AI models with prompt caching
- `meta-llama/Llama-3.3-70B-Instruct-fast` - Fast Llama 3.3 ($0.25/$0.75 per 1M tokens)

Models with the `-fast` suffix offer faster inference at a higher price.
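Prices above are quoted per 1M input/output tokens, so a request's cost is `(input_tokens × input_price + output_tokens × output_price) / 1,000,000`. As a quick sanity check, here is a minimal sketch of that arithmetic (a hypothetical helper, not part of Cline or the Nebius API) using a few of the listed rates:

```python
# Hypothetical cost estimator based on the per-1M-token prices listed above.
# Each entry maps a model ID to (input_price, output_price) in USD per 1M tokens.
PRICES = {
    "Qwen/Qwen2.5-32B-Instruct-fast": (0.13, 0.40),   # default model
    "deepseek-ai/DeepSeek-V3": (0.50, 1.50),
    "Qwen/Qwen3-Coder-480B-A35B-Instruct": (0.40, 1.80),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a single request."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Example: a 10K-token prompt with a 2K-token completion on the default model
# costs (10,000 * 0.13 + 2,000 * 0.40) / 1,000,000 = $0.0021.
cost = estimate_cost("Qwen/Qwen2.5-32B-Instruct-fast", 10_000, 2_000)
```

The same arithmetic explains the `-fast` trade-off: the fast tiers return tokens sooner but multiply the per-token rates, so long completions are where the price difference is most visible.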