docs/running-models-locally/ollama.mdx
## Setting up Ollama

1. Browse the available models at [ollama.com/search](https://ollama.com/search).
2. Select a model and copy its run command:

   ```bash
   ollama run [model-name]
   ```

3. Open your terminal and run the command. For example:

   ```bash
   ollama run llama2
   ```

Your model is now ready to use within Cline.
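To confirm the download finished, you can list the models installed on your machine:

```bash
# Show all locally available models and their sizes
ollama list
```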
## Configuring Cline

Open VS Code and point Cline at your local Ollama server. Set the base URL to `http://localhost:11434/` (the default; usually no need to change).
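To verify the server is reachable at that address, you can request the base URL directly; a running Ollama instance answers with a short status message:

```bash
# Prints "Ollama is running" when the server is up on the default port
curl http://localhost:11434/
```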
## Recommended model

For the best experience with Cline, use Qwen 2.5 Coder 32B. It provides strong coding capabilities and reliable tool use for local development. To download it:
```bash
ollama pull qwen2.5-coder:32b
```
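If the 32B variant is too large for your hardware, the same model is published in smaller sizes. The tags below are examples; check the model's page on ollama.com for the current list:

```bash
# Smaller variants of Qwen 2.5 Coder (verify tag names on ollama.com)
ollama pull qwen2.5-coder:14b
ollama pull qwen2.5-coder:7b
```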
Other capable models include:

- `mistral-small:latest` - good balance of performance and speed
- `codellama:34b-code` - optimized for coding tasks

## Compact prompts

For better performance with local models, enable compact prompts in Cline's settings. This reduces the prompt size by 90% while maintaining core functionality.
Navigate to Cline Settings → Features → Use Compact Prompt and toggle it on.
## Troubleshooting

If Cline can't connect to Ollama:

1. Verify the Ollama server is running. It normally starts with the desktop app after installation; otherwise start it manually with `ollama serve`.
2. Check that the base URL in Cline's settings matches the address Ollama is listening on (`http://localhost:11434/` by default).
3. Make sure the model you selected in Cline has actually been pulled locally.
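You can also exercise the API directly from the terminal to rule out a Cline-side issue. This is a minimal sketch that assumes you have pulled `llama2`; substitute any model you have installed:

```bash
# Send a minimal, non-streaming generation request to the local Ollama API
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Say hello",
  "stream": false
}'
```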
Need more info? Read the Ollama Docs.