docs/model-config/context-windows.mdx
A context window is the maximum amount of text an AI model can process at once. Think of it as the model's "working memory" - it determines how much of your conversation and code the model can consider when generating responses.
<Note>
**Key Point**: Larger context windows allow the model to understand more of your codebase at once, but may increase costs and response times.
</Note>

Context windows come in a range of sizes:

| Size | Tokens | Approximate Words | Use Case |
|---|---|---|---|
| Small | 8K-32K | 6,000-24,000 | Single files, quick fixes |
| Medium | 128K | ~96,000 | Most coding projects |
| Large | 200K | ~150,000 | Complex codebases |
| Extra Large | 400K+ | ~300,000+ | Entire applications |
| Massive | 1M+ | ~750,000+ | Multi-project analysis |
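The word counts above follow the common rule of thumb that one token corresponds to roughly 0.75 English words; the exact ratio depends on the tokenizer and the content. A quick sketch of the conversion:

```typescript
// Rule of thumb: 1 token ≈ 0.75 English words.
// The real ratio varies by tokenizer, language, and content type.
const WORDS_PER_TOKEN = 0.75;

function tokensToWords(tokens: number): number {
  return Math.round(tokens * WORDS_PER_TOKEN);
}

// A 128K-token window holds roughly 96,000 words of text.
```

By the same rule, a 200K window corresponds to about 150,000 words and a 1M window to about 750,000, matching the table above.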
How current models compare:

| Model | Context Window | Effective Window* | Notes |
|---|---|---|---|
| Claude Sonnet 4.5 | 1M tokens | ~500K tokens | Best quality at high context |
| GPT-5 | 400K tokens | ~300K tokens | Three modes affect performance |
| Gemini 2.5 Pro | 1M+ tokens | ~600K tokens | Excellent for documents |
| DeepSeek V3 | 128K tokens | ~100K tokens | Optimal for most tasks |
| Qwen3 Coder | 256K tokens | ~200K tokens | Good balance |
\*Effective window is the range in which the model maintains high output quality.
Several techniques help keep context usage under control:

- `/new` - Creates a new task with clean context
- `@filename.ts` - Includes a specific file only when needed, instead of pasting entire files into the conversation

Cline can also automatically summarize (auto-compact) long conversations as they approach the limit.
Watch for these warning signs that you are running out of context:

| Warning Sign | What It Means | Solution |
|---|---|---|
| "Context window exceeded" | Hard limit reached | Start new task or enable auto-compact |
| Slower responses | Model struggling with context | Reduce included files |
| Repetitive suggestions | Context fragmentation | Summarize and start fresh |
| Missing recent changes | Context overflow | Use checkpoints to track changes |
Leverage Plan/Act mode for better context usage. Example configuration:

- **Plan Mode**: DeepSeek V3 (128K) - lower-cost planning
- **Act Mode**: Claude Sonnet (1M) - maximum context for coding
To estimate token usage, these rough rates per kilobyte of file content apply:

| File Type | Tokens per KB |
|---|---|
| Code | ~250-400 |
| JSON | ~300-500 |
| Markdown | ~200-300 |
| Plain text | ~200-250 |
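Those per-KB rates make back-of-the-envelope token budgeting easy. A sketch using the midpoint of each range (the exact numbers depend on the tokenizer and the specific content):

```typescript
// Midpoint tokens-per-KB rates from the table above; real values
// vary with the tokenizer and the file's actual content.
const TOKENS_PER_KB = {
  code: 325,     // ~250-400
  json: 400,     // ~300-500
  markdown: 250, // ~200-300
  text: 225,     // ~200-250
};

// Estimate how many tokens a file will consume in the context window.
function estimateTokens(sizeKB: number, fileType: keyof typeof TOKENS_PER_KB): number {
  return Math.round(sizeKB * TOKENS_PER_KB[fileType]);
}

// A 40 KB source file costs ~13,000 tokens -- about 10% of a 128K window.
```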
**Q: Why is the effective window smaller than the advertised context window?**

A: Models can lose focus with too much context. The "effective window" is typically 50-70% of the advertised limit.

**Q: Is a bigger context window always better?**

A: Not always. Larger contexts increase cost and can reduce response quality. Match the context to your task size.

**Q: How can I tell how much context I'm using?**

A: Cline shows token usage in the interface. Watch for the context meter approaching limits.

**Q: What happens when I hit the context limit?**

A: Cline will either prompt you to start a new task or, if auto-compact is enabled, summarize the conversation to free up space.
Recommended starting points by use case:

| Use Case | Recommended Context | Model Suggestion |
|---|---|---|
| Quick fixes | 32K-128K | DeepSeek V3 |
| Feature development | 128K-200K | Qwen3 Coder |
| Large refactoring | 400K+ | Claude Sonnet 4.5 |
| Code review | 200K-400K | GPT-5 |
| Documentation | 128K | Any budget model |
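The table above can be folded into a small helper that maps an estimated task size to a recommendation. The thresholds and model names below simply mirror the table; this is an illustrative sketch, not an official API:

```typescript
interface Recommendation {
  context: string;
  model: string;
}

// Thresholds mirror the use-case table above (illustrative only).
function recommendModel(estimatedTokens: number): Recommendation {
  if (estimatedTokens <= 128000) return { context: "32K-128K", model: "DeepSeek V3" };
  if (estimatedTokens <= 200000) return { context: "128K-200K", model: "Qwen3 Coder" };
  if (estimatedTokens <= 400000) return { context: "200K-400K", model: "GPT-5" };
  return { context: "400K+", model: "Claude Sonnet 4.5" };
}
```

For example, a task touching ~50K tokens of code would land on DeepSeek V3, while a 500K-token refactor would call for Claude Sonnet 4.5.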