docs/api/anthropic-compatibility.mdx
Ollama provides compatibility with the Anthropic Messages API, making it possible to connect existing applications and tools such as Claude Code to Ollama.
To use Ollama with tools that expect the Anthropic API (like Claude Code), set these environment variables:
```shell
export ANTHROPIC_AUTH_TOKEN=ollama # required but ignored
export ANTHROPIC_BASE_URL=http://localhost:11434
```
`/v1/messages` example:

```python
import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',  # required but ignored
)

message = client.messages.create(
    model='qwen3-coder',
    max_tokens=1024,
    messages=[
        {'role': 'user', 'content': 'Hello, how are you?'}
    ]
)

print(message.content[0].text)
```
```typescript
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  baseURL: "http://localhost:11434",
  apiKey: "ollama", // required but ignored
});

const message = await anthropic.messages.create({
  model: "qwen3-coder",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello, how are you?" }],
});

console.log(message.content[0].text);
```
```shell
curl -X POST http://localhost:11434/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: ollama" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "qwen3-coder",
    "max_tokens": 1024,
    "messages": [{ "role": "user", "content": "Hello, how are you?" }]
  }'
```
```python
import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',
)

with client.messages.stream(
    model='qwen3-coder',
    max_tokens=1024,
    messages=[{'role': 'user', 'content': 'Count from 1 to 10'}]
) as stream:
    for text in stream.text_stream:
        print(text, end='', flush=True)
```
```typescript
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  baseURL: "http://localhost:11434",
  apiKey: "ollama",
});

const stream = await anthropic.messages.stream({
  model: "qwen3-coder",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Count from 1 to 10" }],
});

for await (const event of stream) {
  if (
    event.type === "content_block_delta" &&
    event.delta.type === "text_delta"
  ) {
    process.stdout.write(event.delta.text);
  }
}
```
```shell
curl -X POST http://localhost:11434/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-coder",
    "max_tokens": 1024,
    "stream": true,
    "messages": [{ "role": "user", "content": "Count from 1 to 10" }]
  }'
```
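When calling `/v1/messages` directly without an SDK, the streaming response arrives as server-sent events. A minimal sketch of accumulating the text deltas, assuming the event payloads have already been parsed from the SSE `data:` lines (the sample events below are illustrative, not real model output):

```python
# Parsed SSE payloads as emitted with "stream": true (illustrative sample;
# a real stream also includes message_start, content_block_start, etc.).
events = [
    {'type': 'content_block_delta', 'delta': {'type': 'text_delta', 'text': '1, 2'}},
    {'type': 'content_block_delta', 'delta': {'type': 'text_delta', 'text': ', 3'}},
    {'type': 'message_stop'},
]

def accumulate_text(events):
    """Concatenate the text_delta payloads from content_block_delta events."""
    parts = []
    for event in events:
        if event['type'] == 'content_block_delta' and event['delta']['type'] == 'text_delta':
            parts.append(event['delta']['text'])
    return ''.join(parts)

print(accumulate_text(events))
```

The SDK examples above do this accumulation for you; this sketch is only useful when consuming the raw event stream.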
```python
import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',
)

message = client.messages.create(
    model='qwen3-coder',
    max_tokens=1024,
    tools=[
        {
            'name': 'get_weather',
            'description': 'Get the current weather in a location',
            'input_schema': {
                'type': 'object',
                'properties': {
                    'location': {
                        'type': 'string',
                        'description': 'The city and state, e.g. San Francisco, CA'
                    }
                },
                'required': ['location']
            }
        }
    ],
    messages=[{'role': 'user', 'content': "What's the weather in San Francisco?"}]
)

for block in message.content:
    if block.type == 'tool_use':
        print(f'Tool: {block.name}')
        print(f'Input: {block.input}')
```
```typescript
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  baseURL: "http://localhost:11434",
  apiKey: "ollama",
});

const message = await anthropic.messages.create({
  model: "qwen3-coder",
  max_tokens: 1024,
  tools: [
    {
      name: "get_weather",
      description: "Get the current weather in a location",
      input_schema: {
        type: "object",
        properties: {
          location: {
            type: "string",
            description: "The city and state, e.g. San Francisco, CA",
          },
        },
        required: ["location"],
      },
    },
  ],
  messages: [{ role: "user", content: "What's the weather in San Francisco?" }],
});

for (const block of message.content) {
  if (block.type === "tool_use") {
    console.log("Tool:", block.name);
    console.log("Input:", block.input);
  }
}
```
```shell
curl -X POST http://localhost:11434/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-coder",
    "max_tokens": 1024,
    "tools": [
      {
        "name": "get_weather",
        "description": "Get the current weather in a location",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state"
            }
          },
          "required": ["location"]
        }
      }
    ],
    "messages": [{ "role": "user", "content": "What is the weather in San Francisco?" }]
  }'
```
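To complete the tool-calling loop, run the tool yourself and send its output back in a `tool_result` block whose `tool_use_id` matches the `id` of the model's `tool_use` block, then call `/v1/messages` again. A minimal sketch of building that follow-up message (the `get_weather` implementation here is a hypothetical stand-in for a real weather lookup):

```python
import json

# Hypothetical local tool implementation standing in for a real API call.
def get_weather(location: str) -> str:
    return json.dumps({'location': location, 'temperature_f': 62})

def build_tool_result(tool_use_id: str, tool_input: dict) -> dict:
    """Build the user-role message that returns a tool result to the model.

    tool_use_id must match the id of the tool_use block in the
    assistant's previous response.
    """
    return {
        'role': 'user',
        'content': [
            {
                'type': 'tool_result',
                'tool_use_id': tool_use_id,
                'content': get_weather(tool_input['location']),
            }
        ],
    }
```

Appending this message to `messages` (after the assistant turn containing the `tool_use` block) and calling the endpoint again lets the model produce its final answer from the tool output.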
Claude Code can be configured to use Ollama as its backend.
For coding use cases, models like glm-4.7, minimax-m2.1, and qwen3-coder are recommended.
Download a model before use:
```shell
ollama pull qwen3-coder
```
Note: Qwen 3 Coder is a 30B-parameter model requiring at least 24GB of VRAM to run smoothly; longer context lengths require more.
```shell
ollama pull glm-4.7:cloud
```
```shell
ollama launch claude
```
This will prompt you to select a model, configure Claude Code automatically, and launch it. To configure without launching:
```shell
ollama launch claude --config
```
Set the environment variables and run Claude Code:
```shell
ANTHROPIC_AUTH_TOKEN=ollama ANTHROPIC_BASE_URL=http://localhost:11434 claude --model qwen3-coder
```
Or set the environment variables in your shell profile:
```shell
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_BASE_URL=http://localhost:11434
```
Then run Claude Code with any Ollama model:
```shell
claude --model qwen3-coder
```
The `/v1/messages` endpoint supports the following request fields:

- `model`
- `max_tokens`
- `messages`
- text `content`
- image `content` (base64)
- `tool_use` blocks
- `tool_result` blocks
- `thinking` blocks
- `system` (string or array)
- `stream`
- `temperature`
- `top_p`
- `top_k`
- `stop_sequences`
- `tools`
- `thinking`
- `tool_choice`
- `metadata`

Response fields:

- `id`
- `type`
- `role`
- `model`
- `content` (text, tool_use, thinking blocks)
- `stop_reason` (end_turn, max_tokens, tool_use)
- `usage` (input_tokens, output_tokens)

Streaming events:

- `message_start`
- `content_block_start`
- `content_block_delta` (text_delta, input_json_delta, thinking_delta)
- `content_block_stop`
- `message_delta`
- `message_stop`
- `ping`
- `error`

Ollama supports both local and cloud models.
Pull a local model before use:
```shell
ollama pull qwen3-coder
```
Recommended local models:

- `qwen3-coder` - Excellent for coding tasks
- `gpt-oss:20b` - Strong general-purpose model

Cloud models are available immediately without pulling:

- `glm-4.7:cloud` - High-performance cloud model
- `minimax-m2.1:cloud` - Fast cloud model

For tooling that relies on default Anthropic model names such as `claude-3-5-sonnet`, use `ollama cp` to copy an existing model name:
```shell
ollama cp qwen3-coder claude-3-5-sonnet
```
Afterwards, this new model name can be specified in the `model` field:
```shell
curl http://localhost:11434/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
```
The `anthropic-version` header is accepted but not used.

The following Anthropic API features are not currently supported:
| Feature | Description |
|---|---|
| `/v1/messages/count_tokens` | Token counting endpoint |
| `tool_choice` | Forcing specific tool use or disabling tools |
| `metadata` | Request metadata (`user_id`) |
| Prompt caching | `cache_control` blocks for caching prefixes |
| Batches API | `/v1/messages/batches` for async batch processing |
| Citations | `citations` content blocks |
| PDF support | `document` content blocks with PDF files |
| Server-sent errors | `error` events during streaming (errors return HTTP status) |
The following features are partially supported:

| Feature | Status |
|---|---|
| Image content | Base64 images supported; URL images not supported |
| Extended thinking | Basic support; `budget_tokens` accepted but not enforced |
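Since extended thinking has basic support, a response may contain `thinking` blocks alongside the final `text` block. A minimal sketch of separating the two from a raw `/v1/messages` response body, assuming it has been decoded to a dict (the sample payload is illustrative, not real model output):

```python
# Illustrative response body in the shape /v1/messages returns when
# thinking is enabled (values are made up for this example).
response = {
    'content': [
        {'type': 'thinking', 'thinking': 'Multiply 17 by 24 step by step...'},
        {'type': 'text', 'text': '17 * 24 = 408'},
    ],
}

def split_content(blocks):
    """Separate thinking traces from the user-facing answer text."""
    thinking = [b['thinking'] for b in blocks if b['type'] == 'thinking']
    text = ''.join(b['text'] for b in blocks if b['type'] == 'text')
    return thinking, text

thinking, text = split_content(response['content'])
```

With the SDKs, the same distinction is available via each content block's `type` attribute, as in the tool-use examples above.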