plugins/plugin-ollama/README.md
This plugin connects Ollama to ElizaOS so agents can use local LLMs for text, embeddings, and JSON object generation—without sending prompts to a third-party API.
ollama create eliza-1-2b -f packages/training/cloud/ollama/Modelfile.eliza-1-2b-q4_k_m.bun add @elizaos/plugin-ollama
Ensure Ollama is running:
ollama serve
Register the plugin on your character / app config (exact shape depends on your ElizaOS version):
"plugins": ["@elizaos/plugin-ollama"]
ollama-ai-provider-v2Handlers use generateText (completion, structured generation, and non-streaming fallbacks for schema-only stream contexts or toolChoice-only requests), streamText (plain SSE chat or stream: true with native tools), generateObject, and embed from the ai package, backed by ollama-ai-provider-v2.
TextStreamResult)When useModel is invoked with stream: true during an SSE reply, AgentRuntime only iterates textStream and forwards chunks if the model handler returns a TextStreamResult (see isTextStreamResult in packages/core/src/runtime.ts). Why: returning a bare string skips that branch entirely—tokens still generate, but the UI and conversation routes can log “no streamed text” because no chunks were delivered.
This plugin returns TextStreamResult from streamText when stream: true and either:
responseSchema, tools, or toolChoice — every text delta is yielded to textStream (normal SSE).RESPONSE_HANDLER / ACTION_PLANNER, useModel’s streaming path concatenates only textStream chunks into the string passed to parseMessageHandlerOutput, so this adapter drains model text deltas internally and yields a single trailing chunk of the first tool’s arguments JSON (the v5 plan). Why: prepending arbitrary streamed text would break JSON.parse on that accumulated string. Other TEXT_* types forward all text chunks and attach toolCalls on the result (parity with OpenAI/OpenRouter).Still generateText: stream: true with responseSchema only (no tools), e.g. nested FACT_EXTRACTOR during SSE — stream may still be true on params; we do not throw and we log at debug because structured format: json is not wired through streamText here.
Still generateText: stream: true with toolChoice but no resolved ToolSet on the wire — log at debug, then generateText. Why: the AI SDK’s streamText path used here expects tools in the same request; toolChoice alone is invalid for streaming and is not produced by core v5 (Stage 1 always passes tools with toolChoice).
stream | tools on wire | responseSchema / toolChoice | Path | Why |
|---|---|---|---|---|
true | yes | (any; schema dropped if both) | streamText + tools | Same surface as other provider plugins; Ollama v2 supports tools on streaming /api/chat. |
true | no | no schema, no toolChoice | streamText plain | TextStreamResult so useModel can forward SSE chunks. |
true | no | schema only | generateText | Structured format is not implemented on streamText in this adapter; nested schema calls must not throw. |
true | no | toolChoice only | generateText | streamText+tools requires a tool set; log explains misconfiguration. |
false / absent | (any) | (any) | generateText (or structured serialize) | Normal completion path; Stage 1 without inherited streaming uses this. |
ollama-ai-provider-v2 (not the old ollama-ai-provider)ElizaOS tracks AI SDK 5/6. Older ollama-ai-provider exposed model specification v1; current ai only accepts v2+ models and throws:
Unsupported model version v1 for provider "ollama.chat".
The v2 provider implements the same provider contract as the rest of the ecosystem, so local Ollama behaves like other LLM backends from Eliza’s perspective.
See CHANGELOG.md for the full migration note.
Environment variables (or character settings with the same keys—why: lets you override per-agent without touching global .env):
| Variable | Default | Purpose |
|---|---|---|
OLLAMA_API_ENDPOINT | http://localhost:11434 (normalized to …/api) | Ollama HTTP API base. |
OLLAMA_SMALL_MODEL / SMALL_MODEL | eliza-1-2b | Small / fast text model. |
OLLAMA_LARGE_MODEL / LARGE_MODEL | eliza-1-4b | Larger text model. |
OLLAMA_EMBEDDING_MODEL | eliza-1-2b | Embedding model id. |
OLLAMA_DISABLE_STRUCTURED_OUTPUT | unset | If 1 / true / yes / on, disables JSON-schema structured text (see below). |
Optional model overrides: OLLAMA_NANO_MODEL, OLLAMA_MEDIUM_MODEL, OLLAMA_MEGA_MODEL, OLLAMA_RESPONSE_HANDLER_MODEL, OLLAMA_ACTION_PLANNER_MODEL (see utils/config.ts). Why separate keys: v5 Stage 1 (RESPONSE_HANDLER) is tool-heavy—you may want a larger tag than TEXT_SMALL without paying that cost on every TEXT_LARGE reply; planners often default to medium-sized models for latency.
.envOLLAMA_API_ENDPOINT=http://localhost:11434/api
OLLAMA_SMALL_MODEL=eliza-1-2b
OLLAMA_LARGE_MODEL=eliza-1-4b
OLLAMA_EMBEDDING_MODEL=eliza-1-2b
settings block{
"settings": {
"OLLAMA_API_ENDPOINT": "http://localhost:11434/api",
"OLLAMA_SMALL_MODEL": "eliza-1-2b",
"OLLAMA_LARGE_MODEL": "eliza-1-4b",
"OLLAMA_EMBEDDING_MODEL": "eliza-1-2b"
}
}
responseSchema)Eliza core passes responseSchema on text model calls when it needs machine-parseable JSON (e.g. fact extraction ops, structured planner/evaluator outputs).
This plugin maps that to the AI SDK’s Output.object({ schema: jsonSchema(...) }), which the Ollama provider turns into Ollama’s format field (json or JSON Schema, depending on Ollama version and request).
Why it matters: Without this path, those calls failed under Ollama and features silently degraded (warnings in logs, no memory updates).
Caveats:
stream: true with responseSchema: Core may set stream from an active chat streaming context even for nested calls (e.g. FACT_EXTRACTOR). The adapter does not throw; it runs non-streaming generateText with your schema. Why: the handler never used streamText for structured calls; rejecting only surfaced as spurious failures during SSE replies.OLLAMA_DISABLE_STRUCTURED_OUTPUTSet to 1, true, yes, or on if:
format / schema requests, orEffect: responseSchema is stripped; generation runs as plain text. Callers that require JSON may then fail parsing—why: this is an intentional escape hatch for operators, not a silent “fix” for broken prompts.
| Model type | Role |
|---|---|
TEXT_* | Chat / completion-style text. |
TEXT_EMBEDDING | Vector embeddings. |
OBJECT_* | generateObject with output: "no-schema" (best-effort JSON object). |
RESPONSE_HANDLER / ACTION_PLANNER | Same text path; v5 Stage 1 uses messages + tools + toolChoice; other pipelines may use responseSchema for JSON. |
const text = await runtime.useModel(ModelType.TEXT_LARGE, {
prompt: "Explain quantum tunneling briefly.",
maxTokens: 8192,
temperature: 0.7,
});
const embedding = await runtime.useModel(ModelType.TEXT_EMBEDDING, {
text: "hello world",
});
toolChoice (v5)ElizaOS v5 can call text models with messages (chat turns), tools (function / tool definitions), and toolChoice (for example "required" on the message-handler stage so the model must emit a planner tool call).
This plugin forwards those fields to the Vercel AI SDK generateText or streamText (when stream: true and tools are present) backed by ollama-ai-provider-v2, and when the call is “native shaped” (messages, tools, tool choice, or structured output), it may return a GenerateTextResult-like object at runtime or a TextStreamResult with toolCalls — TypeScript still types useModel text handlers as string for historical reasons.
Why cast instead of changing core types everywhere: OpenRouter and OpenAI adapters already use the same pattern—core’s v5 parser (parseMessageHandlerNativeToolCall, trajectory recording) expects toolCalls with id / name / arguments compatible fields. Matching that contract keeps Ollama on the same code path as cloud providers.
Tool definitions: Core usually passes ToolDefinition[] (array). Callers may also pass an AI SDK ToolSet object; both are accepted. Array entries are normalized to jsonSchema(...) parameters the way the OpenAI plugin does (without Cerebras-only name/schema tweaks).
When both tools and responseSchema are set: The adapter drops structured output for that request and keeps tools. Why: generateText cannot reasonably combine native tool calling and Output.object in one call for all models; v5 Stage 1 needs tools, so schema-backed structured output loses that race by design.
Streaming flag: See Streaming routing above. In prose: tools + stream: true prefer streamText; schema-only + stream: true stays on generateText; plain chat + stream: true returns TextStreamResult from streamText (same contract as OpenRouter).
providerOptions from GenerateTextParams are not merged into the Ollama generateText call yet. Core may attach cache-budget metadata for other providers; Ollama ignores those fields here until we wire them explicitly./v1) with another plugin if you need battle-tested tool UX.| Symptom | Likely cause | Mitigation |
|---|---|---|
| Chat UI empty / “no streamed text” with Ollama | Handler returned a string while stream: true | Fixed in current adapter: plain chat uses streamText → TextStreamResult. Schema-only streaming still uses generateText; tool calls with stream: true use streamText (planner types may only emit one trailing plan chunk on textStream). |
Unsupported model version v1 | Stale lockfile / wrong ollama-ai-provider | Run bun install at repo root; confirm dependency is ollama-ai-provider-v2. |
[Ollama] Native tools, toolChoice plumbing is not supported (older builds) | Pre–native-tools adapter | Upgrade @elizaos/plugin-ollama to a build that forwards tools (see CHANGELOG.md). |
v5 messageHandler returned invalid MessageHandlerResult | Model did not return the expected planner tool / JSON | Use a model with reliable tool calling; confirm OLLAMA_RESPONSE_HANDLER_MODEL is not too small for your prompt. |
| Fact / planner JSON errors | Model ignores schema | Try a stronger model; tighten prompts; or set OLLAMA_DISABLE_STRUCTURED_OUTPUT=1 temporarily. |
Debug: toolChoice but no tools on wire | stream: true with toolChoice but no ToolSet passed | Should not happen from core v5; if you see it, ensure tools and toolChoice are passed together. Adapter falls back to generateText. |
streamText.textStream failed / AI_NoOutputGeneratedError / process exit after Ollama 500 | Ollama returned an error body (e.g. insufficient system memory for the model) during a streaming /api/chat call; AI SDK retried then failed. | Check logs for ollamaResponseBody (plugin now extracts it). Free RAM on the Ollama host, use a smaller model, or lower concurrency. Streaming errors occur while useModel consumes textStream, not only inside generateText. |
| Connection errors | Ollama not running or wrong URL | curl the /api/tags endpoint; fix OLLAMA_API_ENDPOINT. |
More context: Ollama plugin docs (registry page: plugin-registry/llm/ollama).
Ollama’s HTTP API: Ollama API docs.
See CHANGELOG.md.
See LICENSE.