Back to Openclaw

llama.cpp Provider

docs/plugins/llama-cpp.md

2026.6.61.2 KB
Original Source

llama-cpp is the official external provider plugin for local GGUF embeddings. It owns the node-llama-cpp runtime dependency used by memorySearch.provider: "local".

Install it before using local memory embeddings:

bash
openclaw plugins install @openclaw/llama-cpp-provider

The main openclaw npm package does not include node-llama-cpp. Keeping the native dependency in this plugin prevents normal OpenClaw npm updates from deleting a manually installed runtime inside the OpenClaw package directory.

Configuration

Set the memory search provider to local:

json5
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "local",
        local: {
          modelPath: "hf:ggml-org/embeddinggemma-300m-qat-q8_0-GGUF/embeddinggemma-300m-qat-Q8_0.gguf",
        },
      },
    },
  },
}

The default model is embeddinggemma-300m-qat-Q8_0.gguf. You can also point local.modelPath at a local .gguf file.

Native Runtime

Use Node 24 for the smoothest native install path. Source checkouts using pnpm may need to approve and rebuild the native dependency:

bash
pnpm approve-builds
pnpm rebuild node-llama-cpp

For lower-friction local embeddings, use a local service provider such as Ollama or LM Studio instead.