Back to Khoj

Ollama

documentation/docs/advanced/ollama.mdx

1.42.103.5 KB
Original Source

Ollama

mdx-code-block
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

:::info This is only helpful for self-hosted users. If you're using Khoj Cloud, you can use our first-party supported models. :::

:::info Khoj can directly run local LLMs available on HuggingFace in GGUF format. The integration with Ollama is useful to run Khoj on Docker and have the chat models use your GPU or to try new models via CLI. :::

Ollama allows you to run many popular open-source LLMs locally from your terminal. For folks comfortable with the terminal, Ollama's terminal based flows can ease setup and management of chat models.

Ollama exposes a local OpenAI API compatible server. This makes it possible to use chat models from Ollama with Khoj.

Setup

:::info Restart your Khoj server after first run or update to the settings below to ensure all settings are applied correctly. :::

<Tabs groupId="type" queryString> <TabItem value="first-run" label="First Run"> <Tabs groupId="server" queryString> <TabItem value="docker" label="Docker"> 1. Setup Ollama: https://ollama.com/ 2. Download your preferred chat model with Ollama. For example, ```bash ollama pull llama3.1 ``` 3. Uncomment `OPENAI_BASE_URL` environment variable in your downloaded Khoj [docker-compose.yml](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml#:~:text=OPENAI_BASE_URL) 4. Start Khoj docker for the first time to automatically integrate and load models from the Ollama running on your host machine ```bash # run below command in the directory where you downloaded the Khoj docker-compose.yml docker-compose up ``` </TabItem>
  <TabItem value="pip" label="Pip">
  1. Setup Ollama: https://ollama.com/
  2. Download your preferred chat model with Ollama. For example,
     ```bash
     ollama pull llama3.1
     ```
  3. Set `OPENAI_BASE_URL` environment variable to `http://localhost:11434/v1/` in your shell before starting Khoj for the first time
     ```bash
     export OPENAI_BASE_URL="http://localhost:11434/v1/"
     khoj --anonymous-mode
     ```
  </TabItem>
</Tabs> </TabItem> <TabItem value="update" label="Update"> 1. Setup Ollama: https://ollama.com/ 2. Download your preferred chat model with Ollama. For example, ```bash ollama pull llama3.1 ``` 3. Create a new [AI Model API](http://localhost:42110/server/admin/database/aimodelapi/add) on your Khoj admin panel - **Name**: `ollama` - **Api Key**: `any string` - **Api Base Url**: `http://localhost:11434/v1/` (default for Ollama) 4. Create a new [Chat Model](http://localhost:42110/server/admin/database/chatmodel/add) on your Khoj admin panel. - **Name**: `llama3.1` (replace with the name of your local model) - **Model Type**: `Openai` - **AI Model API**: *the ollama AI Model API you created in step 3* - **Max prompt size**: `20000` (replace with the max prompt size of your model) 5. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.

If you want to add additional models running on Ollama, repeat step 4 for each model. </TabItem> </Tabs>

That's it! You should now be able to chat with your Ollama model from Khoj.