# Ollama

```mdx-code-block
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
```

:::info
This setup is only relevant for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you can use our first-party supported models.
:::

:::info
Khoj can directly run local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). The Ollama integration is useful if you run Khoj in Docker and want your chat models to use your GPU, or if you want to try new models via the CLI.
:::

Ollama allows you to run [many popular open-source LLMs](https://ollama.com/library) locally from your terminal.
For folks comfortable with the terminal, Ollama's terminal-based flows can ease the setup and management of chat models.
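
For instance, downloading and trying a model takes just a couple of commands (`llama3.1` below is only an example model name):

```bash
# Common Ollama CLI flows (assumes Ollama is installed and running)
ollama pull llama3.1   # download a model
ollama list            # show downloaded models
ollama run llama3.1    # chat with a model directly in your terminal
```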

Ollama exposes a local [OpenAI API compatible server](https://github.com/ollama/ollama/blob/main/docs/openai.md#models). This makes it possible to use chat models from Ollama with Khoj.
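
With the Ollama server running, you can confirm this endpoint works and see which models it serves. A minimal check, assuming Ollama's default port of 11434:

```bash
# List the models Ollama serves through its OpenAI-compatible API
curl http://localhost:11434/v1/models
```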

## Setup
:::info
Restart your Khoj server after applying the settings below, whether on first run or on update, to ensure they all take effect.
:::

<Tabs groupId="type" queryString>
  <TabItem value="first-run" label="First Run">
    <Tabs groupId="server" queryString>
      <TabItem value="docker" label="Docker">
      1. Set up Ollama: https://ollama.com/
      2. Download your preferred chat model with Ollama. For example,
         ```bash
         ollama pull llama3.1
         ```
      3. Uncomment the `OPENAI_BASE_URL` environment variable in your downloaded Khoj [docker-compose.yml](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml#:~:text=OPENAI_BASE_URL)
      4. Start the Khoj Docker container for the first time to automatically integrate and load models from the Ollama server running on your host machine
         ```bash
         # Run this in the directory containing your downloaded Khoj docker-compose.yml
         docker-compose up
         ```
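      5. (Optional) Follow the service logs to confirm Khoj connected to Ollama and loaded its chat models. A quick sanity check, run from the same directory:
         ```bash
         # Stream logs from the Khoj docker-compose services
         docker-compose logs -f
         ```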
      </TabItem>

      <TabItem value="pip" label="Pip">
      1. Set up Ollama: https://ollama.com/
      2. Download your preferred chat model with Ollama. For example,
         ```bash
         ollama pull llama3.1
         ```
      3. Set the `OPENAI_BASE_URL` environment variable to `http://localhost:11434/v1/` in your shell before starting Khoj for the first time
         ```bash
         export OPENAI_BASE_URL="http://localhost:11434/v1/"
         khoj --anonymous-mode
         ```
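      4. (Optional) Persist the environment variable across shell sessions so Khoj keeps using Ollama on future runs. A minimal sketch, assuming your shell is bash:
         ```bash
         # Append the setting to your bash profile
         echo 'export OPENAI_BASE_URL="http://localhost:11434/v1/"' >> ~/.bashrc
         ```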
      </TabItem>
    </Tabs>
  </TabItem>
  <TabItem value="update" label="Update">
   1. Set up Ollama: https://ollama.com/
   2. Download your preferred chat model with Ollama. For example,
      ```bash
      ollama pull llama3.1
      ```
   3. Create a new [AI Model API](http://localhost:42110/server/admin/database/aimodelapi/add) on your Khoj admin panel
      - **Name**: `ollama`
      - **Api Key**: `any string`
      - **Api Base Url**: `http://localhost:11434/v1/` (the default for Ollama)
   4. Create a new [Chat Model](http://localhost:42110/server/admin/database/chatmodel/add) on your Khoj admin panel.
      - **Name**: `llama3.1` (replace with the name of your local model)
      - **Model Type**: `Openai`
      - **AI Model API**: *the ollama AI Model API you created in step 3*
      - **Max prompt size**: `20000` (replace with the max prompt size of your model)
   5. Go to [your settings](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.

   If you want to add additional models running on Ollama, repeat step 4 for each model.
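
   You can also sanity check the model name and base URL you entered above with a direct request to Ollama's OpenAI-compatible API. A sketch, assuming the default port and the `llama3.1` model:

   ```bash
   # Send a minimal chat completion request to Ollama
   curl http://localhost:11434/v1/chat/completions \
     -H "Content-Type: application/json" \
     -d '{
       "model": "llama3.1",
       "messages": [{"role": "user", "content": "Hello"}]
     }'
   ```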
  </TabItem>
</Tabs>

That's it! You should now be able to chat with your Ollama model from Khoj.