---
title: "OllamaChatGenerator"
id: ollamachatgenerator
slug: "/ollamachatgenerator"
description: "This component enables chat completion using an LLM running on Ollama."
---

# OllamaChatGenerator

This component enables chat completion using an LLM running on Ollama.

<div className="key-value-table">

| | |
| --- | --- |
| **Most common position in a pipeline** | After a [ChatPromptBuilder](../builders/chatpromptbuilder.mdx) |
| **Mandatory run variables** | `messages`: A list of [`ChatMessage`](../../concepts/data-classes/chatmessage.mdx) objects representing the chat |
| **Output variables** | `replies`: A list of the LLM's alternative replies |
| **API reference** | [Ollama](/reference/integrations-ollama) |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/ollama |

</div>

## Overview

[Ollama](https://github.com/jmorganca/ollama) is a project focused on running LLMs locally. Internally, it uses the quantized GGUF format by default, which makes it possible to run LLMs on standard machines (even without GPUs) without complex installation procedures.

`OllamaChatGenerator` supports models running on Ollama, such as `llama2` and `mixtral`. Find the full list of supported models in the [Ollama library](https://ollama.ai/library).

`OllamaChatGenerator` needs a `model` name and a `url` to work. By default, it uses the `"orca-mini"` model and the `"http://localhost:11434"` URL.

You interact with `OllamaChatGenerator` through `ChatMessage` objects. [ChatMessage](../../concepts/data-classes/chatmessage.mdx) is a data class that contains a message, a role (who generated the message, such as `user`, `assistant`, `system`, `function`), and optional metadata. See the [usage](#usage) section for an example.

### Tool Support

`OllamaChatGenerator` supports function calling through the `tools` parameter, which accepts flexible tool configurations:

- **A list of Tool objects**: Pass individual tools as a list
- **A single Toolset**: Pass an entire Toolset directly
- **Mixed Tools and Toolsets**: Combine multiple Toolsets with standalone tools in a single list

This allows you to organize related tools into logical groups while also including standalone tools as needed.

```python
from haystack.tools import Tool, Toolset
from haystack_integrations.components.generators.ollama import OllamaChatGenerator

# Create individual tools (parameters and functions omitted for brevity)
weather_tool = Tool(name="weather", description="Get weather info", ...)
news_tool = Tool(name="news", description="Get latest news", ...)

# Group related tools into a toolset (assumes these tools are defined elsewhere)
math_toolset = Toolset([add_tool, subtract_tool, multiply_tool])

# Pass a mix of tools and toolsets to the generator
generator = OllamaChatGenerator(
    model="llama2",
    tools=[math_toolset, weather_tool, news_tool]  # Mix of Toolset and Tool objects
)
```

For more details on working with tools, see the [Tool](../../tools/tool.mdx) and [Toolset](../../tools/toolset.mdx) documentation.
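For instance, here is a minimal end-to-end sketch of tool calling. The `get_weather` function, its schema, and the choice of a tool-capable model (`llama3.2`) are illustrative assumptions; adjust them for your setup:

```python
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool
from haystack_integrations.components.generators.ollama import OllamaChatGenerator

# Hypothetical function the model can request; replace with your own logic
def get_weather(city: str) -> str:
    return f"It is sunny in {city}"

weather_tool = Tool(
    name="get_weather",
    description="Get the current weather for a city",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
    function=get_weather,
)

# Assumes a tool-capable model (for example, llama3.2) is pulled in Ollama
generator = OllamaChatGenerator(model="llama3.2", tools=[weather_tool])

reply = generator.run(messages=[ChatMessage.from_user("What's the weather in Berlin?")])["replies"][0]

# If the model decided to call the tool, the reply contains tool calls
if reply.tool_calls:
    print(reply.tool_calls)
```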
### Streaming

You can stream output as it's generated. Pass a callback to `streaming_callback`. Use the built-in `print_streaming_chunk` to print text tokens and tool events (tool calls and tool results).

```python
from haystack.components.generators.utils import print_streaming_chunk
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.ollama import OllamaChatGenerator

# Configure the generator with a streaming callback
generator = OllamaChatGenerator(model="zephyr", streaming_callback=print_streaming_chunk)

# Pass a list of messages; chunks are printed as they arrive
generator.run([ChatMessage.from_user("Your question here")])
```

:::info
Streaming works only with a single response. If a provider supports multiple candidates, set `n=1`.
:::

See our [Streaming Support](guides-to-generators/choosing-the-right-generator.mdx#streaming-support) docs to learn more about how `StreamingChunk` works and how to write a custom callback.

Prefer `print_streaming_chunk` by default. Write a custom callback only if you need a specific transport (for example, SSE/WebSocket) or custom UI formatting.

## Usage

1. You need a running instance of Ollama. Installation instructions are in the [Ollama GitHub repository](https://github.com/jmorganca/ollama).
   A fast way to run Ollama is using Docker:

   ```bash
   docker run -d -p 11434:11434 --name ollama ollama/ollama:latest
   ```

2. You need to download or pull the desired LLM. The model library is available on the [Ollama website](https://ollama.ai/library).
   If you are using Docker, you can, for example, pull the Zephyr model:

   ```bash
   docker exec ollama ollama pull zephyr
   ```

   If you already installed Ollama on your system, you can run:

   ```bash
   ollama pull zephyr
   ```

   :::tip[Choose a specific version of a model]

   You can also specify a tag to choose a specific (quantized) version of your model. The available tags are shown in the model card of the Ollama models library; see the [Zephyr tags](https://ollama.ai/library/zephyr/tags) for an example.
   In this case, simply run:

   ```shell
   # ollama pull model:tag
   ollama pull zephyr:7b-alpha-q3_K_S
   ```
   :::

3. Install the `ollama-haystack` package:

   ```bash
   pip install ollama-haystack
   ```

### On its own

```python
from haystack_integrations.components.generators.ollama import OllamaChatGenerator
from haystack.dataclasses import ChatMessage

generator = OllamaChatGenerator(
    model="zephyr",
    url="http://localhost:11434",
    generation_kwargs={
        "num_predict": 100,
        "temperature": 0.9,
    },
)

messages = [
    ChatMessage.from_system("\nYou are a helpful, respectful and honest assistant"),
    ChatMessage.from_user("What's Natural Language Processing?"),
]

print(generator.run(messages=messages))
>> {
    "replies": [
        ChatMessage(
            _role=<ChatRole.ASSISTANT: 'assistant'>,
            _content=[
                TextContent(
                    text=(
                        "Natural Language Processing (NLP) is a subfield of "
                        "Artificial Intelligence that deals with understanding, "
                        "interpreting, and generating human language in a meaningful "
                        "way. It enables tasks such as language translation, sentiment "
                        "analysis, and text summarization."
                    )
                )
            ],
            _name=None,
            _meta={
                "model": "zephyr",...
            }
        )
    ]
}
```
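You can also override generation parameters for a single call. This sketch assumes the standard `generation_kwargs` run argument that Haystack chat generators accept, reusing `generator` and `messages` from the example above:

```python
# Per-call override of generation parameters (init-time values stay unchanged)
result = generator.run(
    messages=messages,
    generation_kwargs={"temperature": 0.1},
)
print(result["replies"][0].text)
```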
With multimodal inputs:

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.ollama import OllamaChatGenerator

llm = OllamaChatGenerator(model="llava", url="http://localhost:11434")

image = ImageContent.from_file_path("apple.jpg")
user_message = ChatMessage.from_user(
    content_parts=["What does the image show? Max 5 words.", image],
)

response = llm.run([user_message])["replies"][0].text
print(response)

# Red apple on straw.
```

### In a Pipeline

```python
from haystack.components.builders import ChatPromptBuilder
from haystack_integrations.components.generators.ollama import OllamaChatGenerator
from haystack.dataclasses import ChatMessage
from haystack import Pipeline

# Initialized without a template; the template and its variables are provided at runtime
prompt_builder = ChatPromptBuilder()
generator = OllamaChatGenerator(
    model="zephyr",
    url="http://localhost:11434",
    generation_kwargs={
        "temperature": 0.9,
    },
)

pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", generator)
pipe.connect("prompt_builder.prompt", "llm.messages")

location = "Berlin"
messages = [
    ChatMessage.from_system("Always respond in Spanish even if some input data is in other languages."),
    ChatMessage.from_user("Tell me about {{location}}"),
]

print(pipe.run(data={"prompt_builder": {"template_variables": {"location": location}, "template": messages}}))

>> {
    "llm": {
        "replies": [
            ChatMessage(
                _role=<ChatRole.ASSISTANT: 'assistant'>,
                _content=[
                    TextContent(
                        text=(
                            "Berlín es la capital y la mayor ciudad de Alemania. "
                            "Está ubicada en el estado federado de Berlín, y tiene más..."
                        )
                    )
                ],
                _name=None,
                _meta={
                    "model": "zephyr",...
                }
            )
        ]
    }
}
```
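To work with just the generated text rather than the full output dictionary, read the first reply's `text` from the generator's output, which is nested under the component name (`llm`) in the pipeline result:

```python
# Capture the pipeline result instead of printing the full dictionary
result = pipe.run(
    data={"prompt_builder": {"template_variables": {"location": location}, "template": messages}}
)

# The generator's replies are nested under its component name ("llm")
print(result["llm"]["replies"][0].text)
```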