---
title: "Builders"
id: builders-api
description: "Extract the output of a Generator to an Answer format, and build prompts."
slug: "/builders-api"
---

<a id="answer_builder"></a>

## Module answer\_builder

<a id="answer_builder.AnswerBuilder"></a>

### AnswerBuilder

Converts a query and Generator replies into a `GeneratedAnswer` object.

AnswerBuilder parses Generator replies using custom regular expressions.
Check out the usage example below to see how it works.
Optionally, it can also take documents and metadata from the Generator to add to the `GeneratedAnswer` object.
AnswerBuilder works with both non-chat and chat Generators.

### Usage example

```python
from haystack.components.builders import AnswerBuilder

builder = AnswerBuilder(pattern="Answer: (.*)")
builder.run(query="What's the answer?", replies=["This is an argument. Answer: This is the answer."])
```

### Usage example with documents and reference pattern

```python
from haystack import Document
from haystack.components.builders import AnswerBuilder

replies = ["The capital of France is Paris [2]."]

docs = [
    Document(content="Berlin is the capital of Germany."),
    Document(content="Paris is the capital of France."),
    Document(content="Rome is the capital of Italy."),
]

# Use a raw string so the regex backslashes are not treated as string escapes
builder = AnswerBuilder(reference_pattern=r"\[(\d+)\]", return_only_referenced_documents=False)
result = builder.run(query="What is the capital of France?", replies=replies, documents=docs)["answers"][0]

print(f"Answer: {result.data}")
print("References:")
for doc in result.documents:
    if doc.meta["referenced"]:
        print(f"[{doc.meta['source_index']}] {doc.content}")
print("Other sources:")
for doc in result.documents:
    if not doc.meta["referenced"]:
        print(f"[{doc.meta['source_index']}] {doc.content}")

# Answer: The capital of France is Paris
# References:
# [2] Paris is the capital of France.
# Other sources:
# [1] Berlin is the capital of Germany.
# [3] Rome is the capital of Italy.
```

<a id="answer_builder.AnswerBuilder.__init__"></a>

#### AnswerBuilder.\_\_init\_\_

```python
def __init__(pattern: str | None = None,
             reference_pattern: str | None = None,
             last_message_only: bool = False,
             *,
             return_only_referenced_documents: bool = True)
```

Creates an instance of the AnswerBuilder component.

**Arguments**:

- `pattern`: The regular expression pattern to extract the answer text from the Generator.
If not specified, the entire response is used as the answer.
The regular expression can have one capture group at most.
If present, the capture group text is used as the answer.
If no capture group is present, the whole match is used as the answer.
Examples:
    - `[^\n]+$` finds "this is an answer" in a string "this is an argument.\nthis is an answer".
    - `Answer: (.*)` finds "this is an answer" in a string "this is an argument. Answer: this is an answer".
- `reference_pattern`: The regular expression pattern used for parsing the document references.
If not specified, no parsing is done, and all documents are returned.
References need to be specified as indices of the input documents and start at [1].
Example: `\[(\d+)\]` finds "1" in a string "this is an answer[1]".
If this parameter is provided, each document's metadata contains a "referenced" key with a boolean value.
- `last_message_only`: If False (default value), all messages are used as the answer.
If True, only the last message is used as the answer.
- `return_only_referenced_documents`: To be used in conjunction with `reference_pattern`.
If True (default value), only the documents that were actually referenced in `replies` are returned.
If False, all documents are returned.
If `reference_pattern` is not provided, this parameter has no effect, and all documents are returned.

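The capture-group rule for `pattern` can be illustrated with Python's standard `re` module. This is a minimal sketch of the documented behavior, not AnswerBuilder's actual implementation: the helper name `extract_answer` is hypothetical, and returning an empty string when the pattern does not match is an assumption made for this example.

```python
from __future__ import annotations

import re


def extract_answer(reply: str, pattern: str | None = None) -> str:
    """Hypothetical helper mirroring the documented `pattern` semantics."""
    if pattern is None:
        # No pattern: the entire reply is the answer.
        return reply
    match = re.search(pattern, reply)
    if match is None:
        # Assumption for this sketch only: no match yields an empty answer.
        return ""
    # One capture group: use its text. No capture group: use the whole match.
    return match.group(1) if match.lastindex else match.group(0)


print(extract_answer("this is an argument. Answer: this is an answer", r"Answer: (.*)"))
print(extract_answer("this is an argument.\nthis is an answer", r"[^\n]+$"))
```

Both calls print "this is an answer": the first through the capture group, the second through the whole match.
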
<a id="answer_builder.AnswerBuilder.run"></a>

#### AnswerBuilder.run

```python
@component.output_types(answers=list[GeneratedAnswer])
def run(query: str,
        replies: list[str] | list[ChatMessage],
        meta: list[dict[str, Any]] | None = None,
        documents: list[Document] | None = None,
        pattern: str | None = None,
        reference_pattern: str | None = None)
```

Turns the output of a Generator into `GeneratedAnswer` objects using regular expressions.

**Arguments**:

- `query`: The input query used as the Generator prompt.
- `replies`: The output of the Generator. Can be a list of strings or a list of `ChatMessage` objects.
- `meta`: The metadata returned by the Generator. If not specified, the generated answer will contain no metadata.
- `documents`: The documents used as the Generator inputs. If specified, they are added to
the `GeneratedAnswer` objects.
Each Document.meta includes a "source_index" key, representing its 1-based position in the input list.
When `reference_pattern` is provided:
    - A "referenced" key is added to the Document.meta, indicating if the document was referenced in the output.
    - The `return_only_referenced_documents` init parameter controls if all or only referenced documents are
    returned.
- `pattern`: The regular expression pattern to extract the answer text from the Generator.
If not specified, the entire response is used as the answer.
The regular expression can have one capture group at most.
If present, the capture group text is used as the answer.
If no capture group is present, the whole match is used as the answer.
Examples:
    - `[^\n]+$` finds "this is an answer" in a string "this is an argument.\nthis is an answer".
    - `Answer: (.*)` finds "this is an answer" in a string "this is an argument. Answer: this is an answer".
- `reference_pattern`: The regular expression pattern used for parsing the document references.
If not specified, no parsing is done, and all documents are returned.
References need to be specified as indices of the input documents and start at [1].
Example: `\[(\d+)\]` finds "1" in a string "this is an answer[1]".

**Returns**:

A dictionary with the following keys:
- `answers`: The answers received from the output of the Generator.

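To make the reference handling concrete, the following sketch reproduces the documented bookkeeping with plain dictionaries instead of `Document` objects. It only illustrates the 1-based `source_index`, the `referenced` flag, and the filtering step, and it is not how AnswerBuilder is implemented. The variable names are chosen for this example.

```python
import re

documents = [
    "Berlin is the capital of Germany.",
    "Paris is the capital of France.",
    "Rome is the capital of Italy.",
]
reply = "The capital of France is Paris [2]."

# Indices captured by the pattern are 1-based positions in the input list.
referenced = {int(idx) for idx in re.findall(r"\[(\d+)\]", reply)}

# Attach the metadata described above to every input document.
annotated = [
    {"content": content, "source_index": position, "referenced": position in referenced}
    for position, content in enumerate(documents, start=1)
]

# Counterpart of return_only_referenced_documents=True in this sketch.
only_referenced = [doc for doc in annotated if doc["referenced"]]
print(only_referenced)
```
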
<a id="chat_prompt_builder"></a>

## Module chat\_prompt\_builder

<a id="chat_prompt_builder.ChatPromptBuilder"></a>

### ChatPromptBuilder

Renders a chat prompt from a template using Jinja2 syntax.

A template can be a list of `ChatMessage` objects, or a special string, as shown in the usage examples.

It constructs prompts using static or dynamic templates, which you can update for each pipeline run.

Variables in the template are optional unless specified otherwise.
If an optional variable isn't provided, it defaults to an empty string. Use `variables` and `required_variables`
to define input types and required variables.

### Usage examples

#### Static ChatMessage prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

template = [ChatMessage.from_user("Translate to {{ target_language }}. Context: {{ snippet }}; Translation:")]
builder = ChatPromptBuilder(template=template)
builder.run(target_language="spanish", snippet="I can't speak spanish.")
```

#### Overriding static ChatMessage template at runtime

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

template = [ChatMessage.from_user("Translate to {{ target_language }}. Context: {{ snippet }}; Translation:")]
builder = ChatPromptBuilder(template=template)
builder.run(target_language="spanish", snippet="I can't speak spanish.")

# Pass a different template to `run` to override the one given at initialization
msg = "Translate to {{ target_language }} and summarize. Context: {{ snippet }}; Summary:"
summary_template = [ChatMessage.from_user(msg)]
builder.run(target_language="spanish", snippet="I can't speak spanish.", template=summary_template)
```

#### Dynamic ChatMessage prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack import Pipeline

# Initialize without a template; the template and its variables are passed at run time
prompt_builder = ChatPromptBuilder()
llm = OpenAIChatGenerator(model="gpt-5-mini")

pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("prompt_builder.prompt", "llm.messages")

location = "Berlin"
language = "English"
system_message = ChatMessage.from_system("You are an assistant giving information to tourists in {{language}}")
messages = [system_message, ChatMessage.from_user("Tell me about {{location}}")]

res = pipe.run(data={"prompt_builder": {"template_variables": {"location": location, "language": language},
                                    "template": messages}})
print(res)
# >> {'llm': {'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=
# "Berlin is the capital city of Germany and one of the most vibrant
# and diverse cities in Europe. Here are some key things to know...Enjoy your time exploring the vibrant and dynamic
# capital of Germany!")], _name=None, _meta={'model': 'gpt-5-mini',
# 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 27, 'completion_tokens': 681, 'total_tokens':
# 708}})]}}

# Reuse the same pipeline with a different user message and different variables
messages = [
    system_message,
    ChatMessage.from_user("What's the weather forecast for {{location}} in the next {{day_count}} days?"),
]

res = pipe.run(data={"prompt_builder": {"template_variables": {"location": location, "day_count": "5"},
                                    "template": messages}})

print(res)
# >> {'llm': {'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=
# "Here is the weather forecast for Berlin in the next 5
# days:\n\nDay 1: Mostly cloudy with a high of 22°C (72°F) and...so it's always a good idea to check for updates
# closer to your visit.")], _name=None, _meta={'model': 'gpt-5-mini',
# 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 37, 'completion_tokens': 201,
# 'total_tokens': 238}})]}}
```

#### String prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses.image_content import ImageContent

template = """
{% message role="system" %}
You are a helpful assistant.
{% endmessage %}

{% message role="user" %}
Hello! I am {{user_name}}. What's the difference between the following images?
{% for image in images %}
{{ image | templatize_part }}
{% endfor %}
{% endmessage %}
"""

images = [ImageContent.from_file_path("test/test_files/images/apple.jpg"),
          ImageContent.from_file_path("test/test_files/images/haystack-logo.png")]

builder = ChatPromptBuilder(template=template)
builder.run(user_name="John", images=images)
```

<a id="chat_prompt_builder.ChatPromptBuilder.__init__"></a>

#### ChatPromptBuilder.\_\_init\_\_

```python
def __init__(template: list[ChatMessage] | str | None = None,
             required_variables: list[str] | Literal["*"] | None = None,
             variables: list[str] | None = None)
```

Constructs a ChatPromptBuilder component.

**Arguments**:

- `template`: A list of `ChatMessage` objects or a string template. The component looks for Jinja2 template syntax and
renders the prompt with the provided variables. Provide the template in either
the `init` method or the `run` method.
- `required_variables`: List variables that must be provided as input to ChatPromptBuilder.
If a variable listed as required is not provided, an exception is raised.
If set to "*", all variables found in the prompt are required. Optional.
- `variables`: List input variables to use in prompt templates instead of the ones inferred from the
`template` parameter. For example, to use more variables during prompt engineering than the ones present
in the default template, you can provide them here.

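The `required_variables` rule can be pictured with a small standalone check. This is a sketch of the documented semantics only, under simplifying assumptions: the helper `check_required` is hypothetical, it detects plain `{{ name }}` placeholders with a regular expression rather than parsing Jinja2, and the error message is invented for this example.

```python
import re


def check_required(template: str, required, provided: dict) -> None:
    """Hypothetical check: raise ValueError when a required variable is missing."""
    # Simplification: only plain `{{ name }}` placeholders are detected.
    found = set(re.findall(r"\{\{\s*(\w+)\s*\}\}", template))
    # "*" means every variable found in the template is required.
    needed = found if required == "*" else set(required or [])
    missing = sorted(needed - set(provided))
    if missing:
        raise ValueError(f"Missing required variables: {missing}")


template = "Translate to {{ target_language }}. Context: {{ snippet }}; Translation:"

# Only `target_language` is required, so omitting `snippet` is accepted.
check_required(template, ["target_language"], {"target_language": "spanish"})

# With "*", the missing `snippet` raises an error.
try:
    check_required(template, "*", {"target_language": "spanish"})
except ValueError as err:
    print(err)
```
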
<a id="chat_prompt_builder.ChatPromptBuilder.run"></a>

#### ChatPromptBuilder.run

```python
@component.output_types(prompt=list[ChatMessage])
def run(template: list[ChatMessage] | str | None = None,
        template_variables: dict[str, Any] | None = None,
        **kwargs)
```

Renders the prompt template with the provided variables.

It applies the template variables to render the final prompt. You can provide variables with pipeline kwargs.
To overwrite the default template, set the `template` parameter.
To overwrite pipeline kwargs, set the `template_variables` parameter.

**Arguments**:

- `template`: An optional list of `ChatMessage` objects or string template to overwrite ChatPromptBuilder's default
template.
If `None`, the default template provided at initialization is used.
- `template_variables`: An optional dictionary of template variables to overwrite the pipeline variables.
- `kwargs`: Pipeline variables used for rendering the prompt.

**Raises**:

- `ValueError`: If `template` is empty or contains elements that are not instances of `ChatMessage`.

**Returns**:

A dictionary with the following keys:
- `prompt`: The updated list of `ChatMessage` objects after rendering the templates.

<a id="chat_prompt_builder.ChatPromptBuilder.to_dict"></a>

#### ChatPromptBuilder.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the component.

**Returns**:

Serialized dictionary representation of the component.

<a id="chat_prompt_builder.ChatPromptBuilder.from_dict"></a>

#### ChatPromptBuilder.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "ChatPromptBuilder"
```

Deserializes the component from a dictionary.

**Arguments**:

- `data`: The dictionary to deserialize the component from.

**Returns**:

The deserialized component.

<a id="prompt_builder"></a>

## Module prompt\_builder

<a id="prompt_builder.PromptBuilder"></a>

### PromptBuilder

Renders a prompt by filling in its variables so that it can be sent to a Generator.

The prompt uses Jinja2 template syntax.
The variables in the default template are used as PromptBuilder's input and are all optional.
If they're not provided, they're replaced with an empty string in the rendered prompt.
To try out different prompts, you can replace the prompt template at runtime by
providing a template for each pipeline run invocation.

### Usage examples

#### On its own

This example uses PromptBuilder to render a prompt template and fill it with `target_language`
and `snippet`. PromptBuilder returns a prompt with the string "Translate the following context to spanish.
Context: I can't speak spanish.; Translation:".

```python
from haystack.components.builders import PromptBuilder

template = "Translate the following context to {{ target_language }}. Context: {{ snippet }}; Translation:"
builder = PromptBuilder(template=template)
builder.run(target_language="spanish", snippet="I can't speak spanish.")
```

#### In a Pipeline

This is an example of a RAG pipeline where PromptBuilder renders a custom prompt template and fills it
with the contents of the retrieved documents and a query. The rendered prompt is then sent to a Generator.

```python
from haystack import Pipeline, Document
from haystack.utils import Secret
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders.prompt_builder import PromptBuilder

# in a real world use case documents could come from a retriever, web, or any other source
documents = [Document(content="Joe lives in Berlin"), Document(content="Joe is a software engineer")]
prompt_template = """
    Given these documents, answer the question.
    Documents:
    {% for doc in documents %}
        {{ doc.content }}
    {% endfor %}

    Question: {{query}}
    Answer:
    """
p = Pipeline()
p.add_component(instance=PromptBuilder(template=prompt_template), name="prompt_builder")
p.add_component(instance=OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY")), name="llm")
p.connect("prompt_builder", "llm")

question = "Where does Joe live?"
result = p.run({"prompt_builder": {"documents": documents, "query": question}})
print(result)
```

#### Changing the template at runtime (prompt engineering)

You can change the prompt template of an existing pipeline, like in this example, which reuses the pipeline `p`
and the `question` from the previous example:

```python
documents = [
    Document(content="Joe lives in Berlin", meta={"name": "doc1"}),
    Document(content="Joe is a software engineer", meta={"name": "doc2"}),
]
new_template = """
    You are a helpful assistant.
    Given these documents, answer the question.
    Documents:
    {% for doc in documents %}
        Document {{ loop.index }}:
        Document name: {{ doc.meta['name'] }}
        {{ doc.content }}
    {% endfor %}

    Question: {{ query }}
    Answer:
    """
p.run({
    "prompt_builder": {
        "documents": documents,
        "query": question,
        "template": new_template,
    },
})
```

To replace the variables in the default template when testing your prompt,
pass the new variables in the `variables` parameter.

#### Overwriting variables at runtime

To overwrite the values of variables, use `template_variables` during runtime:

```python
language_template = """
You are a helpful assistant.
Given these documents, answer the question.
Documents:
{% for doc in documents %}
    Document {{ loop.index }}:
    Document name: {{ doc.meta['name'] }}
    {{ doc.content }}
{% endfor %}

Question: {{ query }}
Please provide your answer in {{ answer_language | default('English') }}
Answer:
"""
p.run({
    "prompt_builder": {
        "documents": documents,
        "query": question,
        "template": language_template,
        "template_variables": {"answer_language": "German"},
    },
})
```

Note that `language_template` introduces the variable `answer_language`, which is not bound to any pipeline variable.
If not set otherwise, it uses its default value 'English'.
This example overwrites its value to 'German'.
Use `template_variables` to overwrite pipeline variables (such as documents) as well.

<a id="prompt_builder.PromptBuilder.__init__"></a>

#### PromptBuilder.\_\_init\_\_

```python
def __init__(template: str,
             required_variables: list[str] | Literal["*"] | None = None,
             variables: list[str] | None = None)
```

Constructs a PromptBuilder component.

**Arguments**:

- `template`: A prompt template that uses Jinja2 syntax to add variables. For example:
`"Summarize this document: {{ documents[0].content }}\nSummary:"`
It's used to render the prompt.
The variables in the default template are input for PromptBuilder and are all optional,
unless explicitly specified.
If an optional variable is not provided, it's replaced with an empty string in the rendered prompt.
- `required_variables`: List variables that must be provided as input to PromptBuilder.
If a variable listed as required is not provided, an exception is raised.
If set to "*", all variables found in the prompt are required. Optional.
- `variables`: List input variables to use in prompt templates instead of the ones inferred from the
`template` parameter. For example, to use more variables during prompt engineering than the ones present
in the default template, you can provide them here.

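The effect of leaving out an optional variable can be seen with a toy renderer. This is not Jinja2 and not PromptBuilder's code: the hypothetical `render` function below handles only plain `{{ name }}` placeholders, and it exists solely to show the documented outcome that a missing optional variable becomes an empty string.

```python
import re


def render(template: str, **variables: str) -> str:
    """Toy stand-in for Jinja2 rendering of plain `{{ name }}` placeholders."""

    def substitute(match: re.Match) -> str:
        # A variable that was not provided renders as an empty string.
        return str(variables.get(match.group(1), ""))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


template = "Translate the following context to {{ target_language }}. Context: {{ snippet }}; Translation:"

# All variables provided.
print(render(template, target_language="spanish", snippet="I can't speak spanish."))

# `target_language` omitted: its placeholder is replaced with an empty string.
print(render(template, snippet="I can't speak spanish."))
```

If such a gap in the prompt is not acceptable, listing the variable in `required_variables` turns the silent empty string into an error.
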
<a id="prompt_builder.PromptBuilder.to_dict"></a>

#### PromptBuilder.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the component.

**Returns**:

Serialized dictionary representation of the component.

<a id="prompt_builder.PromptBuilder.run"></a>

#### PromptBuilder.run

```python
@component.output_types(prompt=str)
def run(template: str | None = None,
        template_variables: dict[str, Any] | None = None,
        **kwargs)
```

Renders the prompt template with the provided variables.

It applies the template variables to render the final prompt. You can provide variables with pipeline kwargs.
To overwrite the default template, set the `template` parameter.
To overwrite pipeline kwargs, set the `template_variables` parameter.

**Arguments**:

- `template`: An optional string template to overwrite PromptBuilder's default template. If `None`, the default template
provided at initialization is used.
- `template_variables`: An optional dictionary of template variables to overwrite the pipeline variables.
- `kwargs`: Pipeline variables used for rendering the prompt.

**Raises**:

- `ValueError`: If any of the required template variables is not provided.

**Returns**:

A dictionary with the following keys:
- `prompt`: The updated prompt text after rendering the prompt template.

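The precedence between pipeline kwargs and `template_variables` described for `run` amounts to a dictionary merge in which the latter wins. The sketch below assumes that reading of the description; the helper `combine_variables` is hypothetical and is not part of the PromptBuilder API.

```python
def combine_variables(template_variables=None, **kwargs):
    """Hypothetical helper: values in `template_variables` override pipeline kwargs."""
    # Later entries win in a dict merge, so template_variables take precedence.
    return {**kwargs, **(template_variables or {})}


combined = combine_variables(
    template_variables={"answer_language": "German"},
    query="Where does Joe live?",
    answer_language="English",
)
print(combined)
```

Here `answer_language` ends up as "German", while `query` is passed through unchanged.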