---
title: "Builders"
id: builders-api
description: "Extract the output of a Generator to an Answer format, and build prompts."
slug: "/builders-api"
---

<a id="answer_builder"></a>

## Module answer\_builder

<a id="answer_builder.AnswerBuilder"></a>

### AnswerBuilder

Converts a query and Generator replies into a `GeneratedAnswer` object.

AnswerBuilder parses Generator replies using custom regular expressions.
Check out the usage example below to see how it works.
Optionally, it can also take documents and metadata from the Generator to add to the `GeneratedAnswer` object.
AnswerBuilder works with both non-chat and chat Generators.

### Usage example

```python
from haystack.components.builders import AnswerBuilder

builder = AnswerBuilder(pattern="Answer: (.*)")
builder.run(query="What's the answer?", replies=["This is an argument. Answer: This is the answer."])
```

### Usage example with documents and reference pattern

```python
from haystack import Document
from haystack.components.builders import AnswerBuilder

replies = ["The capital of France is Paris [2]."]

docs = [
    Document(content="Berlin is the capital of Germany."),
    Document(content="Paris is the capital of France."),
    Document(content="Rome is the capital of Italy."),
]

# A raw string keeps the regex escapes intact
builder = AnswerBuilder(reference_pattern=r"\[(\d+)\]", return_only_referenced_documents=False)
result = builder.run(query="What is the capital of France?", replies=replies, documents=docs)["answers"][0]

print(f"Answer: {result.data}")
print("References:")
for doc in result.documents:
    if doc.meta["referenced"]:
        print(f"[{doc.meta['source_index']}] {doc.content}")
print("Other sources:")
for doc in result.documents:
    if not doc.meta["referenced"]:
        print(f"[{doc.meta['source_index']}] {doc.content}")

# Answer: The capital of France is Paris
# References:
# [2] Paris is the capital of France.
# Other sources:
# [1] Berlin is the capital of Germany.
# [3] Rome is the capital of Italy.
```

<a id="answer_builder.AnswerBuilder.__init__"></a>

#### AnswerBuilder.\_\_init\_\_

```python
def __init__(pattern: Optional[str] = None,
             reference_pattern: Optional[str] = None,
             last_message_only: bool = False,
             *,
             return_only_referenced_documents: bool = True)
```

Creates an instance of the AnswerBuilder component.

**Arguments**:

- `pattern`: The regular expression pattern to extract the answer text from the Generator.
If not specified, the entire response is used as the answer.
The regular expression can have one capture group at most.
If present, the capture group text is used as the answer.
If no capture group is present, the whole match is used as the answer.
Examples:
    `[^\n]+$` finds "this is an answer" in a string "this is an argument.\nthis is an answer".
    `Answer: (.*)` finds "this is an answer" in a string "this is an argument. Answer: this is an answer".
- `reference_pattern`: The regular expression pattern used for parsing the document references.
If not specified, no parsing is done, and all documents are returned.
References need to be specified as indices of the input documents and start at [1].
Example: `\[(\d+)\]` finds "1" in a string "this is an answer[1]".
If this parameter is provided, each document's metadata contains a "referenced" key with a boolean value.
- `last_message_only`: If False (default value), all messages are used as the answer.
If True, only the last message is used as the answer.
- `return_only_referenced_documents`: To be used in conjunction with `reference_pattern`.
If True (default value), only the documents that were actually referenced in `replies` are returned.
If False, all documents are returned.
If `reference_pattern` is not provided, this parameter has no effect, and all documents are returned.

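
The capture-group rule described for `pattern` can be illustrated with Python's standard `re` module. This is a minimal sketch, not AnswerBuilder's actual implementation: `extract_answer` is a hypothetical helper, and returning an empty string when nothing matches is an assumption made only for this example.

```python
import re
from typing import Optional


def extract_answer(reply: str, pattern: Optional[str] = None) -> str:
    """Hypothetical helper (not part of Haystack) that applies the capture-group rule."""
    if pattern is None:
        # No pattern: the entire reply is the answer.
        return reply
    match = re.search(pattern, reply)
    if match is None:
        # Assumption for this sketch only: no match yields an empty answer.
        return ""
    # One capture group: use its text. No capture group: use the whole match.
    return match.group(1) if match.re.groups else match.group(0)


print(extract_answer("this is an argument.\nthis is an answer", r"[^\n]+$"))
# this is an answer
print(extract_answer("this is an argument. Answer: this is an answer", r"Answer: (.*)"))
# this is an answer
```

Both patterns from the argument description yield the same answer text, one through the whole match and one through the capture group.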
<a id="answer_builder.AnswerBuilder.run"></a>

#### AnswerBuilder.run

```python
@component.output_types(answers=list[GeneratedAnswer])
def run(query: str,
        replies: Union[list[str], list[ChatMessage]],
        meta: Optional[list[dict[str, Any]]] = None,
        documents: Optional[list[Document]] = None,
        pattern: Optional[str] = None,
        reference_pattern: Optional[str] = None)
```

Turns the output of a Generator into `GeneratedAnswer` objects using regular expressions.

**Arguments**:

- `query`: The input query used as the Generator prompt.
- `replies`: The output of the Generator. Can be a list of strings or a list of `ChatMessage` objects.
- `meta`: The metadata returned by the Generator. If not specified, the generated answer will contain no metadata.
- `documents`: The documents used as the Generator inputs. If specified, they are added to
the `GeneratedAnswer` objects.
Each Document.meta includes a "source_index" key, representing its 1-based position in the input list.
When `reference_pattern` is provided:
    - A "referenced" key is added to the Document.meta, indicating whether the document was referenced in the output.
    - The `return_only_referenced_documents` init parameter controls whether all documents or only the referenced ones are returned.
- `pattern`: The regular expression pattern to extract the answer text from the Generator.
If not specified, the entire response is used as the answer.
The regular expression can have one capture group at most.
If present, the capture group text is used as the answer.
If no capture group is present, the whole match is used as the answer.
Examples:
    `[^\n]+$` finds "this is an answer" in a string "this is an argument.\nthis is an answer".
    `Answer: (.*)` finds "this is an answer" in a string "this is an argument. Answer: this is an answer".
- `reference_pattern`: The regular expression pattern used for parsing the document references.
If not specified, no parsing is done, and all documents are returned.
References need to be specified as indices of the input documents and start at [1].
Example: `\[(\d+)\]` finds "1" in a string "this is an answer[1]".

**Returns**:

A dictionary with the following keys:
- `answers`: The answers received from the output of the Generator.

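
The way 1-based references map onto the input documents can be sketched with the standard library alone. The helper below is hypothetical and uses plain dictionaries instead of `Document` objects. It only mirrors the documented "source_index" and "referenced" keys and the effect of `return_only_referenced_documents` (called `only_referenced` here), so treat it as an illustration under those assumptions rather than the component's code.

```python
import re
from typing import Any


def tag_references(
    reply: str,
    contents: list[str],
    reference_pattern: str,
    only_referenced: bool = True,
) -> list[dict[str, Any]]:
    """Hypothetical helper (not Haystack code) mirroring the documented metadata keys."""
    # References are 1-based: "[2]" cites the second input document.
    cited = {int(index) for index in re.findall(reference_pattern, reply)}
    tagged = []
    for position, content in enumerate(contents, start=1):
        meta = {"source_index": position, "referenced": position in cited}
        if meta["referenced"] or not only_referenced:
            tagged.append({"content": content, "meta": meta})
    return tagged


contents = [
    "Berlin is the capital of Germany.",
    "Paris is the capital of France.",
    "Rome is the capital of Italy.",
]
reply = "The capital of France is Paris [2]."

only_cited = tag_references(reply, contents, r"\[(\d+)\]")
everything = tag_references(reply, contents, r"\[(\d+)\]", only_referenced=False)
print([doc["meta"]["source_index"] for doc in only_cited])  # [2]
print([doc["meta"]["referenced"] for doc in everything])  # [False, True, False]
```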
<a id="prompt_builder"></a>

## Module prompt\_builder

<a id="prompt_builder.PromptBuilder"></a>

### PromptBuilder

Renders a prompt by filling in its variables so that it can be sent to a Generator.

The prompt uses Jinja2 template syntax.
The variables in the default template are used as PromptBuilder's input and are all optional.
If they're not provided, they're replaced with an empty string in the rendered prompt.
To try out different prompts, you can replace the prompt template at runtime by
providing a template for each pipeline run invocation.

### Usage examples

#### On its own

This example uses PromptBuilder to render a prompt template and fill it with `target_language`
and `snippet`. The returned dictionary holds, under the `prompt` key, the string
"Translate the following context to Spanish. Context: I can't speak Spanish.; Translation:".

```python
from haystack.components.builders import PromptBuilder

template = "Translate the following context to {{ target_language }}. Context: {{ snippet }}; Translation:"
builder = PromptBuilder(template=template)
builder.run(target_language="Spanish", snippet="I can't speak Spanish.")
```

#### In a Pipeline

This is an example of a RAG pipeline where PromptBuilder renders a custom prompt template and fills it
with the contents of the retrieved documents and a query. The rendered prompt is then sent to a Generator.

```python
from haystack import Pipeline, Document
from haystack.utils import Secret
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders.prompt_builder import PromptBuilder

# in a real world use case documents could come from a retriever, web, or any other source
documents = [Document(content="Joe lives in Berlin"), Document(content="Joe is a software engineer")]
prompt_template = """
    Given these documents, answer the question.
    Documents:
    {% for doc in documents %}
        {{ doc.content }}
    {% endfor %}

    Question: {{query}}
    Answer:
    """
p = Pipeline()
p.add_component(instance=PromptBuilder(template=prompt_template), name="prompt_builder")
p.add_component(instance=OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY")), name="llm")
p.connect("prompt_builder", "llm")

question = "Where does Joe live?"
result = p.run({"prompt_builder": {"documents": documents, "query": question}})
print(result)
```

#### Changing the template at runtime (prompt engineering)

You can change the prompt template of an existing pipeline, like in this example,
which reuses `p` and `question` from the pipeline above:

```python
documents = [
    Document(content="Joe lives in Berlin", meta={"name": "doc1"}),
    Document(content="Joe is a software engineer", meta={"name": "doc2"}),
]
new_template = """
    You are a helpful assistant.
    Given these documents, answer the question.
    Documents:
    {% for doc in documents %}
        Document {{ loop.index }}:
        Document name: {{ doc.meta['name'] }}
        {{ doc.content }}
    {% endfor %}

    Question: {{ query }}
    Answer:
    """
p.run({
    "prompt_builder": {
        "documents": documents,
        "query": question,
        "template": new_template,
    },
})
```

To replace the variables in the default template when testing your prompt,
pass the new variables in the `variables` init parameter.

#### Overwriting variables at runtime

To overwrite the values of variables, use `template_variables` during runtime:

```python
language_template = """
You are a helpful assistant.
Given these documents, answer the question.
Documents:
{% for doc in documents %}
    Document {{ loop.index }}:
    Document name: {{ doc.meta['name'] }}
    {{ doc.content }}
{% endfor %}

Question: {{ query }}
Please provide your answer in {{ answer_language | default('English') }}
Answer:
"""
p.run({
    "prompt_builder": {
        "documents": documents,
        "query": question,
        "template": language_template,
        "template_variables": {"answer_language": "German"},
    },
})
```

Note that `language_template` introduces the variable `answer_language`, which is not bound to any pipeline variable.
If not set otherwise, it uses its default value 'English'.
This example overwrites its value to 'German'.
Use `template_variables` to overwrite pipeline variables (such as documents) as well.

<a id="prompt_builder.PromptBuilder.__init__"></a>

#### PromptBuilder.\_\_init\_\_

```python
def __init__(template: str,
             required_variables: Optional[Union[list[str],
                                                Literal["*"]]] = None,
             variables: Optional[list[str]] = None)
```

Constructs a PromptBuilder component.

**Arguments**:

- `template`: A prompt template that uses Jinja2 syntax to add variables. For example:
`"Summarize this document: {{ documents[0].content }}\nSummary:"`
It's used to render the prompt.
The variables in the default template are input for PromptBuilder and are all optional,
unless explicitly specified.
If an optional variable is not provided, it's replaced with an empty string in the rendered prompt.
- `required_variables`: An optional list of variables that must be provided as input to PromptBuilder.
If a variable listed as required is not provided, an exception is raised.
If set to "*", all variables found in the prompt are required.
- `variables`: A list of input variables to use in prompt templates instead of the ones inferred from the
`template` parameter. For example, to use more variables during prompt engineering than the ones present
in the default template, you can provide them here.

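
The interplay between `required_variables` and the variables inferred from the template can be sketched without Haystack. The helpers below are hypothetical. `infer_variables` uses a regular expression that only recognizes flat `{{ name }}` placeholders, which is a deliberate simplification of real Jinja2 parsing, and the error message is invented for this example.

```python
import re
from typing import Any, Optional, Union


def infer_variables(template: str) -> set[str]:
    """Collect names from flat `{{ name }}` placeholders (a simplification of Jinja2 parsing)."""
    return set(re.findall(r"\{\{\s*([A-Za-z_]\w*)", template))


def check_required(
    template: str,
    required_variables: Optional[Union[list[str], str]],
    provided: dict[str, Any],
) -> None:
    """Raise ValueError when a required variable is missing from `provided`."""
    if required_variables is None:
        return  # every variable is optional
    # "*" means that all variables found in the template are required.
    required = infer_variables(template) if required_variables == "*" else set(required_variables)
    missing = sorted(required - set(provided))
    if missing:
        raise ValueError(f"Missing required input variables: {', '.join(missing)}")


template = "Translate the following context to {{ target_language }}. Context: {{ snippet }}; Translation:"

check_required(template, None, {})  # passes: nothing is required
check_required(template, ["snippet"], {"snippet": "Hello"})  # passes: the required variable is present
try:
    check_required(template, "*", {"snippet": "Hello"})
except ValueError as error:
    print(error)  # Missing required input variables: target_language
```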
<a id="prompt_builder.PromptBuilder.to_dict"></a>

#### PromptBuilder.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the component.

**Returns**:

Serialized dictionary representation of the component.

<a id="prompt_builder.PromptBuilder.run"></a>

#### PromptBuilder.run

```python
@component.output_types(prompt=str)
def run(template: Optional[str] = None,
        template_variables: Optional[dict[str, Any]] = None,
        **kwargs)
```

Renders the prompt template with the provided variables.

It applies the template variables to render the final prompt. You can provide variables via pipeline kwargs.
To overwrite the default template, you can set the `template` parameter.
To overwrite pipeline kwargs, you can set the `template_variables` parameter.

**Arguments**:

- `template`: An optional string template to overwrite PromptBuilder's default template. If None, the default template
provided at initialization is used.
- `template_variables`: An optional dictionary of template variables to overwrite the pipeline variables.
- `kwargs`: Pipeline variables used for rendering the prompt.

**Raises**:

- `ValueError`: If any of the required template variables is not provided.

**Returns**:

A dictionary with the following keys:
- `prompt`: The updated prompt text after rendering the prompt template.

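
The precedence rules described for `run` (a runtime `template` replaces the default one, and `template_variables` win over pipeline kwargs) can be sketched as follows. This is an illustration only: `run_sketch`, `combine_variables`, and `render_simple` are hypothetical names, and the regex-based renderer handles nothing beyond flat `{{ name }}` placeholders, unlike the Jinja2 engine the component uses.

```python
import re
from typing import Any, Optional


def combine_variables(kwargs: dict[str, Any], template_variables: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Merge inputs so that `template_variables` win over pipeline kwargs."""
    return {**kwargs, **(template_variables or {})}


def render_simple(template: str, variables: dict[str, Any]) -> str:
    """Fill flat `{{ name }}` placeholders; names without a value become empty strings."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: str(variables.get(m.group(1), "")), template)


def run_sketch(
    default_template: str,
    template: Optional[str] = None,
    template_variables: Optional[dict[str, Any]] = None,
    **kwargs: Any,
) -> dict[str, str]:
    """Hypothetical stand-in for `run`: pick the template, merge the variables, render."""
    chosen = template if template is not None else default_template
    return {"prompt": render_simple(chosen, combine_variables(kwargs, template_variables))}


default = "Question: {{ query }} Answer in {{ answer_language }}."

overridden = run_sketch(
    default,
    query="Where does Joe live?",
    answer_language="English",
    template_variables={"answer_language": "German"},
)
print(overridden)  # {'prompt': 'Question: Where does Joe live? Answer in German.'}

swapped = run_sketch(default, template="Q: {{ query }} {{ hint }}", query="Where does Joe live?")
print(swapped)  # {'prompt': 'Q: Where does Joe live? '}
```

In the second call, `hint` has no value, so it renders as an empty string and leaves a trailing space.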
<a id="chat_prompt_builder"></a>

## Module chat\_prompt\_builder

<a id="chat_prompt_builder.ChatPromptBuilder"></a>

### ChatPromptBuilder

Renders a chat prompt from a template using Jinja2 syntax.

A template can be a list of `ChatMessage` objects, or a string that wraps each message in
`{% message %}` and `{% endmessage %}` tags, as shown in the usage examples.

It constructs prompts using static or dynamic templates, which you can update for each pipeline run.

Template variables in the template are optional unless specified otherwise.
If an optional variable isn't provided, it defaults to an empty string. Use `variables` and `required_variables`
to define input types and required variables.

### Usage examples

#### Static ChatMessage prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

template = [ChatMessage.from_user("Translate to {{ target_language }}. Context: {{ snippet }}; Translation:")]
builder = ChatPromptBuilder(template=template)
builder.run(target_language="Spanish", snippet="I can't speak Spanish.")
```

#### Overriding static ChatMessage template at runtime

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

template = [ChatMessage.from_user("Translate to {{ target_language }}. Context: {{ snippet }}; Translation:")]
builder = ChatPromptBuilder(template=template)
builder.run(target_language="Spanish", snippet="I can't speak Spanish.")

# Pass a different template for this call only
msg = "Translate to {{ target_language }} and summarize. Context: {{ snippet }}; Summary:"
summary_template = [ChatMessage.from_user(msg)]
builder.run(target_language="Spanish", snippet="I can't speak Spanish.", template=summary_template)
```

#### Dynamic ChatMessage prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack import Pipeline
from haystack.utils import Secret

# no parameter init, we don't use any runtime template variables
prompt_builder = ChatPromptBuilder()
llm = OpenAIChatGenerator(api_key=Secret.from_token("<your-api-key>"))

pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("prompt_builder.prompt", "llm.messages")

location = "Berlin"
language = "English"
system_message = ChatMessage.from_system("You are an assistant giving information to tourists in {{language}}")
messages = [system_message, ChatMessage.from_user("Tell me about {{location}}")]

res = pipe.run(data={"prompt_builder": {"template_variables": {"location": location, "language": language},
                                    "template": messages}})
print(res)

# >> {'llm': {'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=
# "Berlin is the capital city of Germany and one of the most vibrant
# and diverse cities in Europe. Here are some key things to know...Enjoy your time exploring the vibrant and dynamic
# capital of Germany!")], _name=None, _meta={'model': 'gpt-5-mini',
# 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 27, 'completion_tokens': 681, 'total_tokens':
# 708}})]}}

messages = [
    system_message,
    ChatMessage.from_user("What's the weather forecast for {{location}} in the next {{day_count}} days?"),
]

res = pipe.run(data={"prompt_builder": {"template_variables": {"location": location, "day_count": "5"},
                                    "template": messages}})

print(res)
# >> {'llm': {'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=
# "Here is the weather forecast for Berlin in the next 5
# days:\n\nDay 1: Mostly cloudy with a high of 22°C (72°F) and...so it's always a good idea to check for updates
# closer to your visit.")], _name=None, _meta={'model': 'gpt-5-mini',
# 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 37, 'completion_tokens': 201,
# 'total_tokens': 238}})]}}
```

#### String prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses.image_content import ImageContent

template = """
{% message role="system" %}
You are a helpful assistant.
{% endmessage %}

{% message role="user" %}
Hello! I am {{user_name}}. What's the difference between the following images?
{% for image in images %}
{{ image | templatize_part }}
{% endfor %}
{% endmessage %}
"""

images = [ImageContent.from_file_path("apple.jpg"), ImageContent.from_file_path("orange.jpg")]

builder = ChatPromptBuilder(template=template)
builder.run(user_name="John", images=images)
```

<a id="chat_prompt_builder.ChatPromptBuilder.__init__"></a>

#### ChatPromptBuilder.\_\_init\_\_

```python
def __init__(template: Optional[Union[list[ChatMessage], str]] = None,
             required_variables: Optional[Union[list[str],
                                                Literal["*"]]] = None,
             variables: Optional[list[str]] = None)
```

Constructs a ChatPromptBuilder component.

**Arguments**:

- `template`: A list of `ChatMessage` objects or a string template. The component looks for Jinja2 template syntax and
renders the prompt with the provided variables. Provide the template in either
the `init` method or the `run` method.
- `required_variables`: An optional list of variables that must be provided as input to ChatPromptBuilder.
If a variable listed as required is not provided, an exception is raised.
If set to "*", all variables found in the prompt are required.
- `variables`: A list of input variables to use in prompt templates instead of the ones inferred from the
`template` parameter. For example, to use more variables during prompt engineering than the ones present
in the default template, you can provide them here.

<a id="chat_prompt_builder.ChatPromptBuilder.run"></a>

#### ChatPromptBuilder.run

```python
@component.output_types(prompt=list[ChatMessage])
def run(template: Optional[Union[list[ChatMessage], str]] = None,
        template_variables: Optional[dict[str, Any]] = None,
        **kwargs)
```

Renders the prompt template with the provided variables.

It applies the template variables to render the final prompt. You can provide variables with pipeline kwargs.
To overwrite the default template, you can set the `template` parameter.
To overwrite pipeline kwargs, you can set the `template_variables` parameter.

**Arguments**:

- `template`: An optional list of `ChatMessage` objects or string template to overwrite ChatPromptBuilder's default
template.
If `None`, the default template provided at initialization is used.
- `template_variables`: An optional dictionary of template variables to overwrite the pipeline variables.
- `kwargs`: Pipeline variables used for rendering the prompt.

**Raises**:

- `ValueError`: If `template` is empty or contains elements that are not instances of `ChatMessage`.

**Returns**:

A dictionary with the following keys:
- `prompt`: The updated list of `ChatMessage` objects after rendering the templates.

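
The overall shape of `run` for a list template (validate the list, render each message, return the result under `prompt`) can be sketched with a stand-in message type. Everything here is hypothetical: `Message` is a simplified replacement for `ChatMessage`, the renderer only fills flat `{{ name }}` placeholders, and rendering every role the same way is an assumption of this sketch, not a statement about the component.

```python
import re
from dataclasses import dataclass, replace
from typing import Any


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for `ChatMessage`, reduced to a role and a text."""

    role: str
    text: str


def render_messages(template: list[Message], variables: dict[str, Any]) -> dict[str, list[Message]]:
    """Render each message text and return the new list under the `prompt` key."""
    if not template or not all(isinstance(item, Message) for item in template):
        raise ValueError("The template must be a non-empty list of Message objects.")

    def fill(text: str) -> str:
        # Missing variables render as empty strings, as described for optional variables.
        return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: str(variables.get(m.group(1), "")), text)

    # `replace` builds new messages, so the template itself stays reusable.
    return {"prompt": [replace(message, text=fill(message.text)) for message in template]}


chat_template = [
    Message("system", "You are an assistant giving information to tourists in {{language}}"),
    Message("user", "Tell me about {{location}}"),
]
result = render_messages(chat_template, {"location": "Berlin", "language": "English"})
for message in result["prompt"]:
    print(f"{message.role}: {message.text}")
# system: You are an assistant giving information to tourists in English
# user: Tell me about Berlin
```

Because the template list is left unchanged, the same template can be rendered again with different variables on the next run.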
<a id="chat_prompt_builder.ChatPromptBuilder.to_dict"></a>

#### ChatPromptBuilder.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the component.

**Returns**:

Serialized dictionary representation of the component.

<a id="chat_prompt_builder.ChatPromptBuilder.from_dict"></a>

#### ChatPromptBuilder.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "ChatPromptBuilder"
```

Deserializes this component from a dictionary.

**Arguments**:

- `data`: The dictionary to deserialize and create the component.

**Returns**:

The deserialized component.