---
title: "Builders"
id: builders-api
description: "Extract the output of a Generator to an Answer format, and build prompts."
slug: "/builders-api"
---


## answer_builder

### AnswerBuilder

Converts a query and Generator replies into a `GeneratedAnswer` object.

AnswerBuilder parses Generator replies using custom regular expressions.
Check out the usage example below to see how it works.
Optionally, it can also take documents and metadata from the Generator to add to the `GeneratedAnswer` object.
AnswerBuilder works with both non-chat and chat Generators.

### Usage example

```python
from haystack.components.builders import AnswerBuilder

builder = AnswerBuilder(pattern="Answer: (.*)")
builder.run(query="What's the answer?", replies=["This is an argument. Answer: This is the answer."])
```

### Usage example with documents and reference pattern

```python
from haystack import Document
from haystack.components.builders import AnswerBuilder

replies = ["The capital of France is Paris [2]."]

docs = [
    Document(content="Berlin is the capital of Germany."),
    Document(content="Paris is the capital of France."),
    Document(content="Rome is the capital of Italy."),
]

# Raw string, so the regex escapes reach the regex engine unchanged
builder = AnswerBuilder(reference_pattern=r"\[(\d+)\]", return_only_referenced_documents=False)
result = builder.run(query="What is the capital of France?", replies=replies, documents=docs)["answers"][0]

print(f"Answer: {result.data}")
print("References:")
for doc in result.documents:
    if doc.meta["referenced"]:
        print(f"[{doc.meta['source_index']}] {doc.content}")
print("Other sources:")
for doc in result.documents:
    if not doc.meta["referenced"]:
        print(f"[{doc.meta['source_index']}] {doc.content}")

# >> Answer: The capital of France is Paris
# >> References:
# >> [2] Paris is the capital of France.
# >> Other sources:
# >> [1] Berlin is the capital of Germany.
# >> [3] Rome is the capital of Italy.
```

#### __init__

```python
__init__(
    pattern: str | None = None,
    reference_pattern: str | None = None,
    last_message_only: bool = False,
    *,
    return_only_referenced_documents: bool = True
) -> None
```

Creates an instance of the AnswerBuilder component.

**Parameters:**

- **pattern** (<code>str | None</code>) – The regular expression pattern to extract the answer text from the Generator.
  If not specified, the entire response is used as the answer.
  The regular expression can have one capture group at most.
  If present, the capture group text
  is used as the answer. If no capture group is present, the whole match is used as the answer.
  Examples:
  `[^\n]+$` finds "this is an answer" in a string "this is an argument.\\nthis is an answer".
  `Answer: (.*)` finds "this is an answer" in a string "this is an argument. Answer: this is an answer".
- **reference_pattern** (<code>str | None</code>) – The regular expression pattern used for parsing the document references.
  If not specified, no parsing is done, and all documents are returned.
  References need to be specified as indices of the input documents and start at [1].
  Example: `\[(\d+)\]` finds "1" in a string "this is an answer[1]".
  If this parameter is provided, documents metadata will contain a "referenced" key with a boolean value.
- **last_message_only** (<code>bool</code>) – If False (default value), all messages are used as the answer.
  If True, only the last message is used as the answer.
- **return_only_referenced_documents** (<code>bool</code>) – To be used in conjunction with `reference_pattern`.
  If True (default value), only the documents that were actually referenced in `replies` are returned.
  If False, all documents are returned.
  If `reference_pattern` is not provided, this parameter has no effect, and all documents are returned.

#### run

```python
run(
    query: str,
    replies: list[str] | list[ChatMessage],
    meta: list[dict[str, Any]] | None = None,
    documents: list[Document] | None = None,
    pattern: str | None = None,
    reference_pattern: str | None = None,
) -> dict[str, Any]
```

Turns the output of a Generator into `GeneratedAnswer` objects using regular expressions.

**Parameters:**

- **query** (<code>str</code>) – The input query used as the Generator prompt.
- **replies** (<code>list\[str\] | list\[ChatMessage\]</code>) – The output of the Generator. Can be a list of strings or a list of `ChatMessage` objects.
- **meta** (<code>list\[dict\[str, Any\]\] | None</code>) – The metadata returned by the Generator. If not specified, the generated answer will contain no metadata.
- **documents** (<code>list\[Document\] | None</code>) – The documents used as the Generator inputs. If specified, they are added to
  the `GeneratedAnswer` objects.
  Each Document.meta includes a "source_index" key, representing its 1-based position in the input list.
  When `reference_pattern` is provided:
  - "referenced" key is added to the Document.meta, indicating if the document was referenced in the output.
  - `return_only_referenced_documents` init parameter controls if all or only referenced documents are
    returned.
- **pattern** (<code>str | None</code>) – The regular expression pattern to extract the answer text from the Generator.
  If not specified, the entire response is used as the answer.
  The regular expression can have one capture group at most.
  If present, the capture group text
  is used as the answer. If no capture group is present, the whole match is used as the answer.
  Examples:
  `[^\n]+$` finds "this is an answer" in a string "this is an argument.\\nthis is an answer".
  `Answer: (.*)` finds "this is an answer" in a string
  "this is an argument. Answer: this is an answer".
- **reference_pattern** (<code>str | None</code>) – The regular expression pattern used for parsing the document references.
  If not specified, no parsing is done, and all documents are returned.
  References need to be specified as indices of the input documents and start at [1].
  Example: `\[(\d+)\]` finds "1" in a string "this is an answer[1]".

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary with the following keys:
  - `answers`: The answers received from the output of the Generator.

## chat_prompt_builder

### ChatPromptBuilder

Renders a chat prompt from a template using Jinja2 syntax.

A template can be a list of `ChatMessage` objects, or a special string, as shown in the usage examples.

It constructs prompts using static or dynamic templates, which you can update for each pipeline run.

Variables in the template are optional unless specified otherwise.
If an optional variable isn't provided, it defaults to an empty string. Use `variables` and `required_variables`
to define input types and required variables.

### Usage examples

#### Static ChatMessage prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

template = [ChatMessage.from_user("Translate to {{ target_language }}. Context: {{ snippet }}; Translation:")]
builder = ChatPromptBuilder(template=template)
builder.run(target_language="spanish", snippet="I can't speak spanish.")
```

#### Overriding static ChatMessage template at runtime

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

template = [ChatMessage.from_user("Translate to {{ target_language }}. Context: {{ snippet }}; Translation:")]
builder = ChatPromptBuilder(template=template)
builder.run(target_language="spanish", snippet="I can't speak spanish.")

# Passing `template` to run() replaces the template given at init for this call
msg = "Translate to {{ target_language }} and summarize. Context: {{ snippet }}; Summary:"
summary_template = [ChatMessage.from_user(msg)]
builder.run(target_language="spanish", snippet="I can't speak spanish.", template=summary_template)
```

#### Dynamic ChatMessage prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack import Pipeline

# no parameter init, we don't use any runtime template variables
prompt_builder = ChatPromptBuilder()
llm = OpenAIChatGenerator(model="gpt-5-mini")

pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("prompt_builder.prompt", "llm.messages")

location = "Berlin"
language = "English"
system_message = ChatMessage.from_system("You are an assistant giving information to tourists in {{language}}")
messages = [system_message, ChatMessage.from_user("Tell me about {{location}}")]

res = pipe.run(data={"prompt_builder": {"template_variables": {"location": location, "language": language},
                                        "template": messages}})
print(res)
# >> {'llm': {'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=
# "Berlin is the capital city of Germany and one of the most vibrant
# and diverse cities in Europe. Here are some key things to know...Enjoy your time exploring the vibrant and dynamic
# capital of Germany!")], _name=None, _meta={'model': 'gpt-5-mini',
# 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 27, 'completion_tokens': 681, 'total_tokens':
# 708}})]}}

messages = [system_message, ChatMessage.from_user("What's the weather forecast for {{location}} in the next {{day_count}} days?")]

res = pipe.run(data={"prompt_builder": {"template_variables": {"location": location, "day_count": "5"},
                                        "template": messages}})

print(res)
# >> {'llm': {'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=
# "Here is the weather forecast for Berlin in the next 5
# days:\n\nDay 1: Mostly cloudy with a high of 22°C (72°F) and...so it's always a good idea to check for updates
# closer to your visit.")], _name=None, _meta={'model': 'gpt-5-mini',
# 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 37, 'completion_tokens': 201,
# 'total_tokens': 238}})]}}
```

#### String prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses.image_content import ImageContent

template = """
{% message role="system" %}
You are a helpful assistant.
{% endmessage %}

{% message role="user" %}
Hello! I am {{user_name}}. What's the difference between the following images?
{% for image in images %}
{{ image | templatize_part }}
{% endfor %}
{% endmessage %}
"""

images = [ImageContent.from_file_path("test/test_files/images/apple.jpg"),
          ImageContent.from_file_path("test/test_files/images/haystack-logo.png")]

builder = ChatPromptBuilder(template=template)
builder.run(user_name="John", images=images)
```

#### __init__

```python
__init__(
    template: list[ChatMessage] | str | None = None,
    required_variables: list[str] | Literal["*"] | None = None,
    variables: list[str] | None = None,
) -> None
```

Constructs a ChatPromptBuilder component.

**Parameters:**

- **template** (<code>list\[ChatMessage\] | str | None</code>) – A list of `ChatMessage` objects or a string template. The component looks for Jinja2 template syntax and
  renders the prompt with the provided variables. Provide the template in either
  the `init` method or the `run` method.
- **required_variables** (<code>list\[str\] | Literal['\*'] | None</code>) – A list of variables that must be provided as input to ChatPromptBuilder.
  If a variable listed as required is not provided, an exception is raised.
  If set to `"*"`, all variables found in the prompt are required. Optional.
- **variables** (<code>list\[str\] | None</code>) – A list of input variables to use in prompt templates instead of the ones inferred from the
  `template` parameter. For example, to use more variables during prompt engineering than the ones present
  in the default template, you can provide them here.

#### run

```python
run(
    template: list[ChatMessage] | str | None = None,
    template_variables: dict[str, Any] | None = None,
    **kwargs: Any
) -> dict[str, list[ChatMessage]]
```

Renders the prompt template with the provided variables.

It applies the template variables to render the final prompt.
You can provide variables with pipeline kwargs.
To overwrite the default template, you can set the `template` parameter.
To overwrite pipeline kwargs, you can set the `template_variables` parameter.

**Parameters:**

- **template** (<code>list\[ChatMessage\] | str | None</code>) – An optional list of `ChatMessage` objects or string template to overwrite ChatPromptBuilder's default
  template.
  If `None`, the default template provided at initialization is used.
- **template_variables** (<code>dict\[str, Any\] | None</code>) – An optional dictionary of template variables to overwrite the pipeline variables.
- **kwargs** (<code>Any</code>) – Pipeline variables used for rendering the prompt.

**Returns:**

- <code>dict\[str, list\[ChatMessage\]\]</code> – A dictionary with the following keys:
  - `prompt`: The updated list of `ChatMessage` objects after rendering the templates.

**Raises:**

- <code>ValueError</code> – If `template` is empty or contains elements that are not instances of `ChatMessage`.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the component.

**Returns:**

- <code>dict\[str, Any\]</code> – Serialized dictionary representation of the component.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ChatPromptBuilder
```

Deserializes the component from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary to deserialize the component from.

**Returns:**

- <code>ChatPromptBuilder</code> – The deserialized component.

## prompt_builder

### PromptBuilder

Renders a prompt, filling in any variables, so that it can be sent to a Generator.

The prompt uses Jinja2 template syntax.
The variables in the default template are used as PromptBuilder's input and are all optional.
If they're not provided, they're replaced with an empty string in the rendered prompt.
To try out different prompts, you can replace the prompt template at runtime by
providing a template for each pipeline run invocation.

### Usage examples

#### On its own

This example uses PromptBuilder to render a prompt template and fill it with `target_language`
and `snippet`. PromptBuilder returns a prompt with the string "Translate the following context to spanish.
Context: I can't speak spanish.; Translation:".

```python
from haystack.components.builders import PromptBuilder

template = "Translate the following context to {{ target_language }}. Context: {{ snippet }}; Translation:"
builder = PromptBuilder(template=template)
builder.run(target_language="spanish", snippet="I can't speak spanish.")
```

#### In a Pipeline

This is an example of a RAG pipeline where PromptBuilder renders a custom prompt template and fills it
with the contents of the retrieved documents and a query. The rendered prompt is then sent to a Generator.

```python
from haystack import Pipeline, Document
from haystack.utils import Secret
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders.prompt_builder import PromptBuilder

# in a real world use case documents could come from a retriever, web, or any other source
documents = [Document(content="Joe lives in Berlin"), Document(content="Joe is a software engineer")]
prompt_template = """
Given these documents, answer the question.
Documents:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}

Question: {{query}}
Answer:
"""
p = Pipeline()
p.add_component(instance=PromptBuilder(template=prompt_template), name="prompt_builder")
p.add_component(instance=OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY")), name="llm")
p.connect("prompt_builder", "llm")

question = "Where does Joe live?"
result = p.run({"prompt_builder": {"documents": documents, "query": question}})
print(result)
```

#### Changing the template at runtime (prompt engineering)

You can change the prompt template of an existing pipeline, like in this example:

```python
# Reuses `p`, `question`, and the `Document` import from the pipeline example above
documents = [
    Document(content="Joe lives in Berlin", meta={"name": "doc1"}),
    Document(content="Joe is a software engineer", meta={"name": "doc2"}),
]
new_template = """
You are a helpful assistant.
Given these documents, answer the question.
Documents:
{% for doc in documents %}
Document {{ loop.index }}:
Document name: {{ doc.meta['name'] }}
{{ doc.content }}
{% endfor %}

Question: {{ query }}
Answer:
"""
p.run({
    "prompt_builder": {
        "documents": documents,
        "query": question,
        "template": new_template,
    },
})
```

To replace the variables in the default template when testing your prompt,
pass the new variables in the `variables` parameter.

#### Overwriting variables at runtime

To overwrite the values of variables, use `template_variables` during runtime:

```python
# Reuses `p`, `documents`, and `question` from the examples above
language_template = """
You are a helpful assistant.
Given these documents, answer the question.
Documents:
{% for doc in documents %}
Document {{ loop.index }}:
Document name: {{ doc.meta['name'] }}
{{ doc.content }}
{% endfor %}

Question: {{ query }}
Please provide your answer in {{ answer_language | default('English') }}
Answer:
"""
p.run({
    "prompt_builder": {
        "documents": documents,
        "query": question,
        "template": language_template,
        "template_variables": {"answer_language": "German"},
    },
})
```

Note that `language_template` introduces the variable `answer_language`, which is not bound to any pipeline variable.
Unless set otherwise, it uses its default value 'English'.
This example overwrites its value to 'German'.
Use `template_variables` to overwrite pipeline variables (such as documents) as well.

#### __init__

```python
__init__(
    template: str,
    required_variables: list[str] | Literal["*"] | None = None,
    variables: list[str] | None = None,
) -> None
```

Constructs a PromptBuilder component.

**Parameters:**

- **template** (<code>str</code>) – A prompt template that uses Jinja2 syntax to add variables. For example:
  `"Summarize this document: {{ documents[0].content }}\nSummary:"`
  It's used to render the prompt.
  The variables in the default template are input for PromptBuilder and are all optional,
  unless explicitly specified.
  If an optional variable is not provided, it's replaced with an empty string in the rendered prompt.
- **required_variables** (<code>list\[str\] | Literal['\*'] | None</code>) – A list of variables that must be provided as input to PromptBuilder.
  If a variable listed as required is not provided, an exception is raised.
  If set to `"*"`, all variables found in the prompt are required. Optional.
- **variables** (<code>list\[str\] | None</code>) – List input variables to use in prompt templates instead of the ones inferred from the
  `template` parameter. For example, to use more variables during prompt engineering than the ones present
  in the default template, you can provide them here.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the component.

**Returns:**

- <code>dict\[str, Any\]</code> – Serialized dictionary representation of the component.

#### run

```python
run(
    template: str | None = None,
    template_variables: dict[str, Any] | None = None,
    **kwargs: Any
) -> dict[str, Any]
```

Renders the prompt template with the provided variables.

It applies the template variables to render the final prompt. You can provide variables via pipeline kwargs.
In order to overwrite the default template, you can set the `template` parameter.
In order to overwrite pipeline kwargs, you can set the `template_variables` parameter.

**Parameters:**

- **template** (<code>str | None</code>) – An optional string template to overwrite PromptBuilder's default template. If None, the default template
  provided at initialization is used.
- **template_variables** (<code>dict\[str, Any\] | None</code>) – An optional dictionary of template variables to overwrite the pipeline variables.
- **kwargs** (<code>Any</code>) – Pipeline variables used for rendering the prompt.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary with the following keys:
  - `prompt`: The updated prompt text after rendering the prompt template.

**Raises:**

- <code>ValueError</code> – If any of the required template variables is not provided.