builders_api.md
---
title: "Builders"
id: builders-api
description: "Extract the output of a Generator to an Answer format, and build prompts."
slug: "/builders-api"
---


## answer_builder

### AnswerBuilder

Converts a query and Generator replies into a `GeneratedAnswer` object.

AnswerBuilder parses Generator replies using custom regular expressions.
Check out the usage example below to see how it works.
Optionally, it can also take documents and metadata from the Generator to add to the `GeneratedAnswer` object.
AnswerBuilder works with both non-chat and chat Generators.

### Usage example

```python
from haystack.components.builders import AnswerBuilder

builder = AnswerBuilder(pattern="Answer: (.*)")
builder.run(query="What's the answer?", replies=["This is an argument. Answer: This is the answer."])
```

### Usage example with documents and reference pattern

```python
from haystack import Document
from haystack.components.builders import AnswerBuilder

replies = ["The capital of France is Paris [2]."]

docs = [
    Document(content="Berlin is the capital of Germany."),
    Document(content="Paris is the capital of France."),
    Document(content="Rome is the capital of Italy."),
]

# Raw string, so the regex escapes reach the regex engine unchanged
builder = AnswerBuilder(reference_pattern=r"\[(\d+)\]", return_only_referenced_documents=False)
result = builder.run(query="What is the capital of France?", replies=replies, documents=docs)["answers"][0]

print(f"Answer: {result.data}")
print("References:")
for doc in result.documents:
    if doc.meta["referenced"]:
        print(f"[{doc.meta['source_index']}] {doc.content}")
print("Other sources:")
for doc in result.documents:
    if not doc.meta["referenced"]:
        print(f"[{doc.meta['source_index']}] {doc.content}")

# Answer: The capital of France is Paris
# References:
# [2] Paris is the capital of France.
# Other sources:
# [1] Berlin is the capital of Germany.
# [3] Rome is the capital of Italy.
```

#### __init__

```python
__init__(
    pattern: str | None = None,
    reference_pattern: str | None = None,
    last_message_only: bool = False,
    *,
    return_only_referenced_documents: bool = True
)
```

Creates an instance of the AnswerBuilder component.

**Parameters:**

- **pattern** (<code>str | None</code>) – The regular expression pattern to extract the answer text from the Generator.
  If not specified, the entire response is used as the answer.
  The regular expression can have one capture group at most.
  If present, the capture group text is used as the answer.
  If no capture group is present, the whole match is used as the answer.
  Examples:
  `[^\n]+$` finds "this is an answer" in a string "this is an argument.\nthis is an answer".
  `Answer: (.*)` finds "this is an answer" in a string "this is an argument. Answer: this is an answer".
- **reference_pattern** (<code>str | None</code>) – The regular expression pattern used for parsing the document references.
  If not specified, no parsing is done, and all documents are returned.
  References need to be specified as indices of the input documents and start at [1].
  Example: `\[(\d+)\]` finds "1" in a string "this is an answer[1]".
  If this parameter is provided, each document's metadata contains a "referenced" key with a boolean value.
- **last_message_only** (<code>bool</code>) – If False (default value), all messages are used as the answer.
  If True, only the last message is used as the answer.
- **return_only_referenced_documents** (<code>bool</code>) – To be used in conjunction with `reference_pattern`.
  If True (default value), only the documents that were actually referenced in `replies` are returned.
  If False, all documents are returned.
  If `reference_pattern` is not provided, this parameter has no effect, and all documents are returned.
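
The `pattern` and `reference_pattern` rules described above can be tried out with Python's `re` module alone. The sketch below is an illustration under stated assumptions, not AnswerBuilder's actual implementation: `extract_answer` and `referenced_indices` are hypothetical helper names, and returning an empty string when nothing matches is an assumption, since this page does not document that case.

```python
import re
from typing import List, Optional


def extract_answer(reply: str, pattern: Optional[str] = None) -> str:
    """Hypothetical helper that follows the documented `pattern` rules."""
    if pattern is None:
        # No pattern: the entire reply is the answer.
        return reply
    match = re.search(pattern, reply)
    if match is None:
        # Assumption: the no-match behavior is not documented on this page.
        return ""
    # One capture group: use its text. No capture group: use the whole match.
    return match.group(1) if match.re.groups else match.group(0)


def referenced_indices(reply: str, reference_pattern: str) -> List[int]:
    """Hypothetical helper: collect the 1-based document indices cited in a reply."""
    return [int(index) for index in re.findall(reference_pattern, reply)]


print(extract_answer("this is an argument. Answer: this is an answer", r"Answer: (.*)"))
print(extract_answer("this is an argument.\nthis is an answer", r"[^\n]+$"))
print(referenced_indices("The capital of France is Paris [2].", r"\[(\d+)\]"))
```

Writing the patterns as raw strings (`r"..."`) keeps backslash escapes such as `\d` and `\[` intact when they are passed on to the regex engine.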

#### run

```python
run(
    query: str,
    replies: list[str] | list[ChatMessage],
    meta: list[dict[str, Any]] | None = None,
    documents: list[Document] | None = None,
    pattern: str | None = None,
    reference_pattern: str | None = None,
)
```

Turns the output of a Generator into `GeneratedAnswer` objects using regular expressions.

**Parameters:**

- **query** (<code>str</code>) – The input query used as the Generator prompt.
- **replies** (<code>list\[str\] | list\[ChatMessage\]</code>) – The output of the Generator. Can be a list of strings or a list of `ChatMessage` objects.
- **meta** (<code>list\[dict\[str, Any\]\] | None</code>) – The metadata returned by the Generator. If not specified, the generated answer will contain no metadata.
- **documents** (<code>list\[Document\] | None</code>) – The documents used as the Generator inputs. If specified, they are added to
  the `GeneratedAnswer` objects.
  Each Document.meta includes a "source_index" key, representing its 1-based position in the input list.
  When `reference_pattern` is provided:
  - "referenced" key is added to the Document.meta, indicating if the document was referenced in the output.
  - `return_only_referenced_documents` init parameter controls if all or only referenced documents are
    returned.
- **pattern** (<code>str | None</code>) – The regular expression pattern to extract the answer text from the Generator.
  If not specified, the entire response is used as the answer.
  The regular expression can have one capture group at most.
  If present, the capture group text is used as the answer.
  If no capture group is present, the whole match is used as the answer.
  Examples:
  `[^\n]+$` finds "this is an answer" in a string "this is an argument.\nthis is an answer".
  `Answer: (.*)` finds "this is an answer" in a string
  "this is an argument. Answer: this is an answer".
- **reference_pattern** (<code>str | None</code>) – The regular expression pattern used for parsing the document references.
  If not specified, no parsing is done, and all documents are returned.
  References need to be specified as indices of the input documents and start at [1].
  Example: `\[(\d+)\]` finds "1" in a string "this is an answer[1]".

**Returns:**

- A dictionary with the following keys:
  - `answers`: The answers received from the output of the Generator.

## chat_prompt_builder

### ChatPromptBuilder

Renders a chat prompt from a template using Jinja2 syntax.

A template can be a list of `ChatMessage` objects, or a special string, as shown in the usage examples.

It constructs prompts using static or dynamic templates, which you can update for each pipeline run.

Template variables in the template are optional unless specified otherwise.
If an optional variable isn't provided, it defaults to an empty string. Use `variables` and `required_variables`
to define input types and required variables.

### Usage examples

#### Static ChatMessage prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

template = [ChatMessage.from_user("Translate to {{ target_language }}. Context: {{ snippet }}; Translation:")]
builder = ChatPromptBuilder(template=template)
builder.run(target_language="spanish", snippet="I can't speak spanish.")
```

#### Overriding static ChatMessage template at runtime

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

template = [ChatMessage.from_user("Translate to {{ target_language }}. Context: {{ snippet }}; Translation:")]
builder = ChatPromptBuilder(template=template)
builder.run(target_language="spanish", snippet="I can't speak spanish.")

# Pass a different template to `run` to override the one set at initialization
msg = "Translate to {{ target_language }} and summarize. Context: {{ snippet }}; Summary:"
summary_template = [ChatMessage.from_user(msg)]
builder.run(target_language="spanish", snippet="I can't speak spanish.", template=summary_template)
```

#### Dynamic ChatMessage prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack import Pipeline

# no parameter init, we don't use any runtime template variables
prompt_builder = ChatPromptBuilder()
llm = OpenAIChatGenerator(model="gpt-5-mini")

pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("prompt_builder.prompt", "llm.messages")

location = "Berlin"
language = "English"
system_message = ChatMessage.from_system("You are an assistant giving information to tourists in {{language}}")
messages = [system_message, ChatMessage.from_user("Tell me about {{location}}")]

res = pipe.run(data={"prompt_builder": {"template_variables": {"location": location, "language": language},
                                        "template": messages}})
print(res)
# >> {'llm': {'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=
# "Berlin is the capital city of Germany and one of the most vibrant
# and diverse cities in Europe. Here are some key things to know...Enjoy your time exploring the vibrant and dynamic
# capital of Germany!")], _name=None, _meta={'model': 'gpt-5-mini',
# 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 27, 'completion_tokens': 681, 'total_tokens':
# 708}})]}}

# Adjacent string literals are concatenated, so the message stays a single string
messages = [
    system_message,
    ChatMessage.from_user(
        "What's the weather forecast for {{location}} in the next "
        "{{day_count}} days?"
    ),
]

res = pipe.run(data={"prompt_builder": {"template_variables": {"location": location, "day_count": "5"},
                                        "template": messages}})

print(res)
# >> {'llm': {'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=
# "Here is the weather forecast for Berlin in the next 5
# days:\n\nDay 1: Mostly cloudy with a high of 22°C (72°F) and...so it's always a good idea to check for updates
# closer to your visit.")], _name=None, _meta={'model': 'gpt-5-mini',
# 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 37, 'completion_tokens': 201,
# 'total_tokens': 238}})]}}
```

#### String prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses.image_content import ImageContent

template = """
{% message role="system" %}
You are a helpful assistant.
{% endmessage %}

{% message role="user" %}
Hello! I am {{user_name}}. What's the difference between the following images?
{% for image in images %}
{{ image | templatize_part }}
{% endfor %}
{% endmessage %}
"""

images = [ImageContent.from_file_path("test/test_files/images/apple.jpg"),
          ImageContent.from_file_path("test/test_files/images/haystack-logo.png")]

builder = ChatPromptBuilder(template=template)
builder.run(user_name="John", images=images)
```

#### __init__

```python
__init__(
    template: list[ChatMessage] | str | None = None,
    required_variables: list[str] | Literal["*"] | None = None,
    variables: list[str] | None = None,
)
```

Constructs a ChatPromptBuilder component.

**Parameters:**

- **template** (<code>list\[ChatMessage\] | str | None</code>) – A list of `ChatMessage` objects or a string template. The component looks for Jinja2 template syntax and
  renders the prompt with the provided variables. Provide the template in either
  the `init` method or the `run` method.
- **required_variables** (<code>list\[str\] | Literal['\*'] | None</code>) – List of variables that must be provided as input to ChatPromptBuilder.
  If a variable listed as required is not provided, an exception is raised.
  If set to `"*"`, all variables found in the prompt are required. Optional.
- **variables** (<code>list\[str\] | None</code>) – List of input variables to use in prompt templates instead of the ones inferred from the
  `template` parameter. For example, to use more variables during prompt engineering than the ones present
  in the default template, you can provide them here.

#### run

```python
run(
    template: list[ChatMessage] | str | None = None,
    template_variables: dict[str, Any] | None = None,
    **kwargs: dict[str, Any] | None
) -> dict[str, list[ChatMessage]]
```

Renders the prompt template with the provided variables.

It applies the template variables to render the final prompt.
You can provide variables with pipeline kwargs.
To overwrite the default template, you can set the `template` parameter.
To overwrite pipeline kwargs, you can set the `template_variables` parameter.

**Parameters:**

- **template** (<code>list\[ChatMessage\] | str | None</code>) – An optional list of `ChatMessage` objects or string template to overwrite ChatPromptBuilder's default
  template.
  If `None`, the default template provided at initialization is used.
- **template_variables** (<code>dict\[str, Any\] | None</code>) – An optional dictionary of template variables to overwrite the pipeline variables.
- **kwargs** – Pipeline variables used for rendering the prompt.

**Returns:**

- <code>dict\[str, list\[ChatMessage\]\]</code> – A dictionary with the following keys:
  - `prompt`: The updated list of `ChatMessage` objects after rendering the templates.

**Raises:**

- <code>ValueError</code> – If `template` is empty or contains elements that are not instances of `ChatMessage`.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the component.

**Returns:**

- <code>dict\[str, Any\]</code> – Serialized dictionary representation of the component.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ChatPromptBuilder
```

Deserializes this component from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary to deserialize and create the component.

**Returns:**

- <code>ChatPromptBuilder</code> – The deserialized component.

## prompt_builder

### PromptBuilder

Renders a prompt, filling in any variables, so that it can be sent to a Generator.

The prompt uses Jinja2 template syntax.
The variables in the default template are used as PromptBuilder's input and are all optional.
If they're not provided, they're replaced with an empty string in the rendered prompt.
To try out different prompts, you can replace the prompt template at runtime by
providing a template for each pipeline run invocation.

### Usage examples

#### On its own

This example uses PromptBuilder to render a prompt template and fill it with `target_language`
and `snippet`. PromptBuilder returns a prompt with the string "Translate the following context to spanish.
Context: I can't speak spanish.; Translation:".

```python
from haystack.components.builders import PromptBuilder

template = "Translate the following context to {{ target_language }}. Context: {{ snippet }}; Translation:"
builder = PromptBuilder(template=template)
builder.run(target_language="spanish", snippet="I can't speak spanish.")
```

#### In a Pipeline

This is an example of a RAG pipeline where PromptBuilder renders a custom prompt template and fills it
with the contents of the retrieved documents and a query. The rendered prompt is then sent to a Generator.

```python
from haystack import Pipeline, Document
from haystack.utils import Secret
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders.prompt_builder import PromptBuilder

# in a real world use case documents could come from a retriever, web, or any other source
documents = [Document(content="Joe lives in Berlin"), Document(content="Joe is a software engineer")]
prompt_template = """
    Given these documents, answer the question.
    Documents:
    {% for doc in documents %}
        {{ doc.content }}
    {% endfor %}

    Question: {{query}}
    Answer:
    """
p = Pipeline()
p.add_component(instance=PromptBuilder(template=prompt_template), name="prompt_builder")
p.add_component(instance=OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY")), name="llm")
p.connect("prompt_builder", "llm")

question = "Where does Joe live?"
result = p.run({"prompt_builder": {"documents": documents, "query": question}})
print(result)
```

#### Changing the template at runtime (prompt engineering)

You can change the prompt template of an existing pipeline, like in this example:

```python
# Continues from the pipeline example above: reuses `Document`, `p`, and `question`
documents = [
    Document(content="Joe lives in Berlin", meta={"name": "doc1"}),
    Document(content="Joe is a software engineer", meta={"name": "doc2"}),
]
new_template = """
    You are a helpful assistant.
    Given these documents, answer the question.
    Documents:
    {% for doc in documents %}
        Document {{ loop.index }}:
        Document name: {{ doc.meta['name'] }}
        {{ doc.content }}
    {% endfor %}

    Question: {{ query }}
    Answer:
    """
p.run({
    "prompt_builder": {
        "documents": documents,
        "query": question,
        "template": new_template,
    },
})
```

To replace the variables in the default template when testing your prompt,
pass the new variables in the `variables` parameter.

#### Overwriting variables at runtime

To overwrite the values of variables, use `template_variables` during runtime:

```python
# Continues from the examples above: reuses `p`, `documents`, and `question`
language_template = """
    You are a helpful assistant.
    Given these documents, answer the question.
    Documents:
    {% for doc in documents %}
        Document {{ loop.index }}:
        Document name: {{ doc.meta['name'] }}
        {{ doc.content }}
    {% endfor %}

    Question: {{ query }}
    Please provide your answer in {{ answer_language | default('English') }}
    Answer:
    """
p.run({
    "prompt_builder": {
        "documents": documents,
        "query": question,
        "template": language_template,
        "template_variables": {"answer_language": "German"},
    },
})
```

Note that `language_template` introduces a variable, `answer_language`, which is not bound to any pipeline variable.
Unless it is set explicitly, it uses its default value, 'English'.
This example overwrites its value with 'German'.
Use `template_variables` to overwrite pipeline variables (such as documents) as well.

#### __init__

```python
__init__(
    template: str,
    required_variables: list[str] | Literal["*"] | None = None,
    variables: list[str] | None = None,
)
```

Constructs a PromptBuilder component.

**Parameters:**

- **template** (<code>str</code>) – A prompt template that uses Jinja2 syntax to add variables. For example:
  `"Summarize this document: {{ documents[0].content }}\nSummary:"`
  It's used to render the prompt.
  The variables in the default template are input for PromptBuilder and are all optional,
  unless explicitly specified.
  If an optional variable is not provided, it's replaced with an empty string in the rendered prompt.
- **required_variables** (<code>list\[str\] | Literal['\*'] | None</code>) – List of variables that must be provided as input to PromptBuilder.
  If a variable listed as required is not provided, an exception is raised.
  If set to `"*"`, all variables found in the prompt are required. Optional.
- **variables** (<code>list\[str\] | None</code>) – List of input variables to use in prompt templates instead of the ones inferred from the
  `template` parameter. For example, to use more variables during prompt engineering than the ones present
  in the default template, you can provide them here.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the component.

**Returns:**

- <code>dict\[str, Any\]</code> – Serialized dictionary representation of the component.

#### run

```python
run(
    template: str | None = None,
    template_variables: dict[str, Any] | None = None,
    **kwargs: dict[str, Any] | None
)
```

Renders the prompt template with the provided variables.

It applies the template variables to render the final prompt. You can provide variables via pipeline kwargs.
To overwrite the default template, you can set the `template` parameter.
To overwrite pipeline kwargs, you can set the `template_variables` parameter.

**Parameters:**

- **template** (<code>str | None</code>) – An optional string template to overwrite PromptBuilder's default template. If None, the default template
  provided at initialization is used.
- **template_variables** (<code>dict\[str, Any\] | None</code>) – An optional dictionary of template variables to overwrite the pipeline variables.
- **kwargs** – Pipeline variables used for rendering the prompt.

**Returns:**

- A dictionary with the following keys:
  - `prompt`: The updated prompt text after rendering the prompt template.

**Raises:**

- <code>ValueError</code> – If any of the required template variables is not provided.
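
The interplay between optional variables, `required_variables`, and `template_variables` can be illustrated with a simplified stand-in. The sketch below does not use Haystack or Jinja2: `render_prompt` is a hypothetical function, and a small regex over plain `{{ name }}` placeholders replaces real Jinja2 rendering (no filters, loops, or attribute access). It only mirrors the rules stated on this page: missing optional variables render as empty strings, `template_variables` take precedence over pipeline kwargs, and a missing required variable raises `ValueError`.

```python
import re
from typing import Any, Dict, List, Optional, Union

# Matches simple `{{ name }}` placeholders only; real Jinja2 syntax is far richer.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(
    template: str,
    required_variables: Union[List[str], str, None] = None,
    template_variables: Optional[Dict[str, Any]] = None,
    **kwargs: Any,
) -> Dict[str, str]:
    """Hypothetical stand-in for the documented rendering rules."""
    # Values in `template_variables` take precedence over pipeline kwargs.
    values = {**kwargs, **(template_variables or {})}

    # "*" marks every placeholder found in the template as required.
    found = _PLACEHOLDER.findall(template)
    required = found if required_variables == "*" else list(required_variables or [])

    missing = [name for name in required if name not in values]
    if missing:
        raise ValueError(f"Missing required template variables: {missing}")

    # Optional variables that were not provided render as empty strings.
    prompt = _PLACEHOLDER.sub(lambda m: str(values.get(m.group(1), "")), template)
    return {"prompt": prompt}


print(render_prompt("Hi {{ name }}!", name="Joe"))
# {'prompt': 'Hi Joe!'}
```

Calling `render_prompt("{{ query }}", required_variables="*")` without a `query` argument raises `ValueError` in this sketch, which corresponds to the documented failure mode for missing required variables.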