builders_api.md
---
title: "Builders"
id: builders-api
description: "Extract the output of a Generator to an Answer format, and build prompts."
slug: "/builders-api"
---

<a id="answer_builder"></a>

## Module answer\_builder

<a id="answer_builder.AnswerBuilder"></a>

### AnswerBuilder

Converts a query and Generator replies into a `GeneratedAnswer` object.

AnswerBuilder parses Generator replies using custom regular expressions.
Check out the usage examples below to see how it works.
Optionally, it can also take documents and metadata from the Generator to add to the `GeneratedAnswer` object.
AnswerBuilder works with both non-chat and chat Generators.

### Usage example

```python
from haystack.components.builders import AnswerBuilder

builder = AnswerBuilder(pattern="Answer: (.*)")
builder.run(query="What's the answer?", replies=["This is an argument. Answer: This is the answer."])
```

### Usage example with documents and reference pattern

```python
from haystack import Document
from haystack.components.builders import AnswerBuilder

replies = ["The capital of France is Paris [2]."]

docs = [
    Document(content="Berlin is the capital of Germany."),
    Document(content="Paris is the capital of France."),
    Document(content="Rome is the capital of Italy."),
]

# A raw string keeps the regex backslashes from being read as string escapes
builder = AnswerBuilder(reference_pattern=r"\[(\d+)\]", return_only_referenced_documents=False)
result = builder.run(query="What is the capital of France?", replies=replies, documents=docs)["answers"][0]

print(f"Answer: {result.data}")
print("References:")
for doc in result.documents:
    if doc.meta["referenced"]:
        print(f"[{doc.meta['source_index']}] {doc.content}")
print("Other sources:")
for doc in result.documents:
    if not doc.meta["referenced"]:
        print(f"[{doc.meta['source_index']}] {doc.content}")

# Answer: The capital of France is Paris
# References:
# [2] Paris is the capital of France.
# Other sources:
# [1] Berlin is the capital of Germany.
# [3] Rome is the capital of Italy.
```

<a id="answer_builder.AnswerBuilder.__init__"></a>

#### AnswerBuilder.\_\_init\_\_

```python
def __init__(pattern: str | None = None,
             reference_pattern: str | None = None,
             last_message_only: bool = False,
             *,
             return_only_referenced_documents: bool = True)
```

Creates an instance of the AnswerBuilder component.

**Arguments**:

- `pattern`: The regular expression pattern to extract the answer text from the Generator.
If not specified, the entire response is used as the answer.
The regular expression can have one capture group at most.
If present, the capture group text
is used as the answer. If no capture group is present, the whole match is used as the answer.
Examples:
`[^\n]+$` finds "this is an answer" in a string "this is an argument.\nthis is an answer".
`Answer: (.*)` finds "this is an answer" in a string "this is an argument. Answer: this is an answer".
- `reference_pattern`: The regular expression pattern used for parsing the document references.
If not specified, no parsing is done, and all documents are returned.
References need to be specified as indices of the input documents and start at [1].
Example: `\[(\d+)\]` finds "1" in a string "this is an answer[1]".
If this parameter is provided, each document's metadata contains a "referenced" key with a boolean value.
- `last_message_only`: If False (default value), all messages are used as the answer.
If True, only the last message is used as the answer.
- `return_only_referenced_documents`: To be used in conjunction with `reference_pattern`.
If True (default value), only the documents that were actually referenced in `replies` are returned.
If False, all documents are returned.
If `reference_pattern` is not provided, this parameter has no effect, and all documents are returned.
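
The capture-group rule for `pattern` and the 1-based indexing for `reference_pattern` can be illustrated with Python's standard `re` module alone. The following is a minimal sketch of the behavior described above, not AnswerBuilder's actual implementation. The helper names `extract_answer` and `referenced_indices` are hypothetical, and returning an empty string when nothing matches is an assumption of this sketch.

```python
import re
from typing import Optional


def extract_answer(reply: str, pattern: Optional[str] = None) -> str:
    """Return the answer text following the rules described for `pattern`."""
    if pattern is None:
        # No pattern: the entire reply is used as the answer.
        return reply
    match = re.search(pattern, reply)
    if match is None:
        # Assumption of this sketch: no match yields an empty answer.
        return ""
    # One capture group: use its text. No capture group: use the whole match.
    return match.group(1) if match.re.groups else match.group(0)


def referenced_indices(reply: str, reference_pattern: str) -> list[int]:
    """Return the 1-based document indices that the reply refers to."""
    return [int(index) for index in re.findall(reference_pattern, reply)]


print(extract_answer("this is an argument.\nthis is an answer", r"[^\n]+$"))
# this is an answer
print(extract_answer("this is an argument. Answer: this is an answer", r"Answer: (.*)"))
# this is an answer
print(referenced_indices("this is an answer[1]", r"\[(\d+)\]"))
# [1]
```

Writing the patterns as raw strings (`r"..."`) avoids Python interpreting `\[` or `\d` as string escapes before the regex engine sees them.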

<a id="answer_builder.AnswerBuilder.run"></a>

#### AnswerBuilder.run

```python
@component.output_types(answers=list[GeneratedAnswer])
def run(query: str,
        replies: list[str] | list[ChatMessage],
        meta: list[dict[str, Any]] | None = None,
        documents: list[Document] | None = None,
        pattern: str | None = None,
        reference_pattern: str | None = None)
```

Turns the output of a Generator into `GeneratedAnswer` objects using regular expressions.

**Arguments**:

- `query`: The input query used as the Generator prompt.
- `replies`: The output of the Generator. Can be a list of strings or a list of `ChatMessage` objects.
- `meta`: The metadata returned by the Generator. If not specified, the generated answer will contain no metadata.
- `documents`: The documents used as the Generator inputs. If specified, they are added to
the `GeneratedAnswer` objects.
Each Document.meta includes a "source_index" key, representing its 1-based position in the input list.
When `reference_pattern` is provided:
  - "referenced" key is added to the Document.meta, indicating if the document was referenced in the output.
  - `return_only_referenced_documents` init parameter controls if all or only referenced documents are
    returned.
- `pattern`: The regular expression pattern to extract the answer text from the Generator.
If not specified, the entire response is used as the answer.
The regular expression can have one capture group at most.
If present, the capture group text
is used as the answer. If no capture group is present, the whole match is used as the answer.
Examples:
`[^\n]+$` finds "this is an answer" in a string "this is an argument.\nthis is an answer".
`Answer: (.*)` finds "this is an answer" in a string
"this is an argument. Answer: this is an answer".
- `reference_pattern`: The regular expression pattern used for parsing the document references.
If not specified, no parsing is done, and all documents are returned.
References need to be specified as indices of the input documents and start at [1].
Example: `\[(\d+)\]` finds "1" in a string "this is an answer[1]".

**Returns**:

A dictionary with the following keys:
- `answers`: The answers received from the output of the Generator.

<a id="chat_prompt_builder"></a>

## Module chat\_prompt\_builder

<a id="chat_prompt_builder.ChatPromptBuilder"></a>

### ChatPromptBuilder

Renders a chat prompt from a template using Jinja2 syntax.

A template can be a list of `ChatMessage` objects, or a special string, as shown in the usage examples.

It constructs prompts using static or dynamic templates, which you can update for each pipeline run.

Template variables in the template are optional unless specified otherwise.
If an optional variable isn't provided, it defaults to an empty string. Use `variables` and `required_variables`
to define input types and required variables.

### Usage examples

#### Static ChatMessage prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

template = [ChatMessage.from_user("Translate to {{ target_language }}. Context: {{ snippet }}; Translation:")]
builder = ChatPromptBuilder(template=template)
builder.run(target_language="spanish", snippet="I can't speak spanish.")
```

#### Overriding static ChatMessage template at runtime

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

template = [ChatMessage.from_user("Translate to {{ target_language }}. Context: {{ snippet }}; Translation:")]
builder = ChatPromptBuilder(template=template)
builder.run(target_language="spanish", snippet="I can't speak spanish.")

# Passing `template` to run() replaces the template given at initialization for this call
msg = "Translate to {{ target_language }} and summarize. Context: {{ snippet }}; Summary:"
summary_template = [ChatMessage.from_user(msg)]
builder.run(target_language="spanish", snippet="I can't speak spanish.", template=summary_template)
```

#### Dynamic ChatMessage prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack import Pipeline

# no parameter init, we don't use any runtime template variables
prompt_builder = ChatPromptBuilder()
llm = OpenAIChatGenerator(model="gpt-5-mini")

pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("prompt_builder.prompt", "llm.messages")

location = "Berlin"
language = "English"
system_message = ChatMessage.from_system("You are an assistant giving information to tourists in {{language}}")
messages = [system_message, ChatMessage.from_user("Tell me about {{location}}")]

res = pipe.run(data={"prompt_builder": {"template_variables": {"location": location, "language": language},
                                    "template": messages}})
print(res)
# >> {'llm': {'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=
# "Berlin is the capital city of Germany and one of the most vibrant
# and diverse cities in Europe. Here are some key things to know...Enjoy your time exploring the vibrant and dynamic
# capital of Germany!")], _name=None, _meta={'model': 'gpt-5-mini',
# 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 27, 'completion_tokens': 681, 'total_tokens':
# 708}})]}}

# Keep the user message on one line: a string literal cannot span lines without explicit continuation
messages = [
    system_message,
    ChatMessage.from_user("What's the weather forecast for {{location}} in the next {{day_count}} days?"),
]

res = pipe.run(data={"prompt_builder": {"template_variables": {"location": location, "day_count": "5"},
                                    "template": messages}})

print(res)
# >> {'llm': {'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=
# "Here is the weather forecast for Berlin in the next 5
# days:\n\nDay 1: Mostly cloudy with a high of 22°C (72°F) and...so it's always a good idea to check for updates
# closer to your visit.")], _name=None, _meta={'model': 'gpt-5-mini',
# 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 37, 'completion_tokens': 201,
# 'total_tokens': 238}})]}}
```

#### String prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses.image_content import ImageContent

template = """
{% message role="system" %}
You are a helpful assistant.
{% endmessage %}

{% message role="user" %}
Hello! I am {{user_name}}. What's the difference between the following images?
{% for image in images %}
{{ image | templatize_part }}
{% endfor %}
{% endmessage %}
"""

images = [ImageContent.from_file_path("test/test_files/images/apple.jpg"),
          ImageContent.from_file_path("test/test_files/images/haystack-logo.png")]

builder = ChatPromptBuilder(template=template)
builder.run(user_name="John", images=images)
```

<a id="chat_prompt_builder.ChatPromptBuilder.__init__"></a>

#### ChatPromptBuilder.\_\_init\_\_

```python
def __init__(template: list[ChatMessage] | str | None = None,
             required_variables: list[str] | Literal["*"] | None = None,
             variables: list[str] | None = None)
```

Constructs a ChatPromptBuilder component.

**Arguments**:

- `template`: A list of `ChatMessage` objects or a string template. The component looks for Jinja2 template syntax and
renders the prompt with the provided variables. Provide the template in either
the `init` method or the `run` method.
- `required_variables`: List variables that must be provided as input to ChatPromptBuilder.
If a variable listed as required is not provided, an exception is raised.
If set to "*", all variables found in the prompt are required. Optional.
- `variables`: List input variables to use in prompt templates instead of the ones inferred from the
`template` parameter. For example, to use more variables during prompt engineering than the ones present
in the default template, you can provide them here.

<a id="chat_prompt_builder.ChatPromptBuilder.run"></a>

#### ChatPromptBuilder.run

```python
@component.output_types(prompt=list[ChatMessage])
def run(template: list[ChatMessage] | str | None = None,
        template_variables: dict[str, Any] | None = None,
        **kwargs)
```

Renders the prompt template with the provided variables.

It applies the template variables to render the final prompt. You can provide variables with pipeline kwargs.
To overwrite the default template, you can set the `template` parameter.
To overwrite pipeline kwargs, you can set the `template_variables` parameter.

**Arguments**:

- `template`: An optional list of `ChatMessage` objects or string template to overwrite ChatPromptBuilder's default
template.
If `None`, the default template provided at initialization is used.
- `template_variables`: An optional dictionary of template variables to overwrite the pipeline variables.
- `kwargs`: Pipeline variables used for rendering the prompt.

**Raises**:

- `ValueError`: If `template` is empty or contains elements that are not instances of `ChatMessage`.

**Returns**:

A dictionary with the following keys:
- `prompt`: The updated list of `ChatMessage` objects after rendering the templates.

<a id="chat_prompt_builder.ChatPromptBuilder.to_dict"></a>

#### ChatPromptBuilder.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the component.

**Returns**:

Serialized dictionary representation of the component.

<a id="chat_prompt_builder.ChatPromptBuilder.from_dict"></a>

#### ChatPromptBuilder.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "ChatPromptBuilder"
```

Deserializes this component from a dictionary.

**Arguments**:

- `data`: The dictionary to deserialize the component from.

**Returns**:

The deserialized component.

<a id="prompt_builder"></a>

## Module prompt\_builder

<a id="prompt_builder.PromptBuilder"></a>

### PromptBuilder

Renders a prompt, filling in any variables, so that it can be sent to a Generator.

The prompt uses Jinja2 template syntax.
The variables in the default template are used as PromptBuilder's input and are all optional.
If they're not provided, they're replaced with an empty string in the rendered prompt.
To try out different prompts, you can replace the prompt template at runtime by
providing a template for each pipeline run invocation.

### Usage examples

#### On its own

This example uses PromptBuilder to render a prompt template and fill it with `target_language`
and `snippet`. PromptBuilder returns a prompt with the string "Translate the following context to spanish.
Context: I can't speak spanish.; Translation:".

```python
from haystack.components.builders import PromptBuilder

template = "Translate the following context to {{ target_language }}. Context: {{ snippet }}; Translation:"
builder = PromptBuilder(template=template)
builder.run(target_language="spanish", snippet="I can't speak spanish.")
```

#### In a Pipeline

This is an example of a RAG pipeline where PromptBuilder renders a custom prompt template and fills it
with the contents of the retrieved documents and a query. The rendered prompt is then sent to a Generator.

```python
from haystack import Pipeline, Document
from haystack.utils import Secret
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders.prompt_builder import PromptBuilder

# in a real world use case documents could come from a retriever, web, or any other source
documents = [Document(content="Joe lives in Berlin"), Document(content="Joe is a software engineer")]
prompt_template = """
Given these documents, answer the question.
Documents:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}

Question: {{query}}
Answer:
"""
p = Pipeline()
p.add_component(instance=PromptBuilder(template=prompt_template), name="prompt_builder")
p.add_component(instance=OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY")), name="llm")
p.connect("prompt_builder", "llm")

question = "Where does Joe live?"
result = p.run({"prompt_builder": {"documents": documents, "query": question}})
print(result)
```

#### Changing the template at runtime (prompt engineering)

You can change the prompt template of an existing pipeline, like in this example:

```python
# Reuses `p`, `question`, and the `Document` import from the pipeline example above
documents = [
    Document(content="Joe lives in Berlin", meta={"name": "doc1"}),
    Document(content="Joe is a software engineer", meta={"name": "doc1"}),
]
new_template = """
You are a helpful assistant.
Given these documents, answer the question.
Documents:
{% for doc in documents %}
Document {{ loop.index }}:
Document name: {{ doc.meta['name'] }}
{{ doc.content }}
{% endfor %}

Question: {{ query }}
Answer:
"""
p.run({
    "prompt_builder": {
        "documents": documents,
        "query": question,
        "template": new_template,
    },
})
```

To replace the variables in the default template when testing your prompt,
pass the new variables in the `variables` parameter.

#### Overwriting variables at runtime

To overwrite the values of variables, use `template_variables` during runtime:

```python
# Reuses `p`, `documents`, and `question` from the examples above
language_template = """
You are a helpful assistant.
Given these documents, answer the question.
Documents:
{% for doc in documents %}
Document {{ loop.index }}:
Document name: {{ doc.meta['name'] }}
{{ doc.content }}
{% endfor %}

Question: {{ query }}
Please provide your answer in {{ answer_language | default('English') }}
Answer:
"""
p.run({
    "prompt_builder": {
        "documents": documents,
        "query": question,
        "template": language_template,
        "template_variables": {"answer_language": "German"},
    },
})
```

Note that `language_template` introduces the variable `answer_language`, which is not bound to any pipeline variable.
If not set otherwise, it uses its default value 'English'.
This example overwrites its value to 'German'.
Use `template_variables` to overwrite pipeline variables (such as documents) as well.

<a id="prompt_builder.PromptBuilder.__init__"></a>

#### PromptBuilder.\_\_init\_\_

```python
def __init__(template: str,
             required_variables: list[str] | Literal["*"] | None = None,
             variables: list[str] | None = None)
```

Constructs a PromptBuilder component.

**Arguments**:

- `template`: A prompt template that uses Jinja2 syntax to add variables. For example:
`"Summarize this document: {{ documents[0].content }}\nSummary:"`
It's used to render the prompt.
The variables in the default template are input for PromptBuilder and are all optional,
unless explicitly specified.
If an optional variable is not provided, it's replaced with an empty string in the rendered prompt.
- `required_variables`: List variables that must be provided as input to PromptBuilder.
If a variable listed as required is not provided, an exception is raised.
If set to "*", all variables found in the prompt are required. Optional.
- `variables`: List input variables to use in prompt templates instead of the ones inferred from the
`template` parameter. For example, to use more variables during prompt engineering than the ones present
in the default template, you can provide them here.

<a id="prompt_builder.PromptBuilder.to_dict"></a>

#### PromptBuilder.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the component.

**Returns**:

Serialized dictionary representation of the component.

<a id="prompt_builder.PromptBuilder.run"></a>

#### PromptBuilder.run

```python
@component.output_types(prompt=str)
def run(template: str | None = None,
        template_variables: dict[str, Any] | None = None,
        **kwargs)
```

Renders the prompt template with the provided variables.

It applies the template variables to render the final prompt. You can provide variables via pipeline kwargs.
To overwrite the default template, you can set the `template` parameter.
To overwrite pipeline kwargs, you can set the `template_variables` parameter.

**Arguments**:

- `template`: An optional string template to overwrite PromptBuilder's default template. If None, the default template
provided at initialization is used.
- `template_variables`: An optional dictionary of template variables to overwrite the pipeline variables.
- `kwargs`: Pipeline variables used for rendering the prompt.

**Raises**:

- `ValueError`: If any of the required template variables is not provided.

**Returns**:

A dictionary with the following keys:
- `prompt`: The updated prompt text after rendering the prompt template.
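
The interplay between pipeline kwargs, `template_variables`, and required variables described for `run` can be sketched without Haystack or Jinja2. The following is a simplified, standard-library-only illustration under stated assumptions: the `render_prompt` helper is hypothetical, it understands only plain `{{ name }}` placeholders (no filters, loops, or attribute access), and it is not how PromptBuilder is implemented.

```python
import re
from typing import Any, Optional

# Matches simple placeholders such as {{ query }} or {{query}}
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(
    template: str,
    required_variables: Optional[list[str]] = None,
    template_variables: Optional[dict[str, Any]] = None,
    **kwargs: Any,
) -> dict[str, str]:
    """Fill `{{ name }}` placeholders, mimicking the documented precedence rules."""
    # Values in `template_variables` overwrite pipeline kwargs of the same name.
    values = {**kwargs, **(template_variables or {})}

    # A required variable that is missing raises a ValueError.
    missing = [name for name in (required_variables or []) if name not in values]
    if missing:
        raise ValueError(f"Missing required variables: {missing}")

    # An optional variable that is missing renders as an empty string.
    prompt = PLACEHOLDER.sub(lambda m: str(values.get(m.group(1), "")), template)
    return {"prompt": prompt}


template = "Answer in {{ answer_language }}. Question: {{ query }}"
result = render_prompt(
    template,
    required_variables=["query"],
    template_variables={"answer_language": "German"},
    query="Where does Joe live?",
    answer_language="English",
)
print(result["prompt"])
# Answer in German. Question: Where does Joe live?
```

In this sketch, `answer_language` arrives both as a keyword argument and in `template_variables`, and the latter wins, which mirrors the overwrite behavior described in the arguments list.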