---
title: "Builders"
id: builders-api
description: "Extract the output of a Generator to an Answer format, and build prompts."
slug: "/builders-api"
---

<a id="answer_builder"></a>

## Module answer\_builder

<a id="answer_builder.AnswerBuilder"></a>

### AnswerBuilder

Converts a query and Generator replies into a `GeneratedAnswer` object.

AnswerBuilder parses Generator replies using custom regular expressions.
Check out the usage examples below to see how it works.
Optionally, it can also take documents and metadata from the Generator to add to the `GeneratedAnswer` object.
AnswerBuilder works with both non-chat and chat Generators.

### Usage example

```python
from haystack.components.builders import AnswerBuilder

builder = AnswerBuilder(pattern="Answer: (.*)")
builder.run(query="What's the answer?", replies=["This is an argument. Answer: This is the answer."])
```

### Usage example with documents and reference pattern

```python
from haystack import Document
from haystack.components.builders import AnswerBuilder

replies = ["The capital of France is Paris [2]."]

docs = [
    Document(content="Berlin is the capital of Germany."),
    Document(content="Paris is the capital of France."),
    Document(content="Rome is the capital of Italy."),
]

# Raw string so that the regex backslashes are not treated as string escapes
builder = AnswerBuilder(reference_pattern=r"\[(\d+)\]", return_only_referenced_documents=False)
result = builder.run(query="What is the capital of France?", replies=replies, documents=docs)["answers"][0]

print(f"Answer: {result.data}")
print("References:")
for doc in result.documents:
    if doc.meta["referenced"]:
        print(f"[{doc.meta['source_index']}] {doc.content}")
print("Other sources:")
for doc in result.documents:
    if not doc.meta["referenced"]:
        print(f"[{doc.meta['source_index']}] {doc.content}")

# Answer: The capital of France is Paris
# References:
# [2] Paris is the capital of France.
# Other sources:
# [1] Berlin is the capital of Germany.
# [3] Rome is the capital of Italy.
```

<a id="answer_builder.AnswerBuilder.__init__"></a>

#### AnswerBuilder.\_\_init\_\_

```python
def __init__(pattern: Optional[str] = None,
             reference_pattern: Optional[str] = None,
             last_message_only: bool = False,
             *,
             return_only_referenced_documents: bool = True)
```

Creates an instance of the AnswerBuilder component.

**Arguments**:

- `pattern`: The regular expression pattern to extract the answer text from the Generator.
If not specified, the entire response is used as the answer.
The regular expression can have one capture group at most.
If present, the capture group text
is used as the answer. If no capture group is present, the whole match is used as the answer.
Examples:
    `[^\n]+$` finds "this is an answer" in a string "this is an argument.\nthis is an answer".
    `Answer: (.*)` finds "this is an answer" in a string "this is an argument. Answer: this is an answer".
- `reference_pattern`: The regular expression pattern used for parsing the document references.
If not specified, no parsing is done, and all documents are returned.
References need to be specified as indices of the input documents and start at [1].
Example: `\[(\d+)\]` finds "1" in a string "this is an answer[1]".
If this parameter is provided, each document's metadata contains a "referenced" key with a boolean value.
- `last_message_only`: If False (default value), all messages are used as the answer.
If True, only the last message is used as the answer.
- `return_only_referenced_documents`: To be used in conjunction with `reference_pattern`.
If True (default value), only the documents that were actually referenced in `replies` are returned.
If False, all documents are returned.
If `reference_pattern` is not provided, this parameter has no effect, and all documents are returned.

<a id="answer_builder.AnswerBuilder.run"></a>

#### AnswerBuilder.run

```python
@component.output_types(answers=list[GeneratedAnswer])
def run(query: str,
        replies: Union[list[str], list[ChatMessage]],
        meta: Optional[list[dict[str, Any]]] = None,
        documents: Optional[list[Document]] = None,
        pattern: Optional[str] = None,
        reference_pattern: Optional[str] = None)
```

Turns the output of a Generator into `GeneratedAnswer` objects using regular expressions.

**Arguments**:

- `query`: The input query used as the Generator prompt.
- `replies`: The output of the Generator. Can be a list of strings or a list of `ChatMessage` objects.
- `meta`: The metadata returned by the Generator. If not specified, the generated answer will contain no metadata.
- `documents`: The documents used as the Generator inputs. If specified, they are added to
the `GeneratedAnswer` objects.
Each Document.meta includes a "source_index" key, representing its 1-based position in the input list.
When `reference_pattern` is provided:
    - A "referenced" key is added to the Document.meta, indicating whether the document was referenced in the output.
    - The `return_only_referenced_documents` init parameter controls whether all documents or only the referenced
      ones are returned.
- `pattern`: The regular expression pattern to extract the answer text from the Generator.
If not specified, the entire response is used as the answer.
The regular expression can have one capture group at most.
If present, the capture group text
is used as the answer. If no capture group is present, the whole match is used as the answer.
Examples:
    `[^\n]+$` finds "this is an answer" in a string "this is an argument.\nthis is an answer".
    `Answer: (.*)` finds "this is an answer" in a string
    "this is an argument. Answer: this is an answer".
- `reference_pattern`: The regular expression pattern used for parsing the document references.
If not specified, no parsing is done, and all documents are returned.
References need to be specified as indices of the input documents and start at [1].
Example: `\[(\d+)\]` finds "1" in a string "this is an answer[1]".

**Returns**:

A dictionary with the following keys:
- `answers`: The answers received from the output of the Generator.

<a id="prompt_builder"></a>

## Module prompt\_builder

<a id="prompt_builder.PromptBuilder"></a>

### PromptBuilder

Renders a prompt, filling in any variables, so that it can be sent to a Generator.

The prompt uses Jinja2 template syntax.
The variables in the default template are used as PromptBuilder's input and are all optional.
If they're not provided, they're replaced with an empty string in the rendered prompt.
To try out different prompts, you can replace the prompt template at runtime by
providing a template for each pipeline run invocation.

### Usage examples

#### On its own

This example uses PromptBuilder to render a prompt template and fill it with `target_language`
and `snippet`. PromptBuilder returns a prompt with the string "Translate the following context to Spanish.
Context: I can't speak Spanish.; Translation:".

```python
from haystack.components.builders import PromptBuilder

template = "Translate the following context to {{ target_language }}. Context: {{ snippet }}; Translation:"
builder = PromptBuilder(template=template)
builder.run(target_language="Spanish", snippet="I can't speak Spanish.")
```

#### In a Pipeline

This is an example of a RAG pipeline where PromptBuilder renders a custom prompt template and fills it
with the contents of the retrieved documents and a query. The rendered prompt is then sent to a Generator.

```python
from haystack import Pipeline, Document
from haystack.utils import Secret
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders.prompt_builder import PromptBuilder

# In a real-world use case, documents could come from a retriever, the web, or any other source
documents = [Document(content="Joe lives in Berlin"), Document(content="Joe is a software engineer")]
prompt_template = """
    Given these documents, answer the question.
    Documents:
    {% for doc in documents %}
        {{ doc.content }}
    {% endfor %}

    Question: {{query}}
    Answer:
    """
p = Pipeline()
p.add_component(instance=PromptBuilder(template=prompt_template), name="prompt_builder")
p.add_component(instance=OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY")), name="llm")
p.connect("prompt_builder", "llm")

question = "Where does Joe live?"
result = p.run({"prompt_builder": {"documents": documents, "query": question}})
print(result)
```

#### Changing the template at runtime (prompt engineering)

You can change the prompt template of an existing pipeline, like in this example:

```python
# Reuses `Document`, `p`, and `question` from the pipeline example above
documents = [
    Document(content="Joe lives in Berlin", meta={"name": "doc1"}),
    Document(content="Joe is a software engineer", meta={"name": "doc2"}),
]
new_template = """
    You are a helpful assistant.
    Given these documents, answer the question.
    Documents:
    {% for doc in documents %}
        Document {{ loop.index }}:
        Document name: {{ doc.meta['name'] }}
        {{ doc.content }}
    {% endfor %}

    Question: {{ query }}
    Answer:
    """
p.run({
    "prompt_builder": {
        "documents": documents,
        "query": question,
        "template": new_template,
    },
})
```

To replace the variables in the default template when testing your prompt,
pass the new variables in the `variables` parameter.

#### Overwriting variables at runtime

To overwrite the values of variables, use `template_variables` at runtime:

```python
# Reuses `p`, `documents`, and `question` from the examples above
language_template = """
    You are a helpful assistant.
    Given these documents, answer the question.
    Documents:
    {% for doc in documents %}
        Document {{ loop.index }}:
        Document name: {{ doc.meta['name'] }}
        {{ doc.content }}
    {% endfor %}

    Question: {{ query }}
    Please provide your answer in {{ answer_language | default('English') }}
    Answer:
    """
p.run({
    "prompt_builder": {
        "documents": documents,
        "query": question,
        "template": language_template,
        "template_variables": {"answer_language": "German"},
    },
})
```

Note that `language_template` introduces the variable `answer_language`, which is not bound to any pipeline variable.
If not set otherwise, it uses its default value, 'English'.
This example overwrites its value with 'German'.
You can also use `template_variables` to overwrite pipeline variables (such as `documents`).

<a id="prompt_builder.PromptBuilder.__init__"></a>

#### PromptBuilder.\_\_init\_\_

```python
def __init__(template: str,
             required_variables: Optional[Union[list[str],
                                                Literal["*"]]] = None,
             variables: Optional[list[str]] = None)
```

Constructs a PromptBuilder component.

**Arguments**:

- `template`: A prompt template that uses Jinja2 syntax to add variables. For example:
`"Summarize this document: {{ documents[0].content }}\nSummary:"`
It's used to render the prompt.
The variables in the default template are input for PromptBuilder and are all optional,
unless explicitly specified.
If an optional variable is not provided, it's replaced with an empty string in the rendered prompt.
- `required_variables`: A list of variables that must be provided as input to PromptBuilder.
If a variable listed as required is not provided, an exception is raised.
If set to "*", all variables found in the prompt are required. Optional.
- `variables`: A list of input variables to use in prompt templates instead of the ones inferred from the
`template` parameter. For example, to use more variables during prompt engineering than the ones present
in the default template, you can provide them here.

<a id="prompt_builder.PromptBuilder.to_dict"></a>

#### PromptBuilder.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the component.

**Returns**:

Serialized dictionary representation of the component.

<a id="prompt_builder.PromptBuilder.run"></a>

#### PromptBuilder.run

```python
@component.output_types(prompt=str)
def run(template: Optional[str] = None,
        template_variables: Optional[dict[str, Any]] = None,
        **kwargs)
```

Renders the prompt template with the provided variables.

It applies the template variables to render the final prompt. You can provide variables via pipeline kwargs.
To overwrite the default template, you can set the `template` parameter.
To overwrite pipeline kwargs, you can set the `template_variables` parameter.

**Arguments**:

- `template`: An optional string template to overwrite PromptBuilder's default template. If None, the default template
provided at initialization is used.
- `template_variables`: An optional dictionary of template variables to overwrite the pipeline variables.
- `kwargs`: Pipeline variables used for rendering the prompt.

**Raises**:

- `ValueError`: If any of the required template variables is not provided.

**Returns**:

A dictionary with the following keys:
- `prompt`: The updated prompt text after rendering the prompt template.

<a id="chat_prompt_builder"></a>

## Module chat\_prompt\_builder

<a id="chat_prompt_builder.ChatPromptBuilder"></a>

### ChatPromptBuilder

Renders a chat prompt from a template using Jinja2 syntax.

A template can be a list of `ChatMessage` objects, or a special string, as shown in the usage examples.

It constructs prompts using static or dynamic templates, which you can update for each pipeline run.

Template variables in the template are optional unless specified otherwise.
If an optional variable isn't provided, it defaults to an empty string. Use `variables` and `required_variables`
to define input types and required variables.

### Usage examples

#### Static ChatMessage prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

template = [ChatMessage.from_user("Translate to {{ target_language }}. Context: {{ snippet }}; Translation:")]
builder = ChatPromptBuilder(template=template)
builder.run(target_language="Spanish", snippet="I can't speak Spanish.")
```

#### Overriding static ChatMessage template at runtime

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

template = [ChatMessage.from_user("Translate to {{ target_language }}. Context: {{ snippet }}; Translation:")]
builder = ChatPromptBuilder(template=template)
builder.run(target_language="Spanish", snippet="I can't speak Spanish.")

# Pass a different template to `run` to override the one set at initialization
msg = "Translate to {{ target_language }} and summarize. Context: {{ snippet }}; Summary:"
summary_template = [ChatMessage.from_user(msg)]
builder.run(target_language="Spanish", snippet="I can't speak Spanish.", template=summary_template)
```

#### Dynamic ChatMessage prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack import Pipeline
from haystack.utils import Secret

# no parameter init, we don't use any runtime template variables
prompt_builder = ChatPromptBuilder()
llm = OpenAIChatGenerator(api_key=Secret.from_token("<your-api-key>"))

pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("prompt_builder.prompt", "llm.messages")

location = "Berlin"
language = "English"
system_message = ChatMessage.from_system("You are an assistant giving information to tourists in {{language}}")
messages = [system_message, ChatMessage.from_user("Tell me about {{location}}")]

res = pipe.run(data={"prompt_builder": {"template_variables": {"location": location, "language": language},
                                        "template": messages}})
print(res)

# >> {'llm': {'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=
# "Berlin is the capital city of Germany and one of the most vibrant
# and diverse cities in Europe. Here are some key things to know...Enjoy your time exploring the vibrant and dynamic
# capital of Germany!")], _name=None, _meta={'model': 'gpt-5-mini',
# 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 27, 'completion_tokens': 681, 'total_tokens':
# 708}})]}}

messages = [system_message, ChatMessage.from_user("What's the weather forecast for {{location}} in the next "
                                                  "{{day_count}} days?")]

res = pipe.run(data={"prompt_builder": {"template_variables": {"location": location, "day_count": "5"},
                                        "template": messages}})

print(res)
# >> {'llm': {'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=
# "Here is the weather forecast for Berlin in the next 5
# days:\n\nDay 1: Mostly cloudy with a high of 22°C (72°F) and...so it's always a good idea to check for updates
# closer to your visit.")], _name=None, _meta={'model': 'gpt-5-mini',
# 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 37, 'completion_tokens': 201,
# 'total_tokens': 238}})]}}
```

#### String prompt template

```python
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses.image_content import ImageContent

template = """
{% message role="system" %}
You are a helpful assistant.
{% endmessage %}

{% message role="user" %}
Hello! I am {{user_name}}. What's the difference between the following images?
{% for image in images %}
{{ image | templatize_part }}
{% endfor %}
{% endmessage %}
"""

images = [ImageContent.from_file_path("apple.jpg"), ImageContent.from_file_path("orange.jpg")]

builder = ChatPromptBuilder(template=template)
builder.run(user_name="John", images=images)
```

<a id="chat_prompt_builder.ChatPromptBuilder.__init__"></a>

#### ChatPromptBuilder.\_\_init\_\_

```python
def __init__(template: Optional[Union[list[ChatMessage], str]] = None,
             required_variables: Optional[Union[list[str],
                                                Literal["*"]]] = None,
             variables: Optional[list[str]] = None)
```

Constructs a ChatPromptBuilder component.

**Arguments**:

- `template`: A list of `ChatMessage` objects or a string template. The component looks for Jinja2 template syntax and
renders the prompt with the provided variables. Provide the template in either
the `init` method or the `run` method.
- `required_variables`: A list of variables that must be provided as input to ChatPromptBuilder.
If a variable listed as required is not provided, an exception is raised.
If set to "*", all variables found in the prompt are required. Optional.
- `variables`: A list of input variables to use in prompt templates instead of the ones inferred from the
`template` parameter. For example, to use more variables during prompt engineering than the ones present
in the default template, you can provide them here.

<a id="chat_prompt_builder.ChatPromptBuilder.run"></a>

#### ChatPromptBuilder.run

```python
@component.output_types(prompt=list[ChatMessage])
def run(template: Optional[Union[list[ChatMessage], str]] = None,
        template_variables: Optional[dict[str, Any]] = None,
        **kwargs)
```

Renders the prompt template with the provided variables.

It applies the template variables to render the final prompt. You can provide variables with pipeline kwargs.
To overwrite the default template, you can set the `template` parameter.
To overwrite pipeline kwargs, you can set the `template_variables` parameter.

**Arguments**:

- `template`: An optional list of `ChatMessage` objects or string template to overwrite ChatPromptBuilder's default
template.
If `None`, the default template provided at initialization is used.
- `template_variables`: An optional dictionary of template variables to overwrite the pipeline variables.
- `kwargs`: Pipeline variables used for rendering the prompt.

**Raises**:

- `ValueError`: If `chat_messages` is empty or contains elements that are not instances of `ChatMessage`.

**Returns**:

A dictionary with the following keys:
- `prompt`: The updated list of `ChatMessage` objects after rendering the templates.

<a id="chat_prompt_builder.ChatPromptBuilder.to_dict"></a>

#### ChatPromptBuilder.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the component.

**Returns**:

Serialized dictionary representation of the component.

<a id="chat_prompt_builder.ChatPromptBuilder.from_dict"></a>

#### ChatPromptBuilder.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "ChatPromptBuilder"
```

Deserialize this component from a dictionary.

**Arguments**:

- `data`: The dictionary to deserialize and create the component.

**Returns**:

The deserialized component.