data_classes_api.md
---
title: "Data Classes"
id: data-classes-api
description: "Core classes that carry data through the system."
slug: "/data-classes-api"
---

## answer

### ExtractedAnswer

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serialize the object to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Serialized dictionary representation of the object.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ExtractedAnswer
```

Deserialize the object from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the object.

**Returns:**

- <code>ExtractedAnswer</code> – Deserialized object.

### GeneratedAnswer

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serialize the object to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Serialized dictionary representation of the object.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> GeneratedAnswer
```

Deserialize the object from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the object.

**Returns:**

- <code>GeneratedAnswer</code> – Deserialized object.

## breakpoints

### Breakpoint

A dataclass to hold a breakpoint for a component.

**Parameters:**

- **component_name** (<code>str</code>) – The name of the component where the breakpoint is set.
- **visit_count** (<code>int</code>) – The number of times the component must be visited before the breakpoint is triggered.
- **snapshot_file_path** (<code>str | None</code>) – Optional path to store a snapshot of the pipeline when the breakpoint is hit.
  This is useful for debugging purposes, allowing you to inspect the state of the pipeline at the time of the
  breakpoint and to resume execution from that point.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert the Breakpoint to a dictionary representation.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary containing the component name, visit count, and debug path.

#### from_dict

```python
from_dict(data: dict) -> Breakpoint
```

Populate the Breakpoint from a dictionary representation.

**Parameters:**

- **data** (<code>dict</code>) – A dictionary containing the component name, visit count, and debug path.

**Returns:**

- <code>Breakpoint</code> – An instance of Breakpoint.

### ToolBreakpoint

Bases: <code>Breakpoint</code>

A dataclass representing a breakpoint specific to tools used within an Agent component.

Inherits from Breakpoint and adds the ability to target individual tools. If `tool_name` is None,
the breakpoint applies to all tools within the Agent component.

**Parameters:**

- **tool_name** (<code>str | None</code>) – The name of the tool to target within the Agent component. If None, applies to all tools.

### AgentBreakpoint

A dataclass representing a breakpoint tied to an Agent’s execution.

This allows for debugging either a specific component (e.g., the chat generator) or a tool used by the agent.
It enforces constraints on which component names are valid for each breakpoint type.

**Parameters:**

- **agent_name** (<code>str</code>) – The name of the agent component in a pipeline where the breakpoint is set.
- **break_point** (<code>Breakpoint | ToolBreakpoint</code>) – An instance of Breakpoint or ToolBreakpoint indicating where to break execution.

**Raises:**

- <code>ValueError</code> – If the component_name is invalid for the given breakpoint type:
  - Breakpoint must have component_name='chat_generator'.
  - ToolBreakpoint must have component_name='tool_invoker'.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert the AgentBreakpoint to a dictionary representation.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary containing the agent name and the breakpoint details.

#### from_dict

```python
from_dict(data: dict) -> AgentBreakpoint
```

Populate the AgentBreakpoint from a dictionary representation.

**Parameters:**

- **data** (<code>dict</code>) – A dictionary containing the agent name and the breakpoint details.

**Returns:**

- <code>AgentBreakpoint</code> – An instance of AgentBreakpoint.

### AgentSnapshot

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert the AgentSnapshot to a dictionary representation.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary containing the agent state, timestamp, and breakpoint.

#### from_dict

```python
from_dict(data: dict) -> AgentSnapshot
```

Populate the AgentSnapshot from a dictionary representation.

**Parameters:**

- **data** (<code>dict</code>) – A dictionary containing the agent state, timestamp, and breakpoint.

**Returns:**

- <code>AgentSnapshot</code> – An instance of AgentSnapshot.

### PipelineState

A dataclass to hold the state of the pipeline at a specific point in time.

**Parameters:**

- **component_visits** (<code>dict\[str, int\]</code>) – A dictionary mapping component names to their visit counts.
- **inputs** (<code>dict\[str, Any\]</code>) – The inputs processed by the pipeline at the time of the snapshot.
- **pipeline_outputs** (<code>dict\[str, Any\]</code>) – Dictionary containing the final outputs of the pipeline up to the breakpoint.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert the PipelineState to a dictionary representation.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary containing the inputs, component visits,
  and pipeline outputs.

#### from_dict

```python
from_dict(data: dict) -> PipelineState
```

Populate the PipelineState from a dictionary representation.

**Parameters:**

- **data** (<code>dict</code>) – A dictionary containing the inputs, component visits,
  and pipeline outputs.

**Returns:**

- <code>PipelineState</code> – An instance of PipelineState.

### PipelineSnapshot

A dataclass to hold a snapshot of the pipeline at a specific point in time.

**Parameters:**

- **original_input_data** (<code>dict\[str, Any\]</code>) – The original input data provided to the pipeline.
- **ordered_component_names** (<code>list\[str\]</code>) – A list of component names in the order they were visited.
- **pipeline_state** (<code>PipelineState</code>) – The state of the pipeline at the time of the snapshot.
- **break_point** (<code>AgentBreakpoint | Breakpoint</code>) – The breakpoint that triggered the snapshot.
- **agent_snapshot** (<code>AgentSnapshot | None</code>) – Optional agent snapshot if the breakpoint is an agent breakpoint.
- **timestamp** (<code>datetime | None</code>) – A timestamp indicating when the snapshot was taken.
- **include_outputs_from** (<code>set\[str\]</code>) – Set of component names whose outputs should be included in the pipeline results.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert the PipelineSnapshot to a dictionary representation.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary containing the pipeline state, timestamp, breakpoint, agent snapshot, original input data,
  ordered component names, include_outputs_from, and pipeline outputs.

#### from_dict

```python
from_dict(data: dict) -> PipelineSnapshot
```

Populate the PipelineSnapshot from a dictionary representation.

**Parameters:**

- **data** (<code>dict</code>) – A dictionary containing the pipeline state, timestamp, breakpoint, agent snapshot, original input
  data, ordered component names, include_outputs_from, and pipeline outputs.

## byte_stream

### ByteStream

Base data class representing a binary object in the Haystack API.

**Parameters:**

- **data** (<code>bytes</code>) – The binary data stored in the ByteStream.
- **meta** (<code>dict\[str, Any\]</code>) – Additional metadata to be stored with the ByteStream.
- **mime_type** (<code>str | None</code>) – The mime type of the binary data.

#### to_file

```python
to_file(destination_path: Path) -> None
```

Write the ByteStream to a file. Note: the metadata will be lost.

**Parameters:**

- **destination_path** (<code>Path</code>) – The path to write the ByteStream to.

#### from_file_path

```python
from_file_path(
    filepath: Path,
    mime_type: str | None = None,
    meta: dict[str, Any] | None = None,
    guess_mime_type: bool = False,
) -> ByteStream
```

Create a ByteStream from the contents read from a file.

**Parameters:**

- **filepath** (<code>Path</code>) – A valid path to a file.
- **mime_type** (<code>str | None</code>) – The mime type of the file.
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata to be stored with the ByteStream.
- **guess_mime_type** (<code>bool</code>) – Whether to guess the mime type from the file.
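
The file round trip described for `to_file` and `from_file_path` can be sketched with the standard library alone. `MiniByteStream` below is a hypothetical stand-in, not Haystack's class, and the use of `mimetypes.guess_type` for `guess_mime_type=True` is an assumption about how guessing might be done.

```python
from __future__ import annotations

import mimetypes
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any


@dataclass
class MiniByteStream:
    """Hypothetical stand-in for ByteStream; not Haystack's implementation."""

    data: bytes
    meta: dict[str, Any] = field(default_factory=dict)
    mime_type: str | None = None

    @classmethod
    def from_file_path(
        cls,
        filepath: Path,
        mime_type: str | None = None,
        meta: dict[str, Any] | None = None,
        guess_mime_type: bool = False,
    ) -> MiniByteStream:
        # Assumption: an explicit mime_type wins; guessing only fills a gap.
        if mime_type is None and guess_mime_type:
            mime_type = mimetypes.guess_type(filepath.as_posix())[0]
        return cls(data=filepath.read_bytes(), meta=meta or {}, mime_type=mime_type)

    def to_file(self, destination_path: Path) -> None:
        # Only the raw bytes are written, so meta and mime_type are lost.
        destination_path.write_bytes(self.data)


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "note.txt"
    source.write_text("hello", encoding="utf-8")

    stream = MiniByteStream.from_file_path(
        source, meta={"origin": "demo"}, guess_mime_type=True
    )
    copy_path = Path(tmp) / "copy.txt"
    stream.to_file(copy_path)
    copied_bytes = copy_path.read_bytes()
```

Because `to_file` writes only `data`, a stream reloaded from `copy_path` would start with empty metadata unless `meta` is passed again.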

#### from_string

```python
from_string(
    text: str,
    encoding: str = "utf-8",
    mime_type: str | None = None,
    meta: dict[str, Any] | None = None,
) -> ByteStream
```

Create a ByteStream encoding a string.

**Parameters:**

- **text** (<code>str</code>) – The string to encode.
- **encoding** (<code>str</code>) – The encoding used to convert the string into bytes.
- **mime_type** (<code>str | None</code>) – The mime type of the file.
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata to be stored with the ByteStream.

#### to_string

```python
to_string(encoding: str = 'utf-8') -> str
```

Convert the ByteStream to a string. The metadata is not included.

**Parameters:**

- **encoding** (<code>str</code>) – The encoding used to convert the bytes to a string. Defaults to "utf-8".

**Returns:**

- <code>str</code> – The string representation of the ByteStream.

**Raises:**

- <code>UnicodeDecodeError</code> – If the ByteStream data cannot be decoded with the specified encoding.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert the ByteStream to a dictionary representation.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary with keys 'data', 'meta', and 'mime_type'.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ByteStream
```

Create a ByteStream from a dictionary representation.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – A dictionary with keys 'data', 'meta', and 'mime_type'.

**Returns:**

- <code>ByteStream</code> – A ByteStream instance.

## chat_message

### ChatRole

Bases: <code>str</code>, <code>Enum</code>

Enumeration representing the roles within a chat.

#### from_str

```python
from_str(string: str) -> ChatRole
```

Convert a string to a ChatRole enum.

### TextContent

The textual content of a chat message.

**Parameters:**

- **text** (<code>str</code>) – The text content of the message.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert TextContent into a dictionary.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> TextContent
```

Create a TextContent from a dictionary.

### ToolCall

Represents a Tool call prepared by the model, usually contained in an assistant message.

**Parameters:**

- **id** (<code>str | None</code>) – The ID of the Tool call.
- **tool_name** (<code>str</code>) – The name of the Tool to call.
- **arguments** (<code>dict\[str, Any\]</code>) – The arguments to call the Tool with.
- **extra** (<code>dict\[str, Any\] | None</code>) – Dictionary of extra information about the Tool call. Use to store provider-specific
  information. To avoid serialization issues, values should be JSON serializable.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert ToolCall into a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary with keys 'tool_name', 'arguments', 'id', and 'extra'.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ToolCall
```

Creates a new ToolCall object from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary to build the ToolCall object.

**Returns:**

- <code>ToolCall</code> – The created object.

### ToolCallResult

Represents the result of a Tool invocation.

**Parameters:**

- **result** (<code>ToolCallResultContentT</code>) – The result of the Tool invocation.
- **origin** (<code>ToolCall</code>) – The Tool call that produced this result.
- **error** (<code>bool</code>) – Whether the Tool invocation resulted in an error.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Converts ToolCallResult into a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary with keys 'result', 'origin', and 'error'.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ToolCallResult
```

Creates a ToolCallResult from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary to build the ToolCallResult object.

**Returns:**

- <code>ToolCallResult</code> – The created object.

### ReasoningContent

Represents the optional reasoning content prepared by the model, usually contained in an assistant message.

**Parameters:**

- **reasoning_text** (<code>str</code>) – The reasoning text produced by the model.
- **extra** (<code>dict\[str, Any\]</code>) – Dictionary of extra information about the reasoning content. Use to store provider-specific
  information. To avoid serialization issues, values should be JSON serializable.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert ReasoningContent into a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary with keys 'reasoning_text' and 'extra'.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ReasoningContent
```

Creates a new ReasoningContent object from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary to build the ReasoningContent object.

**Returns:**

- <code>ReasoningContent</code> – The created object.

### ChatMessage

Represents a message in an LLM chat conversation.

Use the `from_assistant`, `from_user`, `from_system`, and `from_tool` class methods to create a ChatMessage.

#### role

```python
role: ChatRole
```

Returns the role of the entity sending the message.

#### meta

```python
meta: dict[str, Any]
```

Returns the metadata associated with the message.

#### name

```python
name: str | None
```

Returns the name associated with the message.

#### texts

```python
texts: list[str]
```

Returns the list of all texts contained in the message.

#### text

```python
text: str | None
```

Returns the first text contained in the message.

#### tool_calls

```python
tool_calls: list[ToolCall]
```

Returns the list of all Tool calls contained in the message.

#### tool_call

```python
tool_call: ToolCall | None
```

Returns the first Tool call contained in the message.

#### tool_call_results

```python
tool_call_results: list[ToolCallResult]
```

Returns the list of all Tool call results contained in the message.

#### tool_call_result

```python
tool_call_result: ToolCallResult | None
```

Returns the first Tool call result contained in the message.

#### images

```python
images: list[ImageContent]
```

Returns the list of all images contained in the message.

#### image

```python
image: ImageContent | None
```

Returns the first image contained in the message.

#### files

```python
files: list[FileContent]
```

Returns the list of all files contained in the message.

#### file

```python
file: FileContent | None
```

Returns the first file contained in the message.
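
The accessors above come in pairs: a plural property returns every part of one kind, and its singular counterpart returns the first such part or `None`. A minimal sketch of that pattern follows. `MiniMessage`, `MiniText`, `MiniToolCall`, and the internal `_content` list are hypothetical stand-ins chosen for illustration, not Haystack's implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MiniText:
    """Hypothetical stand-in for TextContent."""

    text: str


@dataclass
class MiniToolCall:
    """Hypothetical stand-in for ToolCall."""

    tool_name: str
    arguments: dict


@dataclass
class MiniMessage:
    """Hypothetical stand-in for ChatMessage; the `_content` list is an assumption."""

    _content: list = field(default_factory=list)

    @property
    def texts(self) -> list[str]:
        # Plural accessor: every text part, in message order.
        return [part.text for part in self._content if isinstance(part, MiniText)]

    @property
    def text(self) -> str | None:
        # Singular accessor: the first text part, or None when there is none.
        texts = self.texts
        return texts[0] if texts else None

    @property
    def tool_calls(self) -> list[MiniToolCall]:
        return [part for part in self._content if isinstance(part, MiniToolCall)]

    @property
    def tool_call(self) -> MiniToolCall | None:
        calls = self.tool_calls
        return calls[0] if calls else None


message = MiniMessage(
    [MiniText("first"), MiniToolCall("search", {"q": "x"}), MiniText("second")]
)
empty = MiniMessage()
```

Checking the singular property against `None` is a compact way to test whether a message carries any part of that kind.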

#### reasonings

```python
reasonings: list[ReasoningContent]
```

Returns the list of all reasoning contents contained in the message.

#### reasoning

```python
reasoning: ReasoningContent | None
```

Returns the first reasoning content contained in the message.

#### is_from

```python
is_from(role: ChatRole | str) -> bool
```

Check if the message is from a specific role.

**Parameters:**

- **role** (<code>ChatRole | str</code>) – The role to check against.

**Returns:**

- <code>bool</code> – True if the message is from the specified role, False otherwise.

#### from_user

```python
from_user(
    text: str | None = None,
    meta: dict[str, Any] | None = None,
    name: str | None = None,
    *,
    content_parts: (
        Sequence[TextContent | str | ImageContent | FileContent] | None
    ) = None
) -> ChatMessage
```

Create a message from the user.

**Parameters:**

- **text** (<code>str | None</code>) – The text content of the message. Specify this or content_parts.
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata associated with the message.
- **name** (<code>str | None</code>) – An optional name for the participant. This field is only supported by OpenAI.
- **content_parts** (<code>Sequence\[TextContent | str | ImageContent | FileContent\] | None</code>) – A list of content parts to include in the message. Specify this or text.

**Returns:**

- <code>ChatMessage</code> – A new ChatMessage instance.

**Raises:**

- <code>ValueError</code> – If neither or both of text and content_parts are provided, or if content_parts is empty.
- <code>TypeError</code> – If a content part is not a str, TextContent, ImageContent, or FileContent.
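
The argument rules documented for `from_user` (exactly one of `text` and `content_parts`, a non-empty sequence, supported part types only) can be sketched as a standalone check. `build_user_parts`, `error_name`, and `MiniText` are hypothetical names used for illustration. Wrapping plain strings into text parts is an assumption, and image and file parts are left out to keep the sketch short.

```python
from __future__ import annotations

from collections.abc import Sequence
from dataclasses import dataclass


@dataclass
class MiniText:
    """Hypothetical stand-in for TextContent."""

    text: str


def build_user_parts(
    text: str | None = None,
    *,
    content_parts: Sequence[MiniText | str] | None = None,
) -> list[MiniText]:
    """Hypothetical helper mirroring the documented argument checks."""
    # Exactly one of the two inputs must be supplied.
    if (text is None) == (content_parts is None):
        raise ValueError("Provide either text or content_parts, not both or neither.")
    if text is not None:
        return [MiniText(text)]
    if len(content_parts) == 0:
        raise ValueError("content_parts must not be empty.")

    parts: list[MiniText] = []
    for part in content_parts:
        if isinstance(part, str):
            parts.append(MiniText(part))  # assumption: plain strings are wrapped
        elif isinstance(part, MiniText):
            parts.append(part)
        else:
            raise TypeError(f"Unsupported content part: {type(part).__name__}")
    return parts


def error_name(**kwargs) -> str | None:
    """Return the name of the exception raised for the given arguments, if any."""
    try:
        build_user_parts(**kwargs)
    except (ValueError, TypeError) as exc:
        return type(exc).__name__
    return None
```

Comparing the two `is None` results catches both the "neither" and the "both" case with a single condition.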

#### from_system

```python
from_system(
    text: str, meta: dict[str, Any] | None = None, name: str | None = None
) -> ChatMessage
```

Create a message from the system.

**Parameters:**

- **text** (<code>str</code>) – The text content of the message.
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata associated with the message.
- **name** (<code>str | None</code>) – An optional name for the participant. This field is only supported by OpenAI.

**Returns:**

- <code>ChatMessage</code> – A new ChatMessage instance.

#### from_assistant

```python
from_assistant(
    text: str | None = None,
    meta: dict[str, Any] | None = None,
    name: str | None = None,
    tool_calls: list[ToolCall] | None = None,
    *,
    reasoning: str | ReasoningContent | None = None
) -> ChatMessage
```

Create a message from the assistant.

**Parameters:**

- **text** (<code>str | None</code>) – The text content of the message.
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata associated with the message.
- **name** (<code>str | None</code>) – An optional name for the participant. This field is only supported by OpenAI.
- **tool_calls** (<code>list\[ToolCall\] | None</code>) – The Tool calls to include in the message.
- **reasoning** (<code>str | ReasoningContent | None</code>) – The reasoning content to include in the message.

**Returns:**

- <code>ChatMessage</code> – A new ChatMessage instance.

**Raises:**

- <code>TypeError</code> – If `reasoning` is not a string or ReasoningContent object.

#### from_tool

```python
from_tool(
    tool_result: ToolCallResultContentT,
    origin: ToolCall,
    error: bool = False,
    meta: dict[str, Any] | None = None,
) -> ChatMessage
```

Create a message from a Tool.

**Parameters:**

- **tool_result** (<code>ToolCallResultContentT</code>) – The result of the Tool invocation.
- **origin** (<code>ToolCall</code>) – The Tool call that produced this result.
- **error** (<code>bool</code>) – Whether the Tool invocation resulted in an error.
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata associated with the message.

**Returns:**

- <code>ChatMessage</code> – A new ChatMessage instance.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Converts ChatMessage into a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Serialized version of the object.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ChatMessage
```

Creates a new ChatMessage object from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary to build the ChatMessage object.

**Returns:**

- <code>ChatMessage</code> – The created object.

**Raises:**

- <code>ValueError</code> – If the `role` field is missing from the dictionary.
- <code>TypeError</code> – If the `content` field is not a list or string.

#### to_openai_dict_format

```python
to_openai_dict_format(require_tool_call_ids: bool = True) -> dict[str, Any]
```

Convert a ChatMessage to the dictionary format expected by OpenAI's Chat Completions API.

**Parameters:**

- **require_tool_call_ids** (<code>bool</code>) – If True (default), enforces that each Tool Call includes a non-null `id` attribute.
  Set to False to allow Tool Calls without `id`, which may be suitable for shallow OpenAI-compatible APIs.

**Returns:**

- <code>dict\[str, Any\]</code> – The ChatMessage in the format expected by OpenAI's Chat Completions API.

**Raises:**

- <code>ValueError</code> – If the message format is invalid, or if `require_tool_call_ids` is True and any Tool Call is missing an
  `id` attribute.

#### from_openai_dict_format

```python
from_openai_dict_format(message: dict[str, Any]) -> ChatMessage
```

Create a ChatMessage from a dictionary in the format expected by OpenAI's Chat API.

NOTE: While OpenAI's API requires `tool_call_id` in both tool calls and tool messages, this method
accepts messages without it to support shallow OpenAI-compatible APIs.
If you plan to use the resulting ChatMessage with OpenAI, you must include `tool_call_id` or you'll
encounter validation errors.

**Parameters:**

- **message** (<code>dict\[str, Any\]</code>) – The OpenAI dictionary to build the ChatMessage object.

**Returns:**

- <code>ChatMessage</code> – The created ChatMessage object.

**Raises:**

- <code>ValueError</code> – If the message dictionary is missing required fields.

## document

### Document

Base data class containing some data to be queried.

Can contain text snippets and file paths to images or audio files. Documents can be sorted by score and saved
to/from dictionary and JSON.

**Parameters:**

- **id** (<code>str</code>) – Unique identifier for the document. When not set, it's generated based on the Document fields' values.
- **content** (<code>str | None</code>) – Text of the document, if the document contains text.
- **blob** (<code>ByteStream | None</code>) – Binary data associated with the document, if the document has any binary data associated with it.
- **meta** (<code>dict\[str, Any\]</code>) – Additional custom metadata for the document. Must be JSON-serializable.
- **score** (<code>float | None</code>) – Score of the document. Used for ranking, usually assigned by retrievers.
- **embedding** (<code>list\[float\] | None</code>) – Dense vector representation of the document.
- **sparse_embedding** (<code>SparseEmbedding | None</code>) – Sparse vector representation of the document.

#### to_dict

```python
to_dict(flatten: bool = True) -> dict[str, Any]
```

Converts Document into a dictionary.

The `blob` field is converted to a JSON-serializable type.

**Parameters:**

- **flatten** (<code>bool</code>) – Whether to flatten the `meta` field. Defaults to `True` to be backward-compatible with Haystack 1.x.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> Document
```

Creates a new Document object from a dictionary.

The `blob` field is converted to its original type.

#### content_type

```python
content_type
```

Returns the type of the content for the document.

This is necessary to keep backward compatibility with 1.x.

## file_content

### FileContent

The file content of a chat message.

**Parameters:**

- **base64_data** (<code>str</code>) – A base64 string representing the file.
- **mime_type** (<code>str | None</code>) – The MIME type of the file (e.g. "application/pdf").
  Providing this value is recommended, as most LLM providers require it.
  If not provided, the MIME type is guessed from the base64 string, which can be slow and not always reliable.
- **filename** (<code>str | None</code>) – Optional filename of the file. Some LLM providers use this information.
- **extra** (<code>dict\[str, Any\]</code>) – Dictionary of extra information about the file. Can be used to store provider-specific information.
  To avoid serialization issues, values should be JSON serializable.
- **validation** (<code>bool</code>) – If True (default), a validation process is performed:
  - Check whether the base64 string is valid;
  - Guess the MIME type if not provided.

  Set to False to skip validation and speed up initialization.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert FileContent into a dictionary.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> FileContent
```

Create a FileContent from a dictionary.

#### from_file_path

```python
from_file_path(
    file_path: str | Path,
    *,
    filename: str | None = None,
    extra: dict[str, Any] | None = None
) -> FileContent
```

Create a FileContent object from a file path.

**Parameters:**

- **file_path** (<code>str | Path</code>) – The path to the file.
- **filename** (<code>str | None</code>) – Optional file name. Some LLM providers use this information. If not provided, the filename is extracted
  from the file path.
- **extra** (<code>dict\[str, Any\] | None</code>) – Dictionary of extra information about the file. Can be used to store provider-specific information.
  To avoid serialization issues, values should be JSON serializable.

**Returns:**

- <code>FileContent</code> – A FileContent object.

#### from_url

```python
from_url(
    url: str,
    *,
    retry_attempts: int = 2,
    timeout: int = 10,
    filename: str | None = None,
    extra: dict[str, Any] | None = None
) -> FileContent
```

Create a FileContent object from a URL. The file is downloaded and converted to a base64 string.

**Parameters:**

- **url** (<code>str</code>) – The URL of the file.
- **retry_attempts** (<code>int</code>) – The number of times to retry to fetch the URL's content.
- **timeout** (<code>int</code>) – Timeout in seconds for the request.
- **filename** (<code>str | None</code>) – Optional filename of the file. Some LLM providers use this information. If not provided, the filename is
  extracted from the URL.
- **extra** (<code>dict\[str, Any\] | None</code>) – Dictionary of extra information about the file. Can be used to store provider-specific information.
  To avoid serialization issues, values should be JSON serializable.

**Returns:**

- <code>FileContent</code> – A FileContent object.

## image_content

### ImageContent

The image content of a chat message.

**Parameters:**

- **base64_image** (<code>str</code>) – A base64 string representing the image.
- **mime_type** (<code>str | None</code>) – The MIME type of the image (e.g. "image/png", "image/jpeg").
  Providing this value is recommended, as most LLM providers require it.
  If not provided, the MIME type is guessed from the base64 string, which can be slow and not always reliable.
- **detail** (<code>Literal['auto', 'high', 'low'] | None</code>) – Optional detail level of the image (only supported by OpenAI). One of "auto", "high", or "low".
- **meta** (<code>dict\[str, Any\]</code>) – Optional metadata for the image.
- **validation** (<code>bool</code>) – If True (default), a validation process is performed:
  - Check whether the base64 string is valid;
  - Guess the MIME type if not provided;
  - Check if the MIME type is a valid image MIME type.

  Set to False to skip validation and speed up initialization.

#### show

```python
show() -> None
```

Shows the image.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert ImageContent into a dictionary.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ImageContent
```

Create an ImageContent from a dictionary.
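
The three checks listed under the `validation` parameter of ImageContent can be sketched with the standard library. `validate_base64_image` and `is_valid` are hypothetical helpers, and the small magic-byte table is an assumption standing in for whatever MIME detection the library actually uses.

```python
from __future__ import annotations

import base64
import binascii

# Assumption: a tiny magic-byte table stands in for real MIME type detection.
_IMAGE_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF8": "image/gif",
}


def validate_base64_image(base64_image: str, mime_type: str | None = None) -> str:
    """Hypothetical helper: return the MIME type or raise ValueError."""
    # Check 1: the string must be valid base64.
    try:
        raw = base64.b64decode(base64_image, validate=True)
    except binascii.Error as exc:
        raise ValueError("The base64 string is not valid.") from exc

    # Check 2: guess the MIME type only when the caller did not provide one.
    if mime_type is None:
        for signature, guessed in _IMAGE_SIGNATURES.items():
            if raw.startswith(signature):
                mime_type = guessed
                break

    # Check 3: the MIME type must be an image type.
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a valid image MIME type: {mime_type!r}")
    return mime_type


def is_valid(base64_image: str, mime_type: str | None = None) -> bool:
    """Convenience wrapper that reports validity instead of raising."""
    try:
        validate_base64_image(base64_image, mime_type)
    except ValueError:
        return False
    return True


png_b64 = base64.b64encode(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8).decode("ascii")
pdf_b64 = base64.b64encode(b"%PDF-1.7 demo").decode("ascii")
```

Decoding and sniffing the payload is the part that costs time on large images, which is consistent with the documented advice to pass `mime_type` explicitly or to disable validation when the input is already trusted.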

#### from_file_path

```python
from_file_path(
    file_path: str | Path,
    *,
    size: tuple[int, int] | None = None,
    detail: Literal["auto", "high", "low"] | None = None,
    meta: dict[str, Any] | None = None
) -> ImageContent
```

Create an ImageContent object from a file path.

It exposes similar functionality as the `ImageFileToImageContent` component. For PDF to ImageContent conversion,
use the `PDFToImageContent` component.

**Parameters:**

- **file_path** (<code>str | Path</code>) – The path to the image file. PDF files are not supported. For PDF to ImageContent conversion, use the
  `PDFToImageContent` component.
- **size** (<code>tuple\[int, int\] | None</code>) – If provided, resizes the image to fit within the specified dimensions (width, height) while
  maintaining aspect ratio. This reduces file size, memory usage, and processing time, which is beneficial
  when working with models that have resolution constraints or when transmitting images to remote services.
- **detail** (<code>Literal['auto', 'high', 'low'] | None</code>) – Optional detail level of the image (only supported by OpenAI). One of "auto", "high", or "low".
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata for the image.

**Returns:**

- <code>ImageContent</code> – An ImageContent object.

#### from_url

```python
from_url(
    url: str,
    *,
    retry_attempts: int = 2,
    timeout: int = 10,
    size: tuple[int, int] | None = None,
    detail: Literal["auto", "high", "low"] | None = None,
    meta: dict[str, Any] | None = None
) -> ImageContent
```

Create an ImageContent object from a URL. The image is downloaded and converted to a base64 string.

For PDF to ImageContent conversion, use the `PDFToImageContent` component.

**Parameters:**

- **url** (<code>str</code>) – The URL of the image. PDF files are not supported. For PDF to ImageContent conversion, use the
  `PDFToImageContent` component.
- **retry_attempts** (<code>int</code>) – The number of times to retry fetching the URL's content.
- **timeout** (<code>int</code>) – Timeout in seconds for the request.
- **size** (<code>tuple\[int, int\] | None</code>) – If provided, resizes the image to fit within the specified dimensions (width, height) while
  maintaining aspect ratio. This reduces file size, memory usage, and processing time, which is beneficial
  when working with models that have resolution constraints or when transmitting images to remote services.
- **detail** (<code>Literal['auto', 'high', 'low'] | None</code>) – Optional detail level of the image (only supported by OpenAI). One of "auto", "high", or "low".
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata for the image.

**Returns:**

- <code>ImageContent</code> – An ImageContent object.

**Raises:**

- <code>ValueError</code> – If the URL does not point to an image or if it points to a PDF file.

## sparse_embedding

### SparseEmbedding

Class representing a sparse embedding.

**Parameters:**

- **indices** (<code>list\[int\]</code>) – List of indices of non-zero elements in the embedding.
- **values** (<code>list\[float\]</code>) – List of values of non-zero elements in the embedding.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert the SparseEmbedding object to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Serialized sparse embedding.
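
To make the `indices`/`values` layout and its dictionary form concrete, here is a minimal sketch using only the standard library. `SparseVector` and `from_dense` are hypothetical stand-ins, not the library's class. The equal-length check and the exact dictionary keys are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class SparseVector:
    """Hypothetical stand-in with the two documented fields."""

    indices: list[int]
    values: list[float]

    def __post_init__(self) -> None:
        # Assumption: each index needs exactly one matching value.
        if len(self.indices) != len(self.values):
            raise ValueError("indices and values must have the same length")

    def to_dict(self) -> dict[str, Any]:
        # Plain lists of ints and floats are JSON serializable.
        return asdict(self)


def from_dense(dense: list[float]) -> SparseVector:
    """Keep only the non-zero entries of a dense vector."""
    pairs = [(i, v) for i, v in enumerate(dense) if v != 0.0]
    return SparseVector(
        indices=[i for i, _ in pairs],
        values=[v for _, v in pairs],
    )


dense = [0.0, 0.5, 0.0, 0.0, 1.25]
sparse = from_dense(dense)
payload = sparse.to_dict()
```

Only positions 1 and 4 are non-zero in `dense`, so the sparse form stores two index/value pairs instead of five floats.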

#### from_dict

```python
from_dict(sparse_embedding_dict: dict[str, Any]) -> SparseEmbedding
```

Deserializes the sparse embedding from a dictionary.

**Parameters:**

- **sparse_embedding_dict** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.

**Returns:**

- <code>SparseEmbedding</code> – Deserialized sparse embedding.

## streaming_chunk

### ToolCallDelta

Represents a Tool call prepared by the model, usually contained in an assistant message.

**Parameters:**

- **index** (<code>int</code>) – The index of the Tool call in the list of Tool calls.
- **tool_name** (<code>str | None</code>) – The name of the Tool to call.
- **arguments** (<code>str | None</code>) – Either the full arguments in JSON format or a delta of the arguments.
- **id** (<code>str | None</code>) – The ID of the Tool call.
- **extra** (<code>dict\[str, Any\] | None</code>) – Dictionary of extra information about the Tool call. Can be used to store provider-specific
  information. To avoid serialization issues, values should be JSON serializable.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the ToolCallDelta.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary with keys 'index', 'tool_name', 'arguments', 'id', and 'extra'.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ToolCallDelta
```

Creates a ToolCallDelta from a serialized representation.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary containing ToolCallDelta's attributes.

**Returns:**

- <code>ToolCallDelta</code> – A ToolCallDelta instance.

### ComponentInfo

The `ComponentInfo` class encapsulates information about a component.

**Parameters:**

- **type** (<code>str</code>) – The type of the component.
- **name** (<code>str | None</code>) – The name of the component assigned when adding it to a pipeline.

#### from_component

```python
from_component(component: Component) -> ComponentInfo
```

Create a `ComponentInfo` object from a `Component` instance.

**Parameters:**

- **component** (<code>Component</code>) – The `Component` instance.

**Returns:**

- <code>ComponentInfo</code> – The `ComponentInfo` object with the type and name of the given component.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Returns a dictionary representation of ComponentInfo.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary with keys 'type' and 'name'.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ComponentInfo
```

Creates a ComponentInfo from a serialized representation.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary containing ComponentInfo's attributes.

**Returns:**

- <code>ComponentInfo</code> – A ComponentInfo instance.

### StreamingChunk

The `StreamingChunk` class encapsulates a segment of streamed content along with associated metadata.

This structure facilitates the handling and processing of streamed data in a systematic manner.

**Parameters:**

- **content** (<code>str</code>) – The content of the message chunk as a string.
- **meta** (<code>dict\[str, Any\]</code>) – A dictionary containing metadata related to the message chunk.
- **component_info** (<code>ComponentInfo | None</code>) – A `ComponentInfo` object containing information about the component that generated the chunk,
  such as the component name and type.
- **index** (<code>int | None</code>) – An optional integer index representing which content block this chunk belongs to.
- **tool_calls** (<code>list\[ToolCallDelta\] | None</code>) – An optional list of ToolCallDelta objects representing tool calls associated with the message
  chunk.
- **tool_call_result** (<code>ToolCallResult | None</code>) – An optional ToolCallResult object representing the result of a tool call.
- **start** (<code>bool</code>) – A boolean indicating whether this chunk marks the start of a content block.
- **finish_reason** (<code>FinishReason | None</code>) – An optional value indicating the reason the generation finished.
  Standard values follow OpenAI's convention: "stop", "length", "tool_calls", "content_filter",
  plus the Haystack-specific value "tool_call_results".
- **reasoning** (<code>ReasoningContent | None</code>) – An optional ReasoningContent object representing the reasoning content associated
  with the message chunk.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the StreamingChunk.

**Returns:**

- <code>dict\[str, Any\]</code> – Serialized dictionary representation of the calling object.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> StreamingChunk
```

Creates a deserialized StreamingChunk instance from a serialized representation.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary containing the StreamingChunk's attributes.

**Returns:**

- <code>StreamingChunk</code> – A StreamingChunk instance.
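
The `index`, `start`, and `finish_reason` fields described for `StreamingChunk` suggest how a consumer might reassemble a stream. The following is a rough sketch under stated assumptions, not library code: `Chunk` is a hypothetical stand-in carrying only four of the documented fields, and `assemble` is an illustrative helper. Treating a missing `index` as block 0 is also an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in with a subset of the documented fields."""

    content: str
    index: int | None = None
    start: bool = False
    finish_reason: str | None = None


def assemble(chunks: list[Chunk]) -> tuple[dict[int, str], str | None]:
    """Join chunk contents per content block and report the finish reason."""
    blocks: dict[int, str] = {}
    finish_reason = None
    for chunk in chunks:
        # Assumption: chunks without an index belong to block 0.
        key = chunk.index if chunk.index is not None else 0
        if chunk.start:
            # A start chunk opens a fresh content block.
            blocks[key] = ""
        blocks[key] = blocks.get(key, "") + chunk.content
        if chunk.finish_reason is not None:
            finish_reason = chunk.finish_reason
    return blocks, finish_reason


stream = [
    Chunk("Hel", index=0, start=True),
    Chunk("lo", index=0),
    Chunk("Bye", index=1, start=True),
    Chunk("", index=1, finish_reason="stop"),
]
blocks, reason = assemble(stream)
```

Grouping by `index` keeps interleaved content blocks separate, and the last non-empty `finish_reason` tells the consumer why generation ended.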

### select_streaming_callback

```python
select_streaming_callback(
    init_callback: StreamingCallbackT | None,
    runtime_callback: StreamingCallbackT | None,
    requires_async: bool,
) -> StreamingCallbackT | None
```

Picks the correct streaming callback given an optional initial and runtime callback.

The runtime callback takes precedence over the initial callback.

**Parameters:**

- **init_callback** (<code>StreamingCallbackT | None</code>) – The initial callback.
- **runtime_callback** (<code>StreamingCallbackT | None</code>) – The runtime callback.
- **requires_async** (<code>bool</code>) – Whether the selected callback must be async compatible.

**Returns:**

- <code>StreamingCallbackT | None</code> – The selected callback.
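
The precedence rule above can be sketched in a few lines. `pick_callback` below is a hypothetical re-implementation for illustration, not the library function. The source states only that the runtime callback wins and that `requires_async` constrains the result, so raising `ValueError` on a sync/async mismatch is an assumption.

```python
from __future__ import annotations

import inspect
from typing import Any, Callable

Callback = Callable[[Any], Any]


def pick_callback(
    init_callback: Callback | None,
    runtime_callback: Callback | None,
    requires_async: bool,
) -> Callback | None:
    """Hypothetical sketch: prefer the runtime callback, then check async-ness."""
    selected = runtime_callback if runtime_callback is not None else init_callback
    if selected is None:
        return None
    is_async = inspect.iscoroutinefunction(selected)
    # Assumption: a mismatch between the callback and the context is an error.
    if requires_async and not is_async:
        raise ValueError("an async-compatible callback is required")
    if not requires_async and is_async:
        raise ValueError("a coroutine callback cannot be used in a sync context")
    return selected


def on_chunk_init(chunk: Any) -> None:
    """Callback configured when the component is created."""


def on_chunk_runtime(chunk: Any) -> None:
    """Callback passed for a single run."""


async def on_chunk_async(chunk: Any) -> None:
    """Coroutine callback for async runs."""


chosen = pick_callback(on_chunk_init, on_chunk_runtime, requires_async=False)
```

Here `chosen` is `on_chunk_runtime`, because a callback supplied at run time overrides the one set at initialization. Passing `None` for the runtime callback falls back to `on_chunk_init`.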