---
title: "Data Classes"
id: data-classes-api
description: "Core classes that carry data through the system."
slug: "/data-classes-api"
---

## answer

### ExtractedAnswer

Holds an answer extracted by an extractive Reader (query, score, text, and optional document/context).

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serialize the object to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Serialized dictionary representation of the object.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ExtractedAnswer
```

Deserialize the object from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the object.

**Returns:**

- <code>ExtractedAnswer</code> – Deserialized object.

### GeneratedAnswer

Holds a generated answer from a Generator (answer text, query, referenced documents, and metadata).

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serialize the object to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Serialized dictionary representation of the object.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> GeneratedAnswer
```

Deserialize the object from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the object.

**Returns:**

- <code>GeneratedAnswer</code> – Deserialized object.

## breakpoints

### Breakpoint

A dataclass to hold a breakpoint for a component.

**Parameters:**

- **component_name** (<code>str</code>) – The name of the component where the breakpoint is set.
- **visit_count** (<code>int</code>) – The number of times the component must be visited before the breakpoint is triggered.
- **snapshot_file_path** (<code>str | None</code>) – Optional path to store a snapshot of the pipeline when the breakpoint is hit.
  This is useful for debugging purposes, allowing you to inspect the state of the pipeline at the time of the
  breakpoint and to resume execution from that point.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert the Breakpoint to a dictionary representation.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary containing the component name, visit count, and debug path.

#### from_dict

```python
from_dict(data: dict) -> Breakpoint
```

Populate the Breakpoint from a dictionary representation.

**Parameters:**

- **data** (<code>dict</code>) – A dictionary containing the component name, visit count, and debug path.

**Returns:**

- <code>Breakpoint</code> – An instance of Breakpoint.

### ToolBreakpoint

Bases: <code>Breakpoint</code>

A dataclass representing a breakpoint specific to tools used within an Agent component.

Inherits from Breakpoint and adds the ability to target individual tools. If `tool_name` is None,
the breakpoint applies to all tools within the Agent component.

**Parameters:**

- **tool_name** (<code>str | None</code>) – The name of the tool to target within the Agent component. If None, applies to all tools.

### AgentBreakpoint

A dataclass representing a breakpoint tied to an Agent’s execution.

This allows for debugging either a specific component (e.g., the chat generator) or a tool used by the agent.
It enforces constraints on which component names are valid for each breakpoint type.

**Parameters:**

- **agent_name** (<code>str</code>) – The name of the agent component in a pipeline where the breakpoint is set.
- **break_point** (<code>Breakpoint | ToolBreakpoint</code>) – An instance of Breakpoint or ToolBreakpoint indicating where to break execution.

**Raises:**

- <code>ValueError</code> – If the component_name is invalid for the given breakpoint type:
  - Breakpoint must have component_name='chat_generator'.
  - ToolBreakpoint must have component_name='tool_invoker'.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert the AgentBreakpoint to a dictionary representation.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary containing the agent name and the breakpoint details.

#### from_dict

```python
from_dict(data: dict) -> AgentBreakpoint
```

Populate the AgentBreakpoint from a dictionary representation.

**Parameters:**

- **data** (<code>dict</code>) – A dictionary containing the agent name and the breakpoint details.

**Returns:**

- <code>AgentBreakpoint</code> – An instance of AgentBreakpoint.

### AgentSnapshot

Snapshot of an Agent's state at a breakpoint (component inputs, visit counts, and breakpoint).

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert the AgentSnapshot to a dictionary representation.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary containing the agent state, timestamp, and breakpoint.

#### from_dict

```python
from_dict(data: dict) -> AgentSnapshot
```

Populate the AgentSnapshot from a dictionary representation.

**Parameters:**

- **data** (<code>dict</code>) – A dictionary containing the agent state, timestamp, and breakpoint.

**Returns:**

- <code>AgentSnapshot</code> – An instance of AgentSnapshot.

### PipelineState

A dataclass to hold the state of the pipeline at a specific point in time.

**Parameters:**

- **component_visits** (<code>dict\[str, int\]</code>) – A dictionary mapping component names to their visit counts.
- **inputs** (<code>dict\[str, Any\]</code>) – The inputs processed by the pipeline at the time of the snapshot.
- **pipeline_outputs** (<code>dict\[str, Any\]</code>) – Dictionary containing the final outputs of the pipeline up to the breakpoint.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert the PipelineState to a dictionary representation.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary containing the inputs, component visits,
  and pipeline outputs.

#### from_dict

```python
from_dict(data: dict) -> PipelineState
```

Populate the PipelineState from a dictionary representation.

**Parameters:**

- **data** (<code>dict</code>) – A dictionary containing the inputs, component visits,
  and pipeline outputs.

**Returns:**

- <code>PipelineState</code> – An instance of PipelineState.

### PipelineSnapshot

A dataclass to hold a snapshot of the pipeline at a specific point in time.

**Parameters:**

- **original_input_data** (<code>dict\[str, Any\]</code>) – The original input data provided to the pipeline.
- **ordered_component_names** (<code>list\[str\]</code>) – A list of component names in the order they were visited.
- **pipeline_state** (<code>PipelineState</code>) – The state of the pipeline at the time of the snapshot.
- **break_point** (<code>AgentBreakpoint | Breakpoint</code>) – The breakpoint that triggered the snapshot.
- **agent_snapshot** (<code>AgentSnapshot | None</code>) – Optional agent snapshot if the breakpoint is an agent breakpoint.
- **timestamp** (<code>datetime | None</code>) – A timestamp indicating when the snapshot was taken.
- **include_outputs_from** (<code>set\[str\]</code>) – Set of component names whose outputs should be included in the pipeline results.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert the PipelineSnapshot to a dictionary representation.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary containing the pipeline state, timestamp, breakpoint, agent snapshot, original input data,
  ordered component names, include_outputs_from, and pipeline outputs.

#### from_dict

```python
from_dict(data: dict) -> PipelineSnapshot
```

Populate the PipelineSnapshot from a dictionary representation.

**Parameters:**

- **data** (<code>dict</code>) – A dictionary containing the pipeline state, timestamp, breakpoint, agent snapshot, original input
  data, ordered component names, include_outputs_from, and pipeline outputs.

## byte_stream

### ByteStream

Base data class representing a binary object in the Haystack API.

**Parameters:**

- **data** (<code>bytes</code>) – The binary data stored in the ByteStream.
- **meta** (<code>dict\[str, Any\]</code>) – Additional metadata to be stored with the ByteStream.
- **mime_type** (<code>str | None</code>) – The MIME type of the binary data.

#### to_file

```python
to_file(destination_path: Path) -> None
```

Write the ByteStream to a file. Note: the metadata will be lost.

**Parameters:**

- **destination_path** (<code>Path</code>) – The path to write the ByteStream to.

#### from_file_path

```python
from_file_path(
    filepath: Path,
    mime_type: str | None = None,
    meta: dict[str, Any] | None = None,
    guess_mime_type: bool = False,
) -> ByteStream
```

Create a ByteStream from the contents read from a file.

**Parameters:**

- **filepath** (<code>Path</code>) – A valid path to a file.
- **mime_type** (<code>str | None</code>) – The MIME type of the file.
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata to be stored with the ByteStream.
- **guess_mime_type** (<code>bool</code>) – Whether to guess the MIME type from the file.

#### from_string

```python
from_string(
    text: str,
    encoding: str = "utf-8",
    mime_type: str | None = None,
    meta: dict[str, Any] | None = None,
) -> ByteStream
```

Create a ByteStream by encoding a string.

**Parameters:**

- **text** (<code>str</code>) – The string to encode.
- **encoding** (<code>str</code>) – The encoding used to convert the string into bytes.
- **mime_type** (<code>str | None</code>) – The MIME type of the file.
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata to be stored with the ByteStream.

#### to_string

```python
to_string(encoding: str = 'utf-8') -> str
```

Convert the ByteStream to a string. Metadata is not included.

**Parameters:**

- **encoding** (<code>str</code>) – The encoding used to convert the bytes to a string. Defaults to "utf-8".

**Returns:**

- <code>str</code> – The string representation of the ByteStream.

**Raises:**

- <code>UnicodeDecodeError</code> – If the ByteStream data cannot be decoded with the specified encoding.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert the ByteStream to a dictionary representation.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary with keys 'data', 'meta', and 'mime_type'.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ByteStream
```

Create a ByteStream from a dictionary representation.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – A dictionary with keys 'data', 'meta', and 'mime_type'.

**Returns:**

- <code>ByteStream</code> – A ByteStream instance.

## chat_message

### ChatRole

Bases: <code>str</code>, <code>Enum</code>

Enumeration representing the roles within a chat.

#### from_str

```python
from_str(string: str) -> ChatRole
```

Convert a string to a ChatRole enum.

### TextContent

The textual content of a chat message.

**Parameters:**

- **text** (<code>str</code>) – The text content of the message.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert TextContent into a dictionary.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> TextContent
```

Create a TextContent from a dictionary.

### ToolCall

Represents a Tool call prepared by the model, usually contained in an assistant message.

**Parameters:**

- **id** (<code>str | None</code>) – The ID of the Tool call.
- **tool_name** (<code>str</code>) – The name of the Tool to call.
- **arguments** (<code>dict\[str, Any\]</code>) – The arguments to call the Tool with.
- **extra** (<code>dict\[str, Any\] | None</code>) – Dictionary of extra information about the Tool call. Use to store provider-specific
  information. To avoid serialization issues, values should be JSON serializable.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert ToolCall into a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary with keys 'tool_name', 'arguments', 'id', and 'extra'.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ToolCall
```

Creates a new ToolCall object from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary to build the ToolCall object.

**Returns:**

- <code>ToolCall</code> – The created object.

### ToolCallResult

Represents the result of a Tool invocation.

**Parameters:**

- **result** (<code>ToolCallResultContentT</code>) – The result of the Tool invocation.
- **origin** (<code>ToolCall</code>) – The Tool call that produced this result.
- **error** (<code>bool</code>) – Whether the Tool invocation resulted in an error.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Converts ToolCallResult into a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary with keys 'result', 'origin', and 'error'.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ToolCallResult
```

Creates a ToolCallResult from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary to build the ToolCallResult object.

**Returns:**

- <code>ToolCallResult</code> – The created object.

### ReasoningContent

Represents the optional reasoning content prepared by the model, usually contained in an assistant message.

**Parameters:**

- **reasoning_text** (<code>str</code>) – The reasoning text produced by the model.
- **extra** (<code>dict\[str, Any\]</code>) – Dictionary of extra information about the reasoning content. Use to store provider-specific
  information. To avoid serialization issues, values should be JSON serializable.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert ReasoningContent into a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary with keys 'reasoning_text' and 'extra'.
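
The two-key dictionary shape described above can be illustrated with a stand-in dataclass. This is a minimal sketch and not Haystack's implementation: `ReasoningContentSketch` is a hypothetical name, and the only assumption taken from this page is that the serialized form has the keys 'reasoning_text' and 'extra'.

```python
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class ReasoningContentSketch:
    """Hypothetical stand-in mirroring the documented fields."""

    reasoning_text: str
    extra: dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> dict[str, Any]:
        # Produces the two documented keys: 'reasoning_text' and 'extra'.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ReasoningContentSketch":
        # Treating a missing 'extra' as empty is an assumption of this sketch.
        return cls(reasoning_text=data["reasoning_text"], extra=data.get("extra", {}))


original = ReasoningContentSketch("Compare both options first.", extra={"provider": "example"})
restored = ReasoningContentSketch.from_dict(original.to_dict())
```

Keeping the values in `extra` JSON serializable, as the parameter description recommends, is what lets a round trip like this survive being written to disk or sent over the wire.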

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ReasoningContent
```

Creates a new ReasoningContent object from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary to build the ReasoningContent object.

**Returns:**

- <code>ReasoningContent</code> – The created object.

### ChatMessage

Represents a message in an LLM chat conversation.

Use the `from_assistant`, `from_user`, `from_system`, and `from_tool` class methods to create a ChatMessage.

#### role

```python
role: ChatRole
```

Returns the role of the entity sending the message.

#### meta

```python
meta: dict[str, Any]
```

Returns the metadata associated with the message.

#### name

```python
name: str | None
```

Returns the name associated with the message.

#### texts

```python
texts: list[str]
```

Returns the list of all texts contained in the message.

#### text

```python
text: str | None
```

Returns the first text contained in the message.

#### tool_calls

```python
tool_calls: list[ToolCall]
```

Returns the list of all Tool calls contained in the message.

#### tool_call

```python
tool_call: ToolCall | None
```

Returns the first Tool call contained in the message.

#### tool_call_results

```python
tool_call_results: list[ToolCallResult]
```

Returns the list of all Tool call results contained in the message.

#### tool_call_result

```python
tool_call_result: ToolCallResult | None
```

Returns the first Tool call result contained in the message.

#### images

```python
images: list[ImageContent]
```

Returns the list of all images contained in the message.
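
The accessors on this class come in pairs: a plural property that returns every content part of one kind, and a singular property that returns the first such part or `None`. The sketch below shows that pattern with hypothetical stand-in classes (`MessageSketch`, `TextPart`, `ImagePart`). How Haystack stores content parts internally is not stated here, so the `parts` list is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TextPart:
    text: str


@dataclass
class ImagePart:
    base64_image: str


@dataclass
class MessageSketch:
    """Hypothetical message holding a mixed list of content parts."""

    parts: list = field(default_factory=list)

    @property
    def texts(self) -> list[str]:
        # Plural accessor: every text, in the order the parts were added.
        return [p.text for p in self.parts if isinstance(p, TextPart)]

    @property
    def text(self) -> str | None:
        # Singular accessor: the first text, or None when there is none.
        return self.texts[0] if self.texts else None

    @property
    def images(self) -> list[ImagePart]:
        return [p for p in self.parts if isinstance(p, ImagePart)]

    @property
    def image(self) -> ImagePart | None:
        return self.images[0] if self.images else None


message = MessageSketch(parts=[TextPart("Describe this."), ImagePart("aGVsbG8="), TextPart("Briefly.")])
empty = MessageSketch()
```

Code that expects at most one part of a kind can use the singular form and check for `None`; code that handles multimodal messages should iterate the plural form.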

#### image

```python
image: ImageContent | None
```

Returns the first image contained in the message.

#### files

```python
files: list[FileContent]
```

Returns the list of all files contained in the message.

#### file

```python
file: FileContent | None
```

Returns the first file contained in the message.

#### reasonings

```python
reasonings: list[ReasoningContent]
```

Returns the list of all reasoning contents contained in the message.

#### reasoning

```python
reasoning: ReasoningContent | None
```

Returns the first reasoning content contained in the message.

#### is_from

```python
is_from(role: ChatRole | str) -> bool
```

Check if the message is from a specific role.

**Parameters:**

- **role** (<code>ChatRole | str</code>) – The role to check against.

**Returns:**

- <code>bool</code> – True if the message is from the specified role, False otherwise.

#### from_user

```python
from_user(
    text: str | None = None,
    meta: dict[str, Any] | None = None,
    name: str | None = None,
    *,
    content_parts: (
        Sequence[TextContent | str | ImageContent | FileContent] | None
    ) = None
) -> ChatMessage
```

Create a message from the user.

**Parameters:**

- **text** (<code>str | None</code>) – The text content of the message. Specify this or content_parts.
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata associated with the message.
- **name** (<code>str | None</code>) – An optional name for the participant. This field is only supported by OpenAI.
- **content_parts** (<code>Sequence\[TextContent | str | ImageContent | FileContent\] | None</code>) – A list of content parts to include in the message. Specify this or text.

**Returns:**

- <code>ChatMessage</code> – A new ChatMessage instance.

**Raises:**

- <code>ValueError</code> – If neither or both of text and content_parts are provided, or if content_parts is empty.
- <code>TypeError</code> – If a content part is not a str, TextContent, ImageContent, or FileContent.

#### from_system

```python
from_system(
    text: str, meta: dict[str, Any] | None = None, name: str | None = None
) -> ChatMessage
```

Create a message from the system.

**Parameters:**

- **text** (<code>str</code>) – The text content of the message.
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata associated with the message.
- **name** (<code>str | None</code>) – An optional name for the participant. This field is only supported by OpenAI.

**Returns:**

- <code>ChatMessage</code> – A new ChatMessage instance.

#### from_assistant

```python
from_assistant(
    text: str | None = None,
    meta: dict[str, Any] | None = None,
    name: str | None = None,
    tool_calls: list[ToolCall] | None = None,
    *,
    reasoning: str | ReasoningContent | None = None
) -> ChatMessage
```

Create a message from the assistant.

**Parameters:**

- **text** (<code>str | None</code>) – The text content of the message.
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata associated with the message.
- **name** (<code>str | None</code>) – An optional name for the participant. This field is only supported by OpenAI.
- **tool_calls** (<code>list\[ToolCall\] | None</code>) – The Tool calls to include in the message.
- **reasoning** (<code>str | ReasoningContent | None</code>) – The reasoning content to include in the message.

**Returns:**

- <code>ChatMessage</code> – A new ChatMessage instance.

**Raises:**

- <code>TypeError</code> – If `reasoning` is not a string or ReasoningContent object.

#### from_tool

```python
from_tool(
    tool_result: ToolCallResultContentT,
    origin: ToolCall,
    error: bool = False,
    meta: dict[str, Any] | None = None,
) -> ChatMessage
```

Create a message from a Tool.

**Parameters:**

- **tool_result** (<code>ToolCallResultContentT</code>) – The result of the Tool invocation.
- **origin** (<code>ToolCall</code>) – The Tool call that produced this result.
- **error** (<code>bool</code>) – Whether the Tool invocation resulted in an error.
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata associated with the message.

**Returns:**

- <code>ChatMessage</code> – A new ChatMessage instance.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Converts ChatMessage into a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Serialized version of the object.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ChatMessage
```

Creates a new ChatMessage object from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary to build the ChatMessage object.

**Returns:**

- <code>ChatMessage</code> – The created object.

**Raises:**

- <code>ValueError</code> – If the `role` field is missing from the dictionary.
- <code>TypeError</code> – If the `content` field is not a list or string.

#### to_openai_dict_format

```python
to_openai_dict_format(require_tool_call_ids: bool = True) -> dict[str, Any]
```

Convert a ChatMessage to the dictionary format expected by OpenAI's Chat Completions API.

**Parameters:**

- **require_tool_call_ids** (<code>bool</code>) – If True (default), enforces that each Tool Call includes a non-null `id` attribute.
  Set to False to allow Tool Calls without `id`, which may be suitable for shallow OpenAI-compatible APIs.

**Returns:**

- <code>dict\[str, Any\]</code> – The ChatMessage in the format expected by OpenAI's Chat Completions API.

**Raises:**

- <code>ValueError</code> – If the message format is invalid, or if `require_tool_call_ids` is True and any Tool Call is missing an
  `id` attribute.

#### from_openai_dict_format

```python
from_openai_dict_format(message: dict[str, Any]) -> ChatMessage
```

Create a ChatMessage from a dictionary in the format expected by OpenAI's Chat API.

NOTE: While OpenAI's API requires `tool_call_id` in both tool calls and tool messages, this method
accepts messages without it to support shallow OpenAI-compatible APIs.
If you plan to use the resulting ChatMessage with OpenAI, you must include `tool_call_id` or you'll
encounter validation errors.

**Parameters:**

- **message** (<code>dict\[str, Any\]</code>) – The OpenAI dictionary to build the ChatMessage object.

**Returns:**

- <code>ChatMessage</code> – The created ChatMessage object.

**Raises:**

- <code>ValueError</code> – If the message dictionary is missing required fields.

## document

### Document

Base data class containing some data to be queried.

Can contain text snippets and file paths to images or audio files. Documents can be sorted by score and saved
to/from dictionary and JSON.

**Parameters:**

- **id** (<code>str</code>) – Unique identifier for the document. When not set, it's generated based on the Document fields' values.
- **content** (<code>str | None</code>) – Text of the document, if the document contains text.
- **blob** (<code>ByteStream | None</code>) – Binary data associated with the document, if the document has any binary data associated with it.
- **meta** (<code>dict\[str, Any\]</code>) – Additional custom metadata for the document. Must be JSON-serializable.
- **score** (<code>float | None</code>) – Score of the document. Used for ranking, usually assigned by retrievers.
- **embedding** (<code>list\[float\] | None</code>) – Dense vector representation of the document.
- **sparse_embedding** (<code>SparseEmbedding | None</code>) – Sparse vector representation of the document.

#### to_dict

```python
to_dict(flatten: bool = True) -> dict[str, Any]
```

Converts Document into a dictionary.

The `blob` field is converted to a JSON-serializable type.

**Parameters:**

- **flatten** (<code>bool</code>) – Whether to flatten the `meta` field. Defaults to `True` to be backward-compatible with Haystack 1.x.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> Document
```

Creates a new Document object from a dictionary.

The `blob` field is converted to its original type.

#### content_type

```python
content_type: str
```

Returns the type of the content for the document.

This is necessary to keep backward compatibility with 1.x.

## file_content

### FileContent

The file content of a chat message.

**Parameters:**

- **base64_data** (<code>str</code>) – A base64 string representing the file.
- **mime_type** (<code>str | None</code>) – The MIME type of the file (e.g. "application/pdf").
  Providing this value is recommended, as most LLM providers require it.
  If not provided, the MIME type is guessed from the base64 string, which can be slow and not always reliable.
- **filename** (<code>str | None</code>) – Optional filename of the file.
  Some LLM providers use this information.
- **extra** (<code>dict\[str, Any\]</code>) – Dictionary of extra information about the file. Can be used to store provider-specific information.
  To avoid serialization issues, values should be JSON serializable.
- **validation** (<code>bool</code>) – If True (default), a validation process is performed:
  - Check whether the base64 string is valid;
  - Guess the MIME type if not provided.
  Set to False to skip validation and speed up initialization.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert FileContent into a dictionary.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> FileContent
```

Create a FileContent from a dictionary.

#### from_file_path

```python
from_file_path(
    file_path: str | Path,
    *,
    filename: str | None = None,
    extra: dict[str, Any] | None = None
) -> FileContent
```

Create a FileContent object from a file path.

**Parameters:**

- **file_path** (<code>str | Path</code>) – The path to the file.
- **filename** (<code>str | None</code>) – Optional file name. Some LLM providers use this information. If not provided, the filename is extracted
  from the file path.
- **extra** (<code>dict\[str, Any\] | None</code>) – Dictionary of extra information about the file. Can be used to store provider-specific information.
  To avoid serialization issues, values should be JSON serializable.

**Returns:**

- <code>FileContent</code> – A FileContent object.

#### from_url

```python
from_url(
    url: str,
    *,
    retry_attempts: int = 2,
    timeout: int = 10,
    filename: str | None = None,
    extra: dict[str, Any] | None = None
) -> FileContent
```

Create a FileContent object from a URL. The file is downloaded and converted to a base64 string.

**Parameters:**

- **url** (<code>str</code>) – The URL of the file.
- **retry_attempts** (<code>int</code>) – The number of times to retry to fetch the URL's content.
- **timeout** (<code>int</code>) – Timeout in seconds for the request.
- **filename** (<code>str | None</code>) – Optional filename of the file. Some LLM providers use this information. If not provided, the filename is
  extracted from the URL.
- **extra** (<code>dict\[str, Any\] | None</code>) – Dictionary of extra information about the file. Can be used to store provider-specific information.
  To avoid serialization issues, values should be JSON serializable.

**Returns:**

- <code>FileContent</code> – A FileContent object.

## image_content

### ImageContent

The image content of a chat message.

**Parameters:**

- **base64_image** (<code>str</code>) – A base64 string representing the image.
- **mime_type** (<code>str | None</code>) – The MIME type of the image (e.g. "image/png", "image/jpeg").
  Providing this value is recommended, as most LLM providers require it.
  If not provided, the MIME type is guessed from the base64 string, which can be slow and not always reliable.
- **detail** (<code>Literal['auto', 'high', 'low'] | None</code>) – Optional detail level of the image (only supported by OpenAI). One of "auto", "high", or "low".
- **meta** (<code>dict\[str, Any\]</code>) – Optional metadata for the image.
- **validation** (<code>bool</code>) – If True (default), a validation process is performed:
  - Check whether the base64 string is valid;
  - Guess the MIME type if not provided;
  - Check if the MIME type is a valid image MIME type.
  Set to False to skip validation and speed up initialization.

#### show

```python
show() -> None
```

Shows the image.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert ImageContent into a dictionary.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ImageContent
```

Create an ImageContent from a dictionary.

#### from_file_path

```python
from_file_path(
    file_path: str | Path,
    *,
    size: tuple[int, int] | None = None,
    detail: Literal["auto", "high", "low"] | None = None,
    meta: dict[str, Any] | None = None
) -> ImageContent
```

Create an ImageContent object from a file path.

It exposes functionality similar to the `ImageFileToImageContent` component. For PDF to ImageContent conversion,
use the `PDFToImageContent` component.

**Parameters:**

- **file_path** (<code>str | Path</code>) – The path to the image file. PDF files are not supported. For PDF to ImageContent conversion, use the
  `PDFToImageContent` component.
- **size** (<code>tuple\[int, int\] | None</code>) – If provided, resizes the image to fit within the specified dimensions (width, height) while
  maintaining aspect ratio. This reduces file size, memory usage, and processing time, which is beneficial
  when working with models that have resolution constraints or when transmitting images to remote services.
- **detail** (<code>Literal['auto', 'high', 'low'] | None</code>) – Optional detail level of the image (only supported by OpenAI). One of "auto", "high", or "low".
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata for the image.

**Returns:**

- <code>ImageContent</code> – An ImageContent object.
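
To make the `size` behavior concrete, the arithmetic behind "fit within the specified dimensions while maintaining aspect ratio" can be sketched as below. This illustrates the calculation only and is not Haystack's resizing code: `fit_within` is a hypothetical helper, and both the rounding and the choice never to upscale smaller images are assumptions.

```python
def fit_within(original: tuple[int, int], size: tuple[int, int]) -> tuple[int, int]:
    """Scale (width, height) to fit inside `size` while keeping the aspect ratio."""
    width, height = original
    max_width, max_height = size
    # The tighter of the two ratios decides the scale; capping it at 1.0
    # means smaller images are left untouched (an assumption of this sketch).
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


landscape = fit_within((4000, 3000), (1024, 1024))  # (1024, 768)
small = fit_within((800, 600), (1024, 1024))        # (800, 600)
```

Because one scale factor applies to both sides, the result generally matches the requested box on only one dimension; the other comes out smaller.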

#### from_url

```python
from_url(
    url: str,
    *,
    retry_attempts: int = 2,
    timeout: int = 10,
    size: tuple[int, int] | None = None,
    detail: Literal["auto", "high", "low"] | None = None,
    meta: dict[str, Any] | None = None
) -> ImageContent
```

Create an ImageContent object from a URL. The image is downloaded and converted to a base64 string.

For PDF to ImageContent conversion, use the `PDFToImageContent` component.

**Parameters:**

- **url** (<code>str</code>) – The URL of the image. PDF files are not supported. For PDF to ImageContent conversion, use the
  `PDFToImageContent` component.
- **retry_attempts** (<code>int</code>) – The number of times to retry fetching the URL's content.
- **timeout** (<code>int</code>) – Timeout in seconds for the request.
- **size** (<code>tuple\[int, int\] | None</code>) – If provided, resizes the image to fit within the specified dimensions (width, height) while
  maintaining aspect ratio. This reduces file size, memory usage, and processing time, which is beneficial
  when working with models that have resolution constraints or when transmitting images to remote services.
- **detail** (<code>Literal['auto', 'high', 'low'] | None</code>) – Optional detail level of the image (only supported by OpenAI). One of "auto", "high", or "low".
- **meta** (<code>dict\[str, Any\] | None</code>) – Additional metadata for the image.

**Returns:**

- <code>ImageContent</code> – An ImageContent object.

**Raises:**

- <code>ValueError</code> – If the URL does not point to an image or if it points to a PDF file.

## sparse_embedding

### SparseEmbedding

Class representing a sparse embedding.

**Parameters:**

- **indices** (<code>list\[int\]</code>) – List of indices of non-zero elements in the embedding.
- **values** (<code>list\[float\]</code>) – List of values of non-zero elements in the embedding.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Convert the SparseEmbedding object to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Serialized sparse embedding.

#### from_dict

```python
from_dict(sparse_embedding_dict: dict[str, Any]) -> SparseEmbedding
```

Deserializes the sparse embedding from a dictionary.

**Parameters:**

- **sparse_embedding_dict** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.

**Returns:**

- <code>SparseEmbedding</code> – Deserialized sparse embedding.

## streaming_chunk

### ToolCallDelta

Represents a Tool call prepared by the model, usually contained in an assistant message.

**Parameters:**

- **index** (<code>int</code>) – The index of the Tool call in the list of Tool calls.
- **tool_name** (<code>str | None</code>) – The name of the Tool to call.
- **arguments** (<code>str | None</code>) – Either the full arguments in JSON format or a delta of the arguments.
- **id** (<code>str | None</code>) – The ID of the Tool call.
- **extra** (<code>dict\[str, Any\] | None</code>) – Dictionary of extra information about the Tool call. Use to store provider-specific
  information. To avoid serialization issues, values should be JSON serializable.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the ToolCallDelta.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary with keys 'index', 'tool_name', 'arguments', 'id', and 'extra'.
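
Because `arguments` may carry only a delta, a consumer typically has to stitch fragments together before parsing them. The sketch below shows one way to do that with a stand-in dataclass rather than the real class. `ToolCallDeltaSketch` and `merge_deltas` are hypothetical names, and the sketch assumes that all fragments of one call share the same `index` and arrive in order.

```python
from __future__ import annotations

import json
from dataclasses import dataclass


@dataclass
class ToolCallDeltaSketch:
    """Stand-in carrying the fields documented above; not the Haystack class."""

    index: int
    tool_name: str | None = None
    arguments: str | None = None
    id: str | None = None


def merge_deltas(deltas: list[ToolCallDeltaSketch]) -> dict[int, dict]:
    """Group deltas by index and concatenate their argument fragments."""
    calls: dict[int, dict] = {}
    for delta in deltas:
        call = calls.setdefault(delta.index, {"id": None, "tool_name": None, "arguments": ""})
        call["id"] = delta.id or call["id"]
        call["tool_name"] = delta.tool_name or call["tool_name"]
        call["arguments"] += delta.arguments or ""
    # Parse the JSON only after every fragment has been appended.
    for call in calls.values():
        call["arguments"] = json.loads(call["arguments"]) if call["arguments"] else {}
    return calls


stream = [
    ToolCallDeltaSketch(index=0, tool_name="get_weather", id="call_1"),
    ToolCallDeltaSketch(index=0, arguments='{"city": '),
    ToolCallDeltaSketch(index=0, arguments='"Berlin"}'),
]
merged = merge_deltas(stream)
```

Parsing is deferred to the end because an individual fragment such as `{"city": ` is not valid JSON on its own.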

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ToolCallDelta
```

Creates a ToolCallDelta from a serialized representation.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary containing ToolCallDelta's attributes.

**Returns:**

- <code>ToolCallDelta</code> – A ToolCallDelta instance.

### ComponentInfo

The `ComponentInfo` class encapsulates information about a component.

**Parameters:**

- **type** (<code>str</code>) – The type of the component.
- **name** (<code>str | None</code>) – The name of the component assigned when adding it to a pipeline.

#### from_component

```python
from_component(component: Component) -> ComponentInfo
```

Create a `ComponentInfo` object from a `Component` instance.

**Parameters:**

- **component** (<code>Component</code>) – The `Component` instance.

**Returns:**

- <code>ComponentInfo</code> – The `ComponentInfo` object with the type and name of the given component.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Returns a dictionary representation of ComponentInfo.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary with keys 'type' and 'name'.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ComponentInfo
```

Creates a ComponentInfo from a serialized representation.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary containing ComponentInfo's attributes.

**Returns:**

- <code>ComponentInfo</code> – A ComponentInfo instance.

### StreamingChunk

The `StreamingChunk` class encapsulates a segment of streamed content along with associated metadata.

This structure facilitates the handling and processing of streamed data in a systematic manner.

**Parameters:**

- **content** (<code>str</code>) – The content of the message chunk as a string.
- **meta** (<code>dict\[str, Any\]</code>) – A dictionary containing metadata related to the message chunk.
- **component_info** (<code>ComponentInfo | None</code>) – A `ComponentInfo` object containing information about the component that generated the chunk,
  such as the component name and type.
- **index** (<code>int | None</code>) – An optional integer index representing which content block this chunk belongs to.
- **tool_calls** (<code>list\[ToolCallDelta\] | None</code>) – An optional list of ToolCallDelta objects representing the tool calls associated with the message
  chunk.
- **tool_call_result** (<code>ToolCallResult | None</code>) – An optional ToolCallResult object representing the result of a tool call.
- **start** (<code>bool</code>) – A boolean indicating whether this chunk marks the start of a content block.
- **finish_reason** (<code>FinishReason | None</code>) – An optional value indicating the reason the generation finished.
  Standard values follow OpenAI's convention: "stop", "length", "tool_calls", "content_filter",
  plus the Haystack-specific value "tool_call_results".
- **reasoning** (<code>ReasoningContent | None</code>) – An optional ReasoningContent object representing the reasoning content associated
  with the message chunk.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Returns a dictionary representation of the StreamingChunk.

**Returns:**

- <code>dict\[str, Any\]</code> – Serialized dictionary representation of the calling object.
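
The `content`, `index`, `start`, and `finish_reason` fields described above are usually consumed together by a streaming callback. The following sketch shows one possible consumer using a stand-in dataclass instead of the real class. `StreamingChunkSketch` and `TextCollector` are hypothetical names, and treating `finish_reason` as a plain string is a simplifying assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class StreamingChunkSketch:
    """Stand-in with a subset of the documented fields; not the Haystack class."""

    content: str
    meta: dict[str, Any] = field(default_factory=dict)
    index: int | None = None
    start: bool = False
    finish_reason: str | None = None


class TextCollector:
    """Callable that concatenates chunk content per content block."""

    def __init__(self) -> None:
        self.blocks: dict[int | None, str] = {}
        self.finish_reason: str | None = None

    def __call__(self, chunk: StreamingChunkSketch) -> None:
        if chunk.start:
            # A chunk flagged with `start` opens a fresh content block.
            self.blocks[chunk.index] = ""
        self.blocks[chunk.index] = self.blocks.get(chunk.index, "") + chunk.content
        if chunk.finish_reason is not None:
            self.finish_reason = chunk.finish_reason


collector = TextCollector()
for chunk in [
    StreamingChunkSketch(content="Hel", index=0, start=True),
    StreamingChunkSketch(content="lo", index=0),
    StreamingChunkSketch(content="", index=0, finish_reason="stop"),
]:
    collector(chunk)
```

After the loop, `collector.blocks[0]` holds the reassembled text and `collector.finish_reason` records why generation ended.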

#### from_dict

```python
from_dict(data: dict[str, Any]) -> StreamingChunk
```

Creates a deserialized StreamingChunk instance from a serialized representation.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary containing the StreamingChunk's attributes.

**Returns:**

- <code>StreamingChunk</code> – A StreamingChunk instance.

### select_streaming_callback

```python
select_streaming_callback(
    init_callback: StreamingCallbackT | None,
    runtime_callback: StreamingCallbackT | None,
    requires_async: bool,
) -> StreamingCallbackT | None
```

Picks the correct streaming callback given an optional initial and runtime callback.

The runtime callback takes precedence over the initial callback.

**Parameters:**

- **init_callback** (<code>StreamingCallbackT | None</code>) – The initial callback.
- **runtime_callback** (<code>StreamingCallbackT | None</code>) – The runtime callback.
- **requires_async** (<code>bool</code>) – Whether the selected callback must be async compatible.

**Returns:**

- <code>StreamingCallbackT | None</code> – The selected callback.
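
The precedence rule can be illustrated with a small stand-alone re-implementation. This is a sketch under stated assumptions, not the library's code: `select_callback_sketch` is a hypothetical name, and raising `ValueError` when a callback's sync or async nature does not match `requires_async` is an assumption, since the text above does not say how a mismatch is handled.

```python
from __future__ import annotations

import inspect
from typing import Any, Callable

Callback = Callable[[Any], Any]


def select_callback_sketch(
    init_callback: Callback | None,
    runtime_callback: Callback | None,
    requires_async: bool,
) -> Callback | None:
    """Return the runtime callback if given, otherwise the init callback."""
    for label, callback in (("init", init_callback), ("runtime", runtime_callback)):
        if callback is None:
            continue
        if inspect.iscoroutinefunction(callback) != requires_async:
            # Assumption: a sync/async mismatch is reported as an error.
            raise ValueError(f"The {label} callback does not match requires_async={requires_async}.")
    return runtime_callback if runtime_callback is not None else init_callback


def print_chunk(chunk: Any) -> None:
    print(chunk)


def log_chunk(chunk: Any) -> None:
    pass


selected = select_callback_sketch(print_chunk, log_chunk, requires_async=False)
fallback = select_callback_sketch(print_chunk, None, requires_async=False)
```

Here `selected` is `log_chunk` because the runtime callback wins, while `fallback` is `print_chunk` because no runtime callback was supplied.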