---
title: "Tools"
id: tools-api
description: "Unified abstractions to represent tools across the framework."
slug: "/tools-api"
---

## component_tool

### ComponentTool

Bases: <code>Tool</code>

A Tool that wraps Haystack components, allowing them to be used as tools by LLMs.

ComponentTool automatically generates LLM-compatible tool schemas from component input sockets,
which are derived from the component's `run` method signature and type hints.

Key features:

- Automatic LLM tool calling schema generation from component input sockets
- Type conversion and validation for component inputs
- Support for types:
  - Dataclasses
  - Lists of dataclasses
  - Basic types (str, int, float, bool, dict)
  - Lists of basic types
- Automatic name generation from component class name
- Description extraction from component docstrings

To use ComponentTool, you first need a Haystack component - either an existing one or a new one you create.
You can create a ComponentTool from the component by passing the component to the ComponentTool constructor.
Below is an example of creating a ComponentTool from an existing SerperDevWebSearch component.

## Usage Example:

```python
from haystack import component, Pipeline
from haystack.tools import ComponentTool
from haystack.components.websearch import SerperDevWebSearch
from haystack.utils import Secret
from haystack.components.tools.tool_invoker import ToolInvoker
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

# Create a SerperDev search component
search = SerperDevWebSearch(api_key=Secret.from_env_var("SERPERDEV_API_KEY"), top_k=3)

# Create a tool from the component
tool = ComponentTool(
    component=search,
    name="web_search",  # Optional: defaults to "serper_dev_web_search"
    description="Search the web for current information on any topic"  # Optional: defaults to component docstring
)

# Create pipeline with OpenAIChatGenerator and ToolInvoker
pipeline = Pipeline()
pipeline.add_component("llm", OpenAIChatGenerator(tools=[tool]))
pipeline.add_component("tool_invoker", ToolInvoker(tools=[tool]))

# Connect components
pipeline.connect("llm.replies", "tool_invoker.messages")

message = ChatMessage.from_user("Use the web search tool to find information about Nikola Tesla")

# Run pipeline
result = pipeline.run({"llm": {"messages": [message]}})

print(result)
```

#### __init__

```python
__init__(
    component: Component,
    name: str | None = None,
    description: str | None = None,
    parameters: dict[str, Any] | None = None,
    *,
    outputs_to_string: dict[str, str | Callable[[Any], str]] | None = None,
    inputs_from_state: dict[str, str] | None = None,
    outputs_to_state: dict[str, dict[str, str | Callable]] | None = None
) -> None
```

Create a Tool instance from a Haystack component.

**Parameters:**

- **component** (<code>Component</code>) – The Haystack component to wrap as a tool.
- **name** (<code>str | None</code>) – Optional name for the tool (defaults to snake_case of component class name).
- **description** (<code>str | None</code>) – Optional description (defaults to component's docstring).
- **parameters** (<code>dict\[str, Any\] | None</code>) – A JSON schema defining the parameters expected by the Tool.
  Will fall back to the parameters defined in the component's run method signature if not provided.
- **outputs_to_string** (<code>dict\[str, str | Callable\[\[Any\], str\]\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the
       `handler` if provided. This is intended for tools that return images. In this mode, the Tool
       function or the `handler` function must return a list of `TextContent`/`ImageContent` objects to
       ensure compatibility with Chat Generators.

  1. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.

- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, str | Callable\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided, only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted, the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

**Raises:**

- <code>TypeError</code> – If the object passed is not a Haystack Component instance.
- <code>ValueError</code> – If the component has already been added to a pipeline, or if schema generation fails.

#### warm_up

```python
warm_up()
```

Prepare the ComponentTool for use.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the ComponentTool to a dictionary.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ComponentTool
```

Deserializes the ComponentTool from a dictionary.

## from_function

### create_tool_from_function

```python
create_tool_from_function(
    function: Callable,
    name: str | None = None,
    description: str | None = None,
    inputs_from_state: dict[str, str] | None = None,
    outputs_to_state: dict[str, dict[str, Any]] | None = None,
    outputs_to_string: dict[str, Any] | None = None,
) -> Tool
```

Create a Tool instance from a function.

Allows customizing the Tool name and description.
For simpler use cases, consider using the `@tool` decorator.

### Usage example

```python
from typing import Annotated, Literal
from haystack.tools import create_tool_from_function

def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Munich",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius"):
    '''A simple function to get the current weather for a location.'''
    return f"Weather report for {city}: 20 {unit}, sunny"

tool = create_tool_from_function(get_weather)

print(tool)
# Tool(name='get_weather', description='A simple function to get the current weather for a location.',
#      parameters={
#          'type': 'object',
#          'properties': {
#              'city': {'type': 'string', 'description': 'the city for which to get the weather', 'default': 'Munich'},
#              'unit': {
#                  'type': 'string',
#                  'enum': ['Celsius', 'Fahrenheit'],
#                  'description': 'the unit for the temperature',
#                  'default': 'Celsius',
#              },
#          }
#      },
#      function=<function get_weather at 0x7f7b3a8a9b80>)
```

**Parameters:**

- **function** (<code>Callable</code>) – The function to be converted into a Tool.
  The function must include type hints for all parameters.
  The function is expected to have basic Python input types (str, int, float, bool, list, dict, tuple).
  Other input types may work but are not guaranteed.
  If a parameter is annotated using `typing.Annotated`, its metadata will be used as parameter description.
- **name** (<code>str | None</code>) – The name of the Tool. If not provided, the name of the function will be used.
- **description** (<code>str | None</code>) – The description of the Tool. If not provided, the docstring of the function will be used.
  To intentionally leave the description empty, pass an empty string.
- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided, only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted, the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

- **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
       tool result is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
       if provided. This is intended for tools that return images. In this mode, the Tool function or the
       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
       Generators.

  1. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.

**Returns:**

- <code>Tool</code> – The Tool created from the function.

**Raises:**

- <code>ValueError</code> – If any parameter of the function lacks a type hint.
- <code>SchemaGenerationError</code> – If there is an error generating the JSON schema for the Tool.

### tool

```python
tool(
    function: Callable | None = None,
    *,
    name: str | None = None,
    description: str | None = None,
    inputs_from_state: dict[str, str] | None = None,
    outputs_to_state: dict[str, dict[str, Any]] | None = None,
    outputs_to_string: dict[str, Any] | None = None
) -> Tool | Callable[[Callable], Tool]
```

Decorator to convert a function into a Tool.

Can be used with or without parameters:

    @tool  # without parameters
    def my_function(): ...

    @tool(name="custom_name")  # with parameters
    def my_function(): ...
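
The dual calling convention above (bare `@tool` versus `@tool(name=...)`) is a common Python decorator pattern. As a rough sketch of how such a decorator can be written, the snippet below uses a hypothetical `simple_tool` decorator and a plain `SimpleTool` dataclass. Both names are assumptions made for illustration and are not Haystack's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool object (not haystack.tools.Tool)."""

    name: str
    description: str
    function: Callable


def simple_tool(
    function: Optional[Callable] = None, *, name: Optional[str] = None, description: Optional[str] = None
) -> Union[SimpleTool, Callable[[Callable], SimpleTool]]:
    """Hypothetical decorator usable as `@simple_tool` or `@simple_tool(name=...)`."""

    def decorate(func: Callable) -> SimpleTool:
        return SimpleTool(
            name=name or func.__name__,
            # Fall back to the docstring only when no description was passed at all,
            # so that an explicit empty string is respected.
            description=description if description is not None else (func.__doc__ or ""),
            function=func,
        )

    # Bare usage: Python passes the decorated function as the first argument.
    if function is not None:
        return decorate(function)
    # Parameterized usage: return the real decorator, which receives the function next.
    return decorate


@simple_tool
def greet(person: str) -> str:
    """Return a greeting."""
    return f"Hello, {person}!"


@simple_tool(name="custom_name")
def farewell(person: str) -> str:
    """Return a farewell."""
    return f"Goodbye, {person}!"
```

Making every option keyword-only (the `*` in the signature) is what lets the decorator tell the two call styles apart: a positional first argument can only be the decorated function.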

### Usage example

```python
from typing import Annotated, Literal
from haystack.tools import tool

@tool
def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Munich",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius"):
    '''A simple function to get the current weather for a location.'''
    return f"Weather report for {city}: 20 {unit}, sunny"

print(get_weather)
# Tool(name='get_weather', description='A simple function to get the current weather for a location.',
#      parameters={
#          'type': 'object',
#          'properties': {
#              'city': {'type': 'string', 'description': 'the city for which to get the weather', 'default': 'Munich'},
#              'unit': {
#                  'type': 'string',
#                  'enum': ['Celsius', 'Fahrenheit'],
#                  'description': 'the unit for the temperature',
#                  'default': 'Celsius',
#              },
#          }
#      },
#      function=<function get_weather at 0x7f7b3a8a9b80>)
```

**Parameters:**

- **function** (<code>Callable | None</code>) – The function to decorate (when used without parameters)
- **name** (<code>str | None</code>) – Optional custom name for the tool
- **description** (<code>str | None</code>) – Optional custom description
- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided, only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted, the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

- **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
       tool result is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
       if provided. This is intended for tools that return images. In this mode, the Tool function or the
       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
       Generators.

  1. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.
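
The `source`/`handler` rules described above for `outputs_to_state` and for the single-output form of `outputs_to_string` can be illustrated with a small plain-Python sketch. The helper names `apply_output_config` and `update_state` are hypothetical, and the fallbacks chosen here (storing the value unchanged when no handler is given, using `str` as the default string handler) are assumptions for illustration that may differ from Haystack's internals.

```python
from typing import Any, Callable, Optional


def apply_output_config(
    result: dict, config: dict, default_handler: Optional[Callable[[Any], Any]] = None
) -> Any:
    """Hypothetical helper: select the "source" value (or the whole result) and pass it to the handler."""
    value = result[config["source"]] if "source" in config else result
    handler = config.get("handler", default_handler)
    return handler(value) if handler is not None else value


def update_state(state: dict, result: dict, outputs_to_state: dict) -> dict:
    """Hypothetical helper: write one state key per entry in outputs_to_state."""
    for state_key, config in outputs_to_state.items():
        state[state_key] = apply_output_config(result, config)
    return state


tool_result = {"docs": ["doc a", "doc b"], "summary_text": "two documents found"}

# With "source", only tool_result["docs"] reaches the handler.
state = update_state({}, tool_result, {"documents": {"source": "docs", "handler": len}})

# Without "source", the handler receives the whole result dictionary.
state = update_state(state, tool_result, {"output_keys": {"handler": sorted}})

# Single-output string conversion, assuming `str` as the default handler.
as_string = apply_output_config(tool_result, {"source": "summary_text"}, default_handler=str)
```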

**Returns:**

- <code>Tool | Callable\[\[Callable\], Tool\]</code> – Either a Tool instance or a decorator function that will create one.

## pipeline_tool

### PipelineTool

Bases: <code>ComponentTool</code>

A Tool that wraps Haystack Pipelines, allowing them to be used as tools by LLMs.

PipelineTool automatically generates LLM-compatible tool schemas from pipeline input sockets,
which are derived from the underlying components in the pipeline.

Key features:

- Automatic LLM tool calling schema generation from pipeline inputs
- Description extraction of pipeline inputs based on the underlying component docstrings

To use PipelineTool, you first need a Haystack pipeline.
Below is an example of creating a PipelineTool.

## Usage Example:

```python
from haystack import Document, Pipeline
from haystack.dataclasses import ChatMessage
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.embedders.sentence_transformers_text_embedder import SentenceTransformersTextEmbedder
from haystack.components.embedders.sentence_transformers_document_embedder import (
    SentenceTransformersDocumentEmbedder
)
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.retrievers import InMemoryEmbeddingRetriever
from haystack.components.agents import Agent
from haystack.tools import PipelineTool

# Initialize a document store and add some documents
document_store = InMemoryDocumentStore()
document_embedder = SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
documents = [
    Document(content="Nikola Tesla was a Serbian-American inventor and electrical engineer."),
    Document(
        content="He is best known for his contributions to the design of the modern alternating current (AC) "
                "electricity supply system."
    ),
]
docs_with_embeddings = document_embedder.run(documents=documents)["documents"]
document_store.write_documents(docs_with_embeddings)

# Build a simple retrieval pipeline
retrieval_pipeline = Pipeline()
retrieval_pipeline.add_component(
    "embedder", SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
)
retrieval_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))

retrieval_pipeline.connect("embedder.embedding", "retriever.query_embedding")

# Wrap the pipeline as a tool
retriever_tool = PipelineTool(
    pipeline=retrieval_pipeline,
    input_mapping={"query": ["embedder.text"]},
    output_mapping={"retriever.documents": "documents"},
    name="document_retriever",
    description="For any questions about Nikola Tesla, always use this tool",
)

# Create an Agent with the tool
agent = Agent(
    chat_generator=OpenAIChatGenerator(model="gpt-4.1-mini"),
    tools=[retriever_tool]
)

# Let the Agent handle a query
result = agent.run([ChatMessage.from_user("Who was Nikola Tesla?")])

# Print result of the tool call
print("Tool Call Result:")
print(result["messages"][2].tool_call_result.result)
print("")

# Print answer
print("Answer:")
print(result["messages"][-1].text)
```

#### __init__

```python
__init__(
    pipeline: Pipeline | AsyncPipeline,
    *,
    name: str,
    description: str,
    input_mapping: dict[str, list[str]] | None = None,
    output_mapping: dict[str, str] | None = None,
    parameters: dict[str, Any] | None = None,
    outputs_to_string: dict[str, str | Callable[[Any], str]] | None = None,
    inputs_from_state: dict[str, str] | None = None,
    outputs_to_state: dict[str, dict[str, str | Callable]] | None = None
) -> None
```

Create a Tool instance from a Haystack pipeline.

**Parameters:**

- **pipeline** (<code>Pipeline | AsyncPipeline</code>) – The Haystack pipeline to wrap as a tool.
- **name** (<code>str</code>) – Name of the tool.
- **description** (<code>str</code>) – Description of the tool.
- **input_mapping** (<code>dict\[str, list\[str\]\] | None</code>) – A dictionary mapping component input names to pipeline input socket paths.
  If not provided, a default input mapping will be created based on all pipeline inputs.
  Example:

  ```python
  input_mapping={
      "query": ["retriever.query", "prompt_builder.query"],
  }
  ```

- **output_mapping** (<code>dict\[str, str\] | None</code>) – A dictionary mapping pipeline output socket paths to component output names.
  If not provided, a default output mapping will be created based on all pipeline outputs.
  Example:

  ```python
  output_mapping={
      "retriever.documents": "documents",
      "generator.replies": "replies",
  }
  ```

- **parameters** (<code>dict\[str, Any\] | None</code>) – A JSON schema defining the parameters expected by the Tool.
  Will fall back to the parameters defined in the component's run method signature if not provided.
- **outputs_to_string** (<code>dict\[str, str | Callable\[\[Any\], str\]\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the
       `handler` if provided. This is intended for tools that return images. In this mode, the Tool
       function or the `handler` function must return a list of `TextContent`/`ImageContent` objects to
       ensure compatibility with Chat Generators.

  1. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.

- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, str | Callable\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided, only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted, the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

**Raises:**

- <code>ValueError</code> – If the provided pipeline is not a valid Haystack Pipeline instance.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the PipelineTool to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – The serialized dictionary representation of PipelineTool.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> PipelineTool
```

Deserializes the PipelineTool from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary representation of PipelineTool.

**Returns:**

- <code>PipelineTool</code> – The deserialized PipelineTool instance.

## searchable_toolset

### SearchableToolset

Bases: <code>Toolset</code>

Dynamic tool discovery from large catalogs using BM25 search.

This Toolset enables LLMs to discover and use tools from large catalogs through
BM25-based search. Instead of exposing all tools at once (which can overwhelm the
LLM context), it provides a `search_tools` bootstrap tool that allows the LLM to
find and load specific tools as needed.

For very small catalogs (below `search_threshold`), acts as a simple passthrough
exposing all tools directly without any discovery mechanism.

### Usage Example

```python
from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool, SearchableToolset

# Create a catalog of tools
catalog = [
    Tool(name="get_weather", description="Get weather for a city", ...),
    Tool(name="search_web", description="Search the web", ...),
    # ... 100s more tools
]
toolset = SearchableToolset(catalog=catalog)

agent = Agent(chat_generator=OpenAIChatGenerator(), tools=toolset)

# The agent is initially provided only with the search_tools tool and will use it to find relevant tools.
result = agent.run(messages=[ChatMessage.from_user("What's the weather in Milan?")])
```

#### __init__

```python
__init__(catalog: ToolsType, *, top_k: int = 3, search_threshold: int = 8)
```

Initialize the SearchableToolset.

**Parameters:**

- **catalog** (<code>ToolsType</code>) – Source of tools - a list of Tools, list of Toolsets, or a single Toolset.
- **top_k** (<code>int</code>) – Default number of results for search_tools.
- **search_threshold** (<code>int</code>) – Minimum catalog size to activate search.
  If catalog has fewer tools, acts as passthrough (all tools visible).
  Default is 8.

#### add

```python
add(tool: Tool | Toolset) -> None
```

Adding new tools after initialization is not supported for SearchableToolset.

#### warm_up

```python
warm_up() -> None
```

Prepare the toolset for use.

Warms up child toolsets first (so lazy toolsets like MCPToolset can connect),
then flattens the catalog, indexes it, and creates the search_tools bootstrap tool.
In passthrough mode, it warms up all catalog tools directly.
Must be called before using the toolset with an Agent.

#### clear

```python
clear() -> None
```

Clear all discovered tools.

This method allows resetting the toolset's discovered tools between agent runs
when the same toolset instance is reused. This can be useful for long-running
applications to control memory usage or to start fresh searches.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serialize the toolset to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary representation of the toolset.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> SearchableToolset
```

Deserialize a toolset from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the toolset.

**Returns:**

- <code>SearchableToolset</code> – New SearchableToolset instance.
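
To make the BM25-based discovery used by SearchableToolset more concrete, here is a minimal, self-contained sketch of BM25 ranking over tool names and descriptions. It is not the SearchableToolset implementation: the `search_catalog` function, the tokenizer, and the `k1`/`b` values are assumptions chosen for illustration, and the catalog is a plain dictionary rather than a list of `Tool` objects.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase and split on non-alphanumeric characters ("get_weather" yields "get", "weather")."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search_catalog(query: str, catalog: dict, top_k: int = 3, k1: float = 1.5, b: float = 0.75) -> list:
    """Return up to top_k tool names ranked by BM25 score, skipping tools with no matching term."""
    docs = {name: tokenize(f"{name} {description}") for name, description in catalog.items()}
    avg_len = sum(len(tokens) for tokens in docs.values()) / len(docs)
    # Document frequency: in how many catalog entries each term occurs.
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))

    scores = {}
    for name, tokens in docs.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in set(tokenize(query)):
            if term not in term_freq:
                continue
            idf = math.log(1 + (len(docs) - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            # Term frequency is dampened by k1 and normalized by relative document length via b.
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(tokens) / avg_len))
        if score > 0:
            scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical catalog: tool name mapped to its description.
catalog = {
    "get_weather": "Get weather for a city",
    "search_web": "Search the web",
    "add_numbers": "Add two numbers",
}
matches = search_catalog("weather Milan", catalog)
```

In this sketch only `get_weather` shares a term with the query, so it is the only name returned. Tools without any matching term are dropped rather than padded into the result list.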

## tool

### Tool

Data class representing a Tool that Language Models can prepare a call for.

Accurate definitions of the textual attributes such as `name` and `description`
are important for the Language Model to correctly prepare the call.

For resource-intensive operations like establishing connections to remote services or
loading models, override the `warm_up()` method. This method is called before the Tool
is used and should be idempotent, as it may be called multiple times during
pipeline/agent setup.

**Parameters:**

- **name** (<code>str</code>) – Name of the Tool.
- **description** (<code>str</code>) – Description of the Tool.
- **parameters** (<code>dict\[str, Any\]</code>) – A JSON schema defining the parameters expected by the Tool.
- **function** (<code>Callable</code>) – The function that will be invoked when the Tool is called.
  Must be a synchronous function; async functions are not supported.
- **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
       tool result is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
       if provided. This is intended for tools that return images. In this mode, the Tool function or the
       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
       Generators.

  1. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.

- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided, only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted, the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

**Raises:**

- <code>ValueError</code> – If `function` is async, if `parameters` is not a valid JSON schema, or if the
  `outputs_to_state`, `outputs_to_string`, or `inputs_from_state` configurations are invalid.
- <code>TypeError</code> – If any configuration value in `outputs_to_state`, `outputs_to_string`, or
  `inputs_from_state` has the wrong type.

#### tool_spec

```python
tool_spec: dict[str, Any]
```

Return the Tool specification to be used by the Language Model.

#### warm_up

```python
warm_up() -> None
```

Prepare the Tool for use.

Override this method to establish connections to remote services, load models,
or perform other resource-intensive initialization. This method should be idempotent,
as it may be called multiple times.

#### invoke

```python
invoke(**kwargs: Any) -> Any
```

Invoke the Tool with the provided keyword arguments.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the Tool to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary with serialized data.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> Tool
```

Deserializes the Tool from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.

**Returns:**

- <code>Tool</code> – Deserialized Tool.

## toolset

### Toolset

A collection of related Tools that can be used and managed as a cohesive unit.

Toolset serves two main purposes:

1. Group related tools together:
   Toolset allows you to organize related tools into a single collection, making it easier
   to manage and use them as a unit in Haystack pipelines.

Example:

```python
from haystack.tools import Tool, Toolset
from haystack.components.tools import ToolInvoker

# Define math functions
def add_numbers(a: int, b: int) -> int:
    return a + b

def subtract_numbers(a: int, b: int) -> int:
    return a - b

# Create tools with proper schemas
add_tool = Tool(
    name="add",
    description="Add two numbers",
    parameters={
        "type": "object",
        "properties": {
            "a": {"type": "integer"},
            "b": {"type": "integer"}
        },
        "required": ["a", "b"]
    },
    function=add_numbers
)

subtract_tool = Tool(
    name="subtract",
    description="Subtract b from a",
    parameters={
        "type": "object",
        "properties": {
            "a": {"type": "integer"},
            "b": {"type": "integer"}
        },
        "required": ["a", "b"]
    },
    function=subtract_numbers
)

# Create a toolset with the math tools
math_toolset = Toolset([add_tool, subtract_tool])

# Use the toolset with a ToolInvoker or ChatGenerator component
invoker = ToolInvoker(tools=math_toolset)
```

2. Base class for dynamic tool loading:
   By subclassing Toolset, you can create implementations that dynamically load tools
   from external sources like OpenAPI URLs, MCP servers, or other resources.

Example:

```python
from haystack.core.serialization import generate_qualified_class_name
from haystack.tools import Tool, Toolset
from haystack.components.tools import ToolInvoker

class CalculatorToolset(Toolset):
    '''A toolset for calculator operations.'''

    def __init__(self):
        tools = self._create_tools()
        super().__init__(tools)

    def _create_tools(self):
        # These Tool instances are obviously defined statically and for illustration purposes only.
        # In a real-world scenario, you would dynamically load tools from an external source here.
        tools = []
        add_tool = Tool(
            name="add",
            description="Add two numbers",
            parameters={
                "type": "object",
                "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
                "required": ["a", "b"],
            },
            function=lambda a, b: a + b,
        )

        multiply_tool = Tool(
            name="multiply",
            description="Multiply two numbers",
            parameters={
                "type": "object",
                "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
                "required": ["a", "b"],
            },
            function=lambda a, b: a * b,
        )

        tools.append(add_tool)
        tools.append(multiply_tool)

        return tools

    def to_dict(self):
        return {
            "type": generate_qualified_class_name(type(self)),
            "data": {},  # no data to serialize as we define the tools dynamically
        }

    @classmethod
    def from_dict(cls, data):
        return cls()  # Recreate the tools dynamically during deserialization

# Create the dynamic toolset and use it with ToolInvoker
calculator_toolset = CalculatorToolset()
invoker = ToolInvoker(tools=calculator_toolset)
```

Toolset implements the collection interface (`__iter__`, `__contains__`, `__len__`, `__getitem__`),
making it behave like a list of Tools. This makes it compatible with components that expect
iterable tools, such as ToolInvoker or Haystack chat generators.

When implementing a custom Toolset subclass for dynamic tool loading:

- Perform the dynamic loading in the `__init__` method
- Override `to_dict()` and `from_dict()` methods if your tools are defined dynamically
- Serialize endpoint descriptors rather than tool instances if your tools
  are loaded from external sources

#### warm_up

```python
warm_up() -> None
```

Prepare the Toolset for use.

By default, this method iterates through and warms up all tools in the Toolset.
Subclasses can override this method to customize initialization behavior, such as:

- Setting up shared resources (database connections, HTTP sessions) instead of
  warming individual tools
- Implementing custom initialization logic for dynamically loaded tools
- Controlling when and how tools are initialized

For example, a Toolset that manages tools from an external service (like MCPToolset)
might override this to initialize a shared connection rather than warming up
individual tools:

```python
class MCPToolset(Toolset):
    def warm_up(self) -> None:
        # Only warm up the shared MCP connection, not individual tools
        self.mcp_connection = establish_connection(self.server_url)
```

This method should be idempotent, as it may be called multiple times.

#### add

```python
add(tool: Union[Tool, Toolset]) -> None
```

Add a new Tool or merge another Toolset.

**Parameters:**

- **tool** (<code>Union\[Tool, Toolset\]</code>) – A Tool instance or another Toolset to add

**Raises:**

- <code>ValueError</code> – If adding the tool would result in duplicate tool names
- <code>TypeError</code> – If the provided object is not a Tool or Toolset

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serialize the Toolset to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary representation of the Toolset

Note for subclass implementers:
The default implementation is ideal for scenarios where Tool resolution is static. However, if your subclass
of Toolset dynamically resolves Tool instances from external sources (such as an MCP server, OpenAPI URL, or
a local OpenAPI specification), you should consider serializing the endpoint descriptor instead of the Tool
instances themselves.
This strategy preserves the dynamic nature of your Toolset and minimizes the overhead
associated with serializing potentially large collections of Tool objects. Moreover, by serializing the
descriptor, you ensure that the deserialization process can accurately reconstruct the Tool instances, even
if they have been modified or removed since the last serialization. Failing to serialize the descriptor may
lead to issues where outdated or incorrect Tool configurations are loaded, potentially causing errors or
unexpected behavior.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> Toolset
```

Deserialize a Toolset from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the Toolset

**Returns:**

- <code>Toolset</code> – A new Toolset instance