tools_api.md
---
title: "Tools"
id: tools-api
description: "Unified abstractions to represent tools across the framework."
slug: "/tools-api"
---


## component_tool

### ComponentTool

Bases: <code>Tool</code>

A Tool that wraps Haystack components, allowing them to be used as tools by LLMs.

ComponentTool automatically generates LLM-compatible tool schemas from component input sockets,
which are derived from the component's `run` method signature and type hints.

Key features:

- Automatic LLM tool calling schema generation from component input sockets
- Type conversion and validation for component inputs
- Support for types:
  - Dataclasses
  - Lists of dataclasses
  - Basic types (str, int, float, bool, dict)
  - Lists of basic types
- Automatic name generation from component class name
- Description extraction from component docstrings

To use ComponentTool, you first need a Haystack component - either an existing one or a new one you create.
You can create a ComponentTool from the component by passing the component to the ComponentTool constructor.
Below is an example of creating a ComponentTool from an existing SerperDevWebSearch component.

## Usage Example:

```python
from haystack import component, Pipeline
from haystack.tools import ComponentTool
from haystack.components.websearch import SerperDevWebSearch
from haystack.utils import Secret
from haystack.components.tools.tool_invoker import ToolInvoker
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

# Create a SerperDev search component
search = SerperDevWebSearch(api_key=Secret.from_env_var("SERPERDEV_API_KEY"), top_k=3)

# Create a tool from the component
tool = ComponentTool(
    component=search,
    name="web_search",  # Optional: defaults to "serper_dev_web_search"
    description="Search the web for current information on any topic"  # Optional: defaults to component docstring
)

# Create pipeline with OpenAIChatGenerator and ToolInvoker
pipeline = Pipeline()
pipeline.add_component("llm", OpenAIChatGenerator(tools=[tool]))
pipeline.add_component("tool_invoker", ToolInvoker(tools=[tool]))

# Connect components
pipeline.connect("llm.replies", "tool_invoker.messages")

message = ChatMessage.from_user("Use the web search tool to find information about Nikola Tesla")

# Run pipeline
result = pipeline.run({"llm": {"messages": [message]}})

print(result)
```

#### __init__

```python
__init__(
    component: Component,
    name: str | None = None,
    description: str | None = None,
    parameters: dict[str, Any] | None = None,
    *,
    outputs_to_string: dict[str, str | Callable[[Any], str]] | None = None,
    inputs_from_state: dict[str, str] | None = None,
    outputs_to_state: dict[str, dict[str, str | Callable]] | None = None
) -> None
```

Create a Tool instance from a Haystack component.

**Parameters:**

- **component** (<code>Component</code>) – The Haystack component to wrap as a tool.
- **name** (<code>str | None</code>) – Optional name for the tool (defaults to snake_case of component class name).
- **description** (<code>str | None</code>) – Optional description (defaults to component's docstring).
- **parameters** (<code>dict\[str, Any\] | None</code>) – A JSON schema defining the parameters expected by the Tool.
  Will fall back to the parameters defined in the component's run method signature if not provided.
- **outputs_to_string** (<code>dict\[str, str | Callable\[\[Any\], str\]\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
       tool result is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the
       `handler` if provided. This is intended for tools that return images. In this mode, the Tool
       function or the `handler` function must return a list of `TextContent`/`ImageContent` objects to
       ensure compatibility with Chat Generators.

  2. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.

- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, str | Callable\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

**Raises:**

- <code>TypeError</code> – If the object passed is not a Haystack Component instance.
- <code>ValueError</code> – If the component has already been added to a pipeline, or if schema generation fails.

#### warm_up

```python
warm_up()
```

Prepare the ComponentTool for use.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the ComponentTool to a dictionary.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ComponentTool
```

Deserializes the ComponentTool from a dictionary.

## from_function

### create_tool_from_function

```python
create_tool_from_function(
    function: Callable,
    name: str | None = None,
    description: str | None = None,
    inputs_from_state: dict[str, str] | None = None,
    outputs_to_state: dict[str, dict[str, Any]] | None = None,
    outputs_to_string: dict[str, Any] | None = None,
) -> Tool
```

Create a Tool instance from a function.

Allows customizing the Tool name and description.
For simpler use cases, consider using the `@tool` decorator.
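
As a rough illustration of how a parameters schema can be derived from a function's type hints and `typing.Annotated` metadata, here is a sketch that uses only the standard library. It is not Haystack's implementation: `sketch_parameters_schema` and `_JSON_TYPES` are hypothetical names, and only a handful of basic types are handled (anything else raises `KeyError`).

```python
import inspect
from typing import Annotated, Any, Callable, Literal, get_args, get_origin, get_type_hints

# Hypothetical mapping from basic Python types to JSON schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean", list: "array", dict: "object"}


def sketch_parameters_schema(function: Callable) -> dict[str, Any]:
    """Build a JSON-schema-like dict from a function's type hints (illustrative only)."""
    hints = get_type_hints(function, include_extras=True)
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(function).parameters.items():
        if name not in hints:
            raise ValueError(f"Parameter '{name}' has no type hint")
        hint = hints[name]
        prop: dict[str, Any] = {}
        # Annotated[T, "text"]: use the first metadata item as the description.
        if get_origin(hint) is Annotated:
            hint, description, *_ = get_args(hint)
            prop["description"] = description
        # Literal["a", "b"]: expose the allowed values as an enum.
        if get_origin(hint) is Literal:
            values = get_args(hint)
            prop["type"] = _JSON_TYPES[type(values[0])]
            prop["enum"] = list(values)
        else:
            prop["type"] = _JSON_TYPES[get_origin(hint) or hint]
        # Parameters without a default are required; others record their default.
        if param.default is inspect.Parameter.empty:
            required.append(name)
        else:
            prop["default"] = param.default
        properties[name] = prop
    schema: dict[str, Any] = {"type": "object", "properties": properties}
    if required:
        schema["required"] = required
    return schema


def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Munich",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius",
):
    """A simple function to get the current weather for a location."""
    return f"Weather report for {city}: 20 {unit}, sunny"


schema = sketch_parameters_schema(get_weather)
```

The sketch raises `ValueError` for a parameter without a type hint, mirroring the documented behavior of `create_tool_from_function`; how the library handles unsupported types is not modeled here.
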

### Usage example

```python
from typing import Annotated, Literal
from haystack.tools import create_tool_from_function

def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Munich",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius"):
    '''A simple function to get the current weather for a location.'''
    return f"Weather report for {city}: 20 {unit}, sunny"

tool = create_tool_from_function(get_weather)

print(tool)
# Tool(name='get_weather', description='A simple function to get the current weather for a location.',
# parameters={
#     'type': 'object',
#     'properties': {
#         'city': {'type': 'string', 'description': 'the city for which to get the weather', 'default': 'Munich'},
#         'unit': {
#             'type': 'string',
#             'enum': ['Celsius', 'Fahrenheit'],
#             'description': 'the unit for the temperature',
#             'default': 'Celsius',
#         },
#     }
# },
# function=<function get_weather at 0x7f7b3a8a9b80>)
```

**Parameters:**

- **function** (<code>Callable</code>) – The function to be converted into a Tool.
  The function must include type hints for all parameters.
  The function is expected to have basic Python input types (str, int, float, bool, list, dict, tuple).
  Other input types may work but are not guaranteed.
  If a parameter is annotated using `typing.Annotated`, its metadata will be used as parameter description.
- **name** (<code>str | None</code>) – The name of the Tool. If not provided, the name of the function will be used.
- **description** (<code>str | None</code>) – The description of the Tool. If not provided, the docstring of the function will be used.
  To intentionally leave the description empty, pass an empty string.
- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

- **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
       tool result is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
       if provided. This is intended for tools that return images. In this mode, the Tool function or the
       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
       Generators.

  2. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.

**Returns:**

- <code>Tool</code> – The Tool created from the function.

**Raises:**

- <code>ValueError</code> – If any parameter of the function lacks a type hint.
- <code>SchemaGenerationError</code> – If there is an error generating the JSON schema for the Tool.

### tool

```python
tool(
    function: Callable | None = None,
    *,
    name: str | None = None,
    description: str | None = None,
    inputs_from_state: dict[str, str] | None = None,
    outputs_to_state: dict[str, dict[str, Any]] | None = None,
    outputs_to_string: dict[str, Any] | None = None
) -> Tool | Callable[[Callable], Tool]
```

Decorator to convert a function into a Tool.

Can be used with or without parameters:

    @tool  # without parameters
    def my_function(): ...

    @tool(name="custom_name")  # with parameters
    def my_function(): ...
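
The "with or without parameters" behavior is a common Python decorator pattern. The sketch below shows one way to write such a dual-use decorator with the standard library only. It is an illustration under stated assumptions, not Haystack's code: `SketchTool` and `sketch_tool` are hypothetical names standing in for `Tool` and `@tool`.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass
class SketchTool:
    """Hypothetical stand-in for a Tool; holds only what this sketch needs."""

    name: str
    description: str
    function: Callable


def sketch_tool(
    function: Optional[Callable] = None, *, name: Optional[str] = None, description: Optional[str] = None
) -> Union[SketchTool, Callable[[Callable], SketchTool]]:
    def decorator(func: Callable) -> SketchTool:
        return SketchTool(
            # Fall back to the function name and docstring when nothing is passed.
            name=name or func.__name__,
            description=description if description is not None else (func.__doc__ or ""),
            function=func,
        )

    # Bare usage (@sketch_tool): Python passes the decorated function directly.
    if function is not None:
        return decorator(function)
    # Parameterized usage (@sketch_tool(name=...)): return the decorator itself.
    return decorator


@sketch_tool
def ping() -> str:
    """Return a fixed reply."""
    return "pong"


@sketch_tool(name="custom_name")
def echo(text: str) -> str:
    """Return the input unchanged."""
    return text
```

Making every option keyword-only is what lets the decorator tell the two call styles apart: a positional argument can only be the decorated function.
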

### Usage example

```python
from typing import Annotated, Literal
from haystack.tools import tool

@tool
def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Munich",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius"):
    '''A simple function to get the current weather for a location.'''
    return f"Weather report for {city}: 20 {unit}, sunny"

print(get_weather)
# Tool(name='get_weather', description='A simple function to get the current weather for a location.',
# parameters={
#     'type': 'object',
#     'properties': {
#         'city': {'type': 'string', 'description': 'the city for which to get the weather', 'default': 'Munich'},
#         'unit': {
#             'type': 'string',
#             'enum': ['Celsius', 'Fahrenheit'],
#             'description': 'the unit for the temperature',
#             'default': 'Celsius',
#         },
#     }
# },
# function=<function get_weather at 0x7f7b3a8a9b80>)
```

**Parameters:**

- **function** (<code>Callable | None</code>) – The function to decorate (when used without parameters).
- **name** (<code>str | None</code>) – Optional custom name for the tool.
- **description** (<code>str | None</code>) – Optional custom description.
- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

- **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
       tool result is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
       if provided. This is intended for tools that return images. In this mode, the Tool function or the
       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
       Generators.

  2. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.
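
To make the single output format of `outputs_to_string` more concrete, the following sketch applies the documented `source`, `handler`, and `raw_result` rules to a plain dictionary result. It is a simplified model, not the library's code: `sketch_single_output` and `format_documents` are hypothetical helpers, and plain `str()` stands in for the default string conversion, whose actual behavior is not specified here.

```python
from typing import Any


def sketch_single_output(result: Any, config: dict[str, Any]) -> Any:
    """Apply a single-output style config (source, handler, raw_result) to a tool result."""
    # "source": narrow the result to one output key, if requested.
    value = result[config["source"]] if "source" in config else result
    # "handler": post-process the (possibly narrowed) value.
    handler = config.get("handler")
    if handler is not None:
        value = handler(value)
    # "raw_result": skip string conversion and return the value as it is.
    if config.get("raw_result", False):
        return value
    # Assumption: plain str() stands in for the library's default string conversion.
    return value if isinstance(value, str) else str(value)


def format_documents(docs: list[str]) -> str:
    """Hypothetical handler that renders documents as a bulleted list."""
    return "\n".join(f"- {doc}" for doc in docs)


tool_result = {"docs": ["Tesla was an inventor.", "He worked on AC power."], "score": 0.9}

text = sketch_single_output(tool_result, {"source": "docs", "handler": format_documents})
raw = sketch_single_output(tool_result, {"source": "docs", "raw_result": True})
```

Here `text` is the handler's formatted string, while `raw` is the untouched list stored under `"docs"`, because `raw_result` bypasses the string conversion.
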

**Returns:**

- <code>Tool | Callable\[\[Callable\], Tool\]</code> – Either a Tool instance or a decorator function that will create one.

## pipeline_tool

### PipelineTool

Bases: <code>ComponentTool</code>

A Tool that wraps Haystack Pipelines, allowing them to be used as tools by LLMs.

PipelineTool automatically generates LLM-compatible tool schemas from pipeline input sockets,
which are derived from the underlying components in the pipeline.

Key features:

- Automatic LLM tool calling schema generation from pipeline inputs
- Description extraction of pipeline inputs based on the underlying component docstrings

To use PipelineTool, you first need a Haystack pipeline.
Below is an example of creating a PipelineTool.

## Usage Example:

```python
from haystack import Document, Pipeline
from haystack.dataclasses import ChatMessage
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.embedders.sentence_transformers_text_embedder import SentenceTransformersTextEmbedder
from haystack.components.embedders.sentence_transformers_document_embedder import (
    SentenceTransformersDocumentEmbedder
)
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.retrievers import InMemoryEmbeddingRetriever
from haystack.components.agents import Agent
from haystack.tools import PipelineTool

# Initialize a document store and add some documents
document_store = InMemoryDocumentStore()
document_embedder = SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
documents = [
    Document(content="Nikola Tesla was a Serbian-American inventor and electrical engineer."),
    Document(
        content="He is best known for his contributions to the design of the modern alternating current (AC) "
        "electricity supply system."
    ),
]
docs_with_embeddings = document_embedder.run(documents=documents)["documents"]
document_store.write_documents(docs_with_embeddings)

# Build a simple retrieval pipeline
retrieval_pipeline = Pipeline()
retrieval_pipeline.add_component(
    "embedder", SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
)
retrieval_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))

retrieval_pipeline.connect("embedder.embedding", "retriever.query_embedding")

# Wrap the pipeline as a tool
retriever_tool = PipelineTool(
    pipeline=retrieval_pipeline,
    input_mapping={"query": ["embedder.text"]},
    output_mapping={"retriever.documents": "documents"},
    name="document_retriever",
    description="For any questions about Nikola Tesla, always use this tool",
)

# Create an Agent with the tool
agent = Agent(
    chat_generator=OpenAIChatGenerator(model="gpt-4.1-mini"),
    tools=[retriever_tool]
)

# Let the Agent handle a query
result = agent.run([ChatMessage.from_user("Who was Nikola Tesla?")])

# Print result of the tool call
print("Tool Call Result:")
print(result["messages"][2].tool_call_result.result)
print("")

# Print answer
print("Answer:")
print(result["messages"][-1].text)
```

#### __init__

```python
__init__(
    pipeline: Pipeline | AsyncPipeline,
    *,
    name: str,
    description: str,
    input_mapping: dict[str, list[str]] | None = None,
    output_mapping: dict[str, str] | None = None,
    parameters: dict[str, Any] | None = None,
    outputs_to_string: dict[str, str | Callable[[Any], str]] | None = None,
    inputs_from_state: dict[str, str] | None = None,
    outputs_to_state: dict[str, dict[str, str | Callable]] | None = None
) -> None
```

Create a Tool instance from a Haystack pipeline.

**Parameters:**

- **pipeline** (<code>Pipeline | AsyncPipeline</code>) – The Haystack pipeline to wrap as a tool.
- **name** (<code>str</code>) – Name of the tool.
- **description** (<code>str</code>) – Description of the tool.
- **input_mapping** (<code>dict\[str, list\[str\]\] | None</code>) – A dictionary mapping component input names to pipeline input socket paths.
  If not provided, a default input mapping will be created based on all pipeline inputs.
  Example:

  ```python
  input_mapping={
      "query": ["retriever.query", "prompt_builder.query"],
  }
  ```

- **output_mapping** (<code>dict\[str, str\] | None</code>) – A dictionary mapping pipeline output socket paths to component output names.
  If not provided, a default output mapping will be created based on all pipeline outputs.
  Example:

  ```python
  output_mapping={
      "retriever.documents": "documents",
      "generator.replies": "replies",
  }
  ```

- **parameters** (<code>dict\[str, Any\] | None</code>) – A JSON schema defining the parameters expected by the Tool.
  Will fall back to the parameters defined in the component's run method signature if not provided.
- **outputs_to_string** (<code>dict\[str, str | Callable\[\[Any\], str\]\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
       tool result is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the
       `handler` if provided. This is intended for tools that return images. In this mode, the Tool
       function or the `handler` function must return a list of `TextContent`/`ImageContent` objects to
       ensure compatibility with Chat Generators.

  2. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.

- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, str | Callable\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

**Raises:**

- <code>ValueError</code> – If the provided pipeline is not a valid Haystack Pipeline instance.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the PipelineTool to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – The serialized dictionary representation of PipelineTool.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> PipelineTool
```

Deserializes the PipelineTool from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary representation of PipelineTool.

**Returns:**

- <code>PipelineTool</code> – The deserialized PipelineTool instance.

## searchable_toolset

### SearchableToolset

Bases: <code>Toolset</code>

Dynamic tool discovery from large catalogs using BM25 search.

This Toolset enables LLMs to discover and use tools from large catalogs through
BM25-based search. Instead of exposing all tools at once (which can overwhelm the
LLM context), it provides a `search_tools` bootstrap tool that allows the LLM to
find and load specific tools as needed.

For very small catalogs (below `search_threshold`), it acts as a simple passthrough,
exposing all tools directly without any discovery mechanism.

### Usage Example

```python
from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool, SearchableToolset

# Create a catalog of tools
catalog = [
    Tool(name="get_weather", description="Get weather for a city", ...),
    Tool(name="search_web", description="Search the web", ...),
    # ... 100s more tools
]
toolset = SearchableToolset(catalog=catalog)

agent = Agent(chat_generator=OpenAIChatGenerator(), tools=toolset)

# The agent is initially provided only with the search_tools tool and will use it to find relevant tools.
result = agent.run(messages=[ChatMessage.from_user("What's the weather in Milan?")])
```

#### __init__

```python
__init__(
    catalog: ToolsType,
    *,
    top_k: int = 3,
    search_threshold: int = 8,
    search_tool_name: str = "search_tools",
    search_tool_description: str | None = None,
    search_tool_parameters_description: dict[str, str] | None = None
)
```

Initialize the SearchableToolset.

**Parameters:**

- **catalog** (<code>ToolsType</code>) – Source of tools - a list of Tools, list of Toolsets, or a single Toolset.
- **top_k** (<code>int</code>) – Default number of results for search_tools.
- **search_threshold** (<code>int</code>) – Minimum catalog size to activate search.
  If the catalog has fewer tools, the toolset acts as a passthrough (all tools visible).
  Default is 8.
- **search_tool_name** (<code>str</code>) – Custom name for the bootstrap search tool. Default is "search_tools".
- **search_tool_description** (<code>str | None</code>) – Custom description for the bootstrap search tool.
  If not provided, uses a default description.
- **search_tool_parameters_description** (<code>dict\[str, str\] | None</code>) – Custom descriptions for the bootstrap search tool's parameters.
  Keys must be a subset of `{"tool_keywords", "k"}`.
  Example: `{"tool_keywords": "Keywords to find tools, e.g. 'email send'"}`

#### add

```python
add(tool: Tool | Toolset) -> None
```

Adding new tools after initialization is not supported for SearchableToolset.

#### warm_up

```python
warm_up() -> None
```

Prepare the toolset for use.

Warms up child toolsets first (so lazy toolsets like MCPToolset can connect),
then flattens the catalog, indexes it, and creates the search_tools bootstrap tool.
In passthrough mode, it warms up all catalog tools directly.
Must be called before using the toolset with an Agent.
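
For readers unfamiliar with BM25, the sketch below ranks a small tool catalog against a keyword query using the standard BM25 formula and only the standard library. It illustrates the general technique behind keyword-based tool discovery and makes no claim about how SearchableToolset tokenizes, indexes, or scores internally. `tokenize`, `bm25_search`, and the `k1`/`b` defaults are assumptions chosen for this example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_search(catalog: dict[str, str], query: str, top_k: int = 3, k1: float = 1.5, b: float = 0.75) -> list[str]:
    """Rank tool names by the BM25 score of their name and description against the query."""
    if not catalog:
        return []
    docs = {name: tokenize(f"{name} {description}") for name, description in catalog.items()}
    n_docs = len(docs)
    avg_len = sum(len(tokens) for tokens in docs.values()) / n_docs
    # Document frequency: in how many tool texts each term occurs.
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))
    query_terms = set(tokenize(query))

    scores: dict[str, float] = {}
    for name, tokens in docs.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in term_freq:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            # Term frequency saturates with k1 and is normalized by document length via b.
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(tokens) / avg_len))
        if score > 0:
            scores[name] = score
    # Highest score first, truncated to top_k results.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


catalog = {
    "get_weather": "Get weather for a city",
    "search_web": "Search the web",
    "send_email": "Send an email to a recipient",
}
matches = bm25_search(catalog, "weather city")
```

Tools that share no term with the query get no score and are left out, so a query can return fewer than `top_k` results.
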

#### clear

```python
clear() -> None
```

Clear all discovered tools.

This method allows resetting the toolset's discovered tools between agent runs
when the same toolset instance is reused. This can be useful for long-running
applications to control memory usage or to start fresh searches.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serialize the toolset to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary representation of the toolset.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> SearchableToolset
```

Deserialize a toolset from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the toolset.

**Returns:**

- <code>SearchableToolset</code> – New SearchableToolset instance.

## tool

### Tool

Data class representing a Tool that Language Models can prepare a call for.

Accurate definitions of the textual attributes such as `name` and `description`
are important for the Language Model to correctly prepare the call.

For resource-intensive operations like establishing connections to remote services or
loading models, override the `warm_up()` method. This method is called before the Tool
is used and should be idempotent, as it may be called multiple times during
pipeline/agent setup.

**Parameters:**

- **name** (<code>str</code>) – Name of the Tool.
- **description** (<code>str</code>) – Description of the Tool.
- **parameters** (<code>dict\[str, Any\]</code>) – A JSON schema defining the parameters expected by the Tool.
- **function** (<code>Callable</code>) – The function that will be invoked when the Tool is called.
  Must be a synchronous function; async functions are not supported.
- **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
       tool result is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
       if provided. This is intended for tools that return images. In this mode, the Tool function or the
       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
       Generators.

  2. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.

- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

**Raises:**

- <code>ValueError</code> – If `function` is async, if `parameters` is not a valid JSON schema, or if the
  `outputs_to_state`, `outputs_to_string`, or `inputs_from_state` configurations are invalid.
- <code>TypeError</code> – If any configuration value in `outputs_to_state`, `outputs_to_string`, or
  `inputs_from_state` has the wrong type.

#### tool_spec

```python
tool_spec: dict[str, Any]
```

Return the Tool specification to be used by the Language Model.

#### warm_up

```python
warm_up() -> None
```

Prepare the Tool for use.

Override this method to establish connections to remote services, load models,
or perform other resource-intensive initialization. This method should be idempotent,
as it may be called multiple times.

#### invoke

```python
invoke(**kwargs: Any) -> Any
```

Invoke the Tool with the provided keyword arguments.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the Tool to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary with serialized data.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> Tool
```

Deserializes the Tool from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.

**Returns:**

- <code>Tool</code> – Deserialized Tool.
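
The idempotency requirement on `warm_up()` can be met with a simple guard around the expensive step. The sketch below shows that pattern in isolation. `SketchRemoteTool`, its `_connect` helper, and the error raised by `invoke` are hypothetical and deliberately do not subclass the real `Tool`, so the example runs without Haystack installed.

```python
from typing import Any, Optional


class SketchRemoteTool:
    """Hypothetical stand-in for a Tool subclass that needs a costly setup step."""

    def __init__(self) -> None:
        self._client: Optional[dict] = None
        self.connect_calls = 0  # counts how often the expensive step actually ran

    def _connect(self) -> dict:
        # Placeholder for an expensive operation such as opening a connection or loading a model.
        self.connect_calls += 1
        return {"connected": True}

    def warm_up(self) -> None:
        # Idempotent: every call after the first is a no-op.
        if self._client is None:
            self._client = self._connect()

    def invoke(self, **kwargs: Any) -> dict:
        # Design choice of this sketch: fail loudly if the setup step was skipped.
        if self._client is None:
            raise RuntimeError("warm_up() must be called before invoke()")
        return {"echo": kwargs}


remote_tool = SketchRemoteTool()
remote_tool.warm_up()
remote_tool.warm_up()  # second call does nothing
reply = remote_tool.invoke(city="Milan")
```

Because the guard checks state rather than counting calls, it stays correct no matter how many times pipeline or agent setup triggers `warm_up()`.
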

## toolset

### Toolset

A collection of related Tools that can be used and managed as a cohesive unit.

Toolset serves two main purposes:

1. Group related tools together:
   Toolset allows you to organize related tools into a single collection, making it easier
   to manage and use them as a unit in Haystack pipelines.

   Example:

   ```python
   from haystack.tools import Tool, Toolset
   from haystack.components.tools import ToolInvoker

   # Define math functions
   def add_numbers(a: int, b: int) -> int:
       return a + b

   def subtract_numbers(a: int, b: int) -> int:
       return a - b

   # Create tools with proper schemas
   add_tool = Tool(
       name="add",
       description="Add two numbers",
       parameters={
           "type": "object",
           "properties": {
               "a": {"type": "integer"},
               "b": {"type": "integer"}
           },
           "required": ["a", "b"]
       },
       function=add_numbers
   )

   subtract_tool = Tool(
       name="subtract",
       description="Subtract b from a",
       parameters={
           "type": "object",
           "properties": {
               "a": {"type": "integer"},
               "b": {"type": "integer"}
           },
           "required": ["a", "b"]
       },
       function=subtract_numbers
   )

   # Create a toolset with the math tools
   math_toolset = Toolset([add_tool, subtract_tool])

   # Use the toolset with a ToolInvoker or ChatGenerator component
   invoker = ToolInvoker(tools=math_toolset)
   ```

2. Base class for dynamic tool loading:
   By subclassing Toolset, you can create implementations that dynamically load tools
   from external sources like OpenAPI URLs, MCP servers, or other resources.

   Example:

   ```python
   from haystack.core.serialization import generate_qualified_class_name
   from haystack.tools import Tool, Toolset
   from haystack.components.tools import ToolInvoker

   class CalculatorToolset(Toolset):
       '''A toolset for calculator operations.'''

       def __init__(self):
           tools = self._create_tools()
           super().__init__(tools)

       def _create_tools(self):
           # These Tool instances are obviously defined statically and for illustration purposes only.
           # In a real-world scenario, you would dynamically load tools from an external source here.
           tools = []
           add_tool = Tool(
               name="add",
               description="Add two numbers",
               parameters={
                   "type": "object",
                   "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
                   "required": ["a", "b"],
               },
               function=lambda a, b: a + b,
           )

           multiply_tool = Tool(
               name="multiply",
               description="Multiply two numbers",
               parameters={
                   "type": "object",
                   "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
                   "required": ["a", "b"],
               },
               function=lambda a, b: a * b,
           )

           tools.append(add_tool)
           tools.append(multiply_tool)

           return tools

       def to_dict(self):
           return {
               "type": generate_qualified_class_name(type(self)),
               "data": {},  # no data to serialize as we define the tools dynamically
           }

       @classmethod
       def from_dict(cls, data):
           return cls()  # Recreate the tools dynamically during deserialization

   # Create the dynamic toolset and use it with ToolInvoker
   calculator_toolset = CalculatorToolset()
   invoker = ToolInvoker(tools=calculator_toolset)
   ```

Toolset implements the collection interface (`__iter__`, `__contains__`, `__len__`, `__getitem__`),
making it behave like a list of Tools.
This makes it compatible with components that expect
iterable tools, such as ToolInvoker or Haystack chat generators.

When implementing a custom Toolset subclass for dynamic tool loading:

- Perform the dynamic loading in the `__init__` method
- Override `to_dict()` and `from_dict()` methods if your tools are defined dynamically
- Serialize endpoint descriptors rather than tool instances if your tools
  are loaded from external sources

#### warm_up

```python
warm_up() -> None
```

Prepare the Toolset for use.

By default, this method iterates through and warms up all tools in the Toolset.
Subclasses can override this method to customize initialization behavior, such as:

- Setting up shared resources (database connections, HTTP sessions) instead of
  warming individual tools
- Implementing custom initialization logic for dynamically loaded tools
- Controlling when and how tools are initialized

For example, a Toolset that manages tools from an external service (like MCPToolset)
might override this to initialize a shared connection rather than warming up
individual tools:

```python
class MCPToolset(Toolset):
    def warm_up(self) -> None:
        # Only warm up the shared MCP connection, not individual tools
        self.mcp_connection = establish_connection(self.server_url)
```

This method should be idempotent, as it may be called multiple times.

#### add

```python
add(tool: Union[Tool, Toolset]) -> None
```

Add a new Tool or merge another Toolset.

**Parameters:**

- **tool** (<code>Union\[Tool, Toolset\]</code>) – A Tool instance or another Toolset to add

**Raises:**

- <code>ValueError</code> – If adding the tool would result in duplicate tool names
- <code>TypeError</code> – If the provided object is not a Tool or Toolset

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serialize the Toolset to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary representation of the Toolset

Note for subclass implementers:
The default implementation is ideal for scenarios where Tool resolution is static. However, if your subclass
of Toolset dynamically resolves Tool instances from external sources (such as an MCP server, an OpenAPI URL, or
a local OpenAPI specification), consider serializing the endpoint descriptor instead of the Tool
instances themselves. This strategy preserves the dynamic nature of your Toolset and minimizes the overhead
associated with serializing potentially large collections of Tool objects. Moreover, by serializing the
descriptor, you ensure that the deserialization process can accurately reconstruct the Tool instances, even
if they have been modified or removed since the last serialization. Failing to serialize the descriptor may
lead to issues where outdated or incorrect Tool configurations are loaded, potentially causing errors or
unexpected behavior.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> Toolset
```

Deserialize a Toolset from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the Toolset

**Returns:**

- <code>Toolset</code> – A new Toolset instance
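
The descriptor-based serialization recommended in the note for subclass implementers can be sketched without any dependencies. To stay self-contained, the class below does not subclass `Toolset`; `RemoteToolsetSketch`, `FAKE_REMOTE`, and the URL are hypothetical stand-ins for a dynamically loaded toolset and its external source, and tools are represented by their names only.

```python
from typing import Any

# Hypothetical in-memory stand-in for an external source such as an MCP server.
FAKE_REMOTE: dict[str, list[str]] = {
    "https://example.invalid/calc": ["add", "multiply"],
}


class RemoteToolsetSketch:
    """Dependency-free stand-in for a Toolset subclass that loads tools dynamically."""

    def __init__(self, url: str) -> None:
        self.url = url
        # Dynamic loading happens in __init__; copy so later remote changes
        # do not silently mutate this instance.
        self.tool_names = list(FAKE_REMOTE[url])

    def to_dict(self) -> dict[str, Any]:
        # Serialize only the endpoint descriptor, not the resolved tools.
        qualified_name = f"{type(self).__module__}.{type(self).__qualname__}"
        return {"type": qualified_name, "data": {"url": self.url}}

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "RemoteToolsetSketch":
        # Re-resolve the tools from the descriptor at deserialization time.
        return cls(url=data["data"]["url"])


toolset = RemoteToolsetSketch("https://example.invalid/calc")
serialized = toolset.to_dict()

# The remote source gains a tool after serialization.
FAKE_REMOTE["https://example.invalid/calc"].append("divide")

restored = RemoteToolsetSketch.from_dict(serialized)
print(restored.tool_names)
```

Because only the URL is stored, the restored instance reflects the current state of the source rather than a stale snapshot of tool definitions. A real subclass would apply the same idea inside `Toolset.to_dict()` and `Toolset.from_dict()`.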