---
title: "Tools"
id: tools-api
description: "Unified abstractions to represent tools across the framework."
slug: "/tools-api"
---

## component_tool

### ComponentTool

Bases: <code>Tool</code>

A Tool that wraps Haystack components, allowing them to be used as tools by LLMs.

ComponentTool automatically generates LLM-compatible tool schemas from component input sockets,
which are derived from the component's `run` method signature and type hints.

Key features:

- Automatic LLM tool calling schema generation from component input sockets
- Type conversion and validation for component inputs
- Support for types:
  - Dataclasses
  - Lists of dataclasses
  - Basic types (str, int, float, bool, dict)
  - Lists of basic types
- Automatic name generation from component class name
- Description extraction from component docstrings

To use ComponentTool, you first need a Haystack component - either an existing one or a new one you create.
You can create a ComponentTool from the component by passing the component to the ComponentTool constructor.
Below is an example of creating a ComponentTool from an existing SerperDevWebSearch component.

## Usage Example:

<!-- test-ignore -->

```python
from haystack import component, Pipeline
from haystack.tools import ComponentTool
from haystack.components.websearch import SerperDevWebSearch
from haystack.utils import Secret
from haystack.components.tools.tool_invoker import ToolInvoker
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

# Create a SerperDev search component
search = SerperDevWebSearch(api_key=Secret.from_env_var("SERPERDEV_API_KEY"), top_k=3)

# Create a tool from the component
tool = ComponentTool(
    component=search,
    name="web_search",  # Optional: defaults to "serper_dev_web_search"
    description="Search the web for current information on any topic"  # Optional: defaults to component docstring
)

# Create pipeline with OpenAIChatGenerator and ToolInvoker
pipeline = Pipeline()
pipeline.add_component("llm", OpenAIChatGenerator(tools=[tool]))
pipeline.add_component("tool_invoker", ToolInvoker(tools=[tool]))

# Connect components
pipeline.connect("llm.replies", "tool_invoker.messages")

message = ChatMessage.from_user("Use the web search tool to find information about Nikola Tesla")

# Run pipeline
result = pipeline.run({"llm": {"messages": [message]}})

print(result)
```

#### __init__

```python
__init__(
    component: Component,
    name: str | None = None,
    description: str | None = None,
    parameters: dict[str, Any] | None = None,
    *,
    outputs_to_string: dict[str, str | Callable[[Any], str]] | None = None,
    inputs_from_state: dict[str, str] | None = None,
    outputs_to_state: dict[str, dict[str, str | Callable]] | None = None
) -> None
```

Create a Tool instance from a Haystack component.

**Parameters:**

- **component** (<code>Component</code>) – The Haystack component to wrap as a tool.
- **name** (<code>str | None</code>) – Optional name for the tool (defaults to snake_case of component class name).
- **description** (<code>str | None</code>) – Optional description (defaults to component's docstring).
- **parameters** (<code>dict\[str, Any\] | None</code>) – A JSON schema defining the parameters expected by the Tool.
  Will fall back to the parameters defined in the component's run method signature if not provided.
- **outputs_to_string** (<code>dict\[str, str | Callable\[\[Any\], str\]\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the
       `handler` if provided. This is intended for tools that return images. In this mode, the Tool
       function or the `handler` function must return a list of `TextContent`/`ImageContent` objects to
       ensure compatibility with Chat Generators.

  2. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.

- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, str | Callable\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

**Raises:**

- <code>TypeError</code> – If the object passed is not a Haystack Component instance.
- <code>ValueError</code> – If the component has already been added to a pipeline, or if schema generation fails.

#### warm_up

```python
warm_up() -> None
```

Prepare the ComponentTool for use.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the ComponentTool to a dictionary.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ComponentTool
```

Deserializes the ComponentTool from a dictionary.

## from_function

### create_tool_from_function

```python
create_tool_from_function(
    function: Callable,
    name: str | None = None,
    description: str | None = None,
    inputs_from_state: dict[str, str] | None = None,
    outputs_to_state: dict[str, dict[str, Any]] | None = None,
    outputs_to_string: dict[str, Any] | None = None,
) -> Tool
```

Create a Tool instance from a function.

Allows customizing the Tool name and description.
For simpler use cases, consider using the `@tool` decorator.

### Usage example

```python
from typing import Annotated, Literal
from haystack.tools import create_tool_from_function

def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Munich",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius"):
    '''A simple function to get the current weather for a location.'''
    return f"Weather report for {city}: 20 {unit}, sunny"

tool = create_tool_from_function(get_weather)

print(tool)
# >> Tool(name='get_weather', description='A simple function to get the current weather for a location.',
# >> parameters={
# >>     'type': 'object',
# >>     'properties': {
# >>         'city': {'type': 'string', 'description': 'the city for which to get the weather', 'default': 'Munich'},
# >>         'unit': {
# >>             'type': 'string',
# >>             'enum': ['Celsius', 'Fahrenheit'],
# >>             'description': 'the unit for the temperature',
# >>             'default': 'Celsius',
# >>         },
# >>     }
# >> },
# >> function=<function get_weather at 0x7f7b3a8a9b80>)
```

**Parameters:**

- **function** (<code>Callable</code>) – The function to be converted into a Tool.
  The function must include type hints for all parameters.
  The function is expected to have basic python input types (str, int, float, bool, list, dict, tuple).
  Other input types may work but are not guaranteed.
  If a parameter is annotated using `typing.Annotated`, its metadata will be used as parameter description.
- **name** (<code>str | None</code>) – The name of the Tool. If not provided, the name of the function will be used.
- **description** (<code>str | None</code>) – The description of the Tool. If not provided, the docstring of the function will be used.
  To intentionally leave the description empty, pass an empty string.
- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

- **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
       tool result is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
       if provided. This is intended for tools that return images. In this mode, the Tool function or the
       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
       Generators.

  2. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.

**Returns:**

- <code>Tool</code> – The Tool created from the function.

**Raises:**

- <code>ValueError</code> – If any parameter of the function lacks a type hint.
- <code>SchemaGenerationError</code> – If there is an error generating the JSON schema for the Tool.

### tool

```python
tool(
    function: Callable | None = None,
    *,
    name: str | None = None,
    description: str | None = None,
    inputs_from_state: dict[str, str] | None = None,
    outputs_to_state: dict[str, dict[str, Any]] | None = None,
    outputs_to_string: dict[str, Any] | None = None
) -> Tool | Callable[[Callable], Tool]
```

Decorator to convert a function into a Tool.

Can be used with or without parameters:

    @tool  # without parameters
    def my_function(): ...

    @tool(name="custom_name")  # with parameters
    def my_function(): ...

### Usage example

```python
from typing import Annotated, Literal
from haystack.tools import tool

@tool
def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Munich",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius"):
    '''A simple function to get the current weather for a location.'''
    return f"Weather report for {city}: 20 {unit}, sunny"

print(get_weather)
# >> Tool(name='get_weather', description='A simple function to get the current weather for a location.',
# >> parameters={
# >>     'type': 'object',
# >>     'properties': {
# >>         'city': {'type': 'string', 'description': 'the city for which to get the weather', 'default': 'Munich'},
# >>         'unit': {
# >>             'type': 'string',
# >>             'enum': ['Celsius', 'Fahrenheit'],
# >>             'description': 'the unit for the temperature',
# >>             'default': 'Celsius',
# >>         },
# >>     }
# >> },
# >> function=<function get_weather at 0x7f7b3a8a9b80>)
```

**Parameters:**

- **function** (<code>Callable | None</code>) – The function to decorate (when used without parameters)
- **name** (<code>str | None</code>) – Optional custom name for the tool
- **description** (<code>str | None</code>) – Optional custom description
- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

- **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
       tool result is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
       if provided. This is intended for tools that return images. In this mode, the Tool function or the
       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
       Generators.

  2. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.
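
The two `outputs_to_string` formats described above can be hard to tell apart at a glance. The following is a minimal pure-Python sketch of the documented rules, not the library's implementation: `render_outputs` and `_extract` are hypothetical helper names, `str()` stands in for the unspecified default handler, and the return shape for the multiple output format (a dict of strings) is an assumption.

```python
from typing import Any, Optional

SINGLE_FORMAT_KEYS = {"source", "handler", "raw_result"}


def _extract(result: dict[str, Any], entry: dict[str, Any]) -> Any:
    # `source` narrows the result to one output key; otherwise the whole result is used.
    value = result[entry["source"]] if "source" in entry else result
    handler = entry.get("handler")
    return handler(value) if handler else value


def render_outputs(result: dict[str, Any], config: Optional[dict[str, Any]]) -> Any:
    """Hypothetical helper that mimics the documented `outputs_to_string` rules."""
    if not config:
        return str(result)  # stand-in for the default handler (assumption)
    if set(config) <= SINGLE_FORMAT_KEYS:
        # Single output format: configuration keys sit at the root level.
        value = _extract(result, config)
        return value if config.get("raw_result", False) else str(value)
    # Multiple output format: one configuration per key, `raw_result` is ignored.
    return {name: str(_extract(result, entry)) for name, entry in config.items()}


tool_result = {"docs": ["a", "b"], "summary_text": "short"}

single = render_outputs(tool_result, {"source": "docs", "handler": ", ".join})
multi = render_outputs(
    tool_result,
    {
        "formatted_docs": {"source": "docs", "handler": ", ".join},
        "summary": {"source": "summary_text", "handler": str.upper},
    },
)
print(single)
print(multi)
```

The sketch distinguishes the formats by checking whether all root-level keys belong to the single-format set; the library may use a different rule.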

**Returns:**

- <code>Tool | Callable\[\[Callable\], Tool\]</code> – Either a Tool instance or a decorator function that will create one

## pipeline_tool

### PipelineTool

Bases: <code>ComponentTool</code>

A Tool that wraps Haystack Pipelines, allowing them to be used as tools by LLMs.

PipelineTool automatically generates LLM-compatible tool schemas from pipeline input sockets,
which are derived from the underlying components in the pipeline.

Key features:

- Automatic LLM tool calling schema generation from pipeline inputs
- Description extraction of pipeline inputs based on the underlying component docstrings

To use PipelineTool, you first need a Haystack pipeline.
Below is an example of creating a PipelineTool.

## Usage Example:

```python
from haystack import Document, Pipeline
from haystack.dataclasses import ChatMessage
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.embedders.sentence_transformers_text_embedder import SentenceTransformersTextEmbedder
from haystack.components.embedders.sentence_transformers_document_embedder import (
    SentenceTransformersDocumentEmbedder
)
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.retrievers import InMemoryEmbeddingRetriever
from haystack.components.agents import Agent
from haystack.tools import PipelineTool

# Initialize a document store and add some documents
document_store = InMemoryDocumentStore()
document_embedder = SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
documents = [
    Document(content="Nikola Tesla was a Serbian-American inventor and electrical engineer."),
    Document(
        content="He is best known for his contributions to the design of the modern alternating current (AC) "
                "electricity supply system."
    ),
]
docs_with_embeddings = document_embedder.run(documents=documents)["documents"]
document_store.write_documents(docs_with_embeddings)

# Build a simple retrieval pipeline
retrieval_pipeline = Pipeline()
retrieval_pipeline.add_component(
    "embedder", SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
)
retrieval_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))

retrieval_pipeline.connect("embedder.embedding", "retriever.query_embedding")

# Wrap the pipeline as a tool
retriever_tool = PipelineTool(
    pipeline=retrieval_pipeline,
    input_mapping={"query": ["embedder.text"]},
    output_mapping={"retriever.documents": "documents"},
    name="document_retriever",
    description="For any questions about Nikola Tesla, always use this tool",
)

# Create an Agent with the tool
agent = Agent(
    chat_generator=OpenAIChatGenerator(model="gpt-4.1-mini"),
    tools=[retriever_tool]
)

# Let the Agent handle a query
result = agent.run([ChatMessage.from_user("Who was Nikola Tesla?")])

# Print result of the tool call
print("Tool Call Result:")
print(result["messages"][2].tool_call_result.result)
print("")

# Print answer
print("Answer:")
print(result["messages"][-1].text)
```

#### __init__

```python
__init__(
    pipeline: Pipeline | AsyncPipeline,
    *,
    name: str,
    description: str,
    input_mapping: dict[str, list[str]] | None = None,
    output_mapping: dict[str, str] | None = None,
    parameters: dict[str, Any] | None = None,
    outputs_to_string: dict[str, str | Callable[[Any], str]] | None = None,
    inputs_from_state: dict[str, str] | None = None,
    outputs_to_state: dict[str, dict[str, str | Callable]] | None = None
) -> None
```

Create a Tool instance from a Haystack pipeline.

**Parameters:**

- **pipeline** (<code>Pipeline | AsyncPipeline</code>) – The Haystack pipeline to wrap as a tool.
- **name** (<code>str</code>) – Name of the tool.
- **description** (<code>str</code>) – Description of the tool.
- **input_mapping** (<code>dict\[str, list\[str\]\] | None</code>) – A dictionary mapping component input names to pipeline input socket paths.
  If not provided, a default input mapping will be created based on all pipeline inputs.
  Example:

  ```python
  input_mapping={
      "query": ["retriever.query", "prompt_builder.query"],
  }
  ```

- **output_mapping** (<code>dict\[str, str\] | None</code>) – A dictionary mapping pipeline output socket paths to component output names.
  If not provided, a default output mapping will be created based on all pipeline outputs.
  Example:

  ```python
  output_mapping={
      "retriever.documents": "documents",
      "generator.replies": "replies",
  }
  ```

- **parameters** (<code>dict\[str, Any\] | None</code>) – A JSON schema defining the parameters expected by the Tool.
  Will fall back to the parameters defined in the component's run method signature if not provided.
- **outputs_to_string** (<code>dict\[str, str | Callable\[\[Any\], str\]\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the
       `handler` if provided. This is intended for tools that return images. In this mode, the Tool
       function or the `handler` function must return a list of `TextContent`/`ImageContent` objects to
       ensure compatibility with Chat Generators.

  2. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.

- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, str | Callable\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

**Raises:**

- <code>ValueError</code> – If the provided pipeline is not a valid Haystack Pipeline instance.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the PipelineTool to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – The serialized dictionary representation of PipelineTool.
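
The `input_mapping` parameter of `PipelineTool.__init__` lets one tool argument feed several pipeline input sockets. The sketch below illustrates that fan-out in plain Python. It assumes the nested `{component: {input: value}}` shape that `Pipeline.run` accepts in the usage examples; `build_pipeline_inputs` is a hypothetical helper, and how PipelineTool performs this step internally is not stated in this reference.

```python
from typing import Any


def build_pipeline_inputs(
    tool_args: dict[str, Any], input_mapping: dict[str, list[str]]
) -> dict[str, dict[str, Any]]:
    """Hypothetical helper: expand tool arguments into nested pipeline inputs."""
    pipeline_inputs: dict[str, dict[str, Any]] = {}
    for arg_name, value in tool_args.items():
        # Each socket path has the form "<component_name>.<input_name>".
        for socket_path in input_mapping.get(arg_name, []):
            component_name, input_name = socket_path.split(".", 1)
            pipeline_inputs.setdefault(component_name, {})[input_name] = value
    return pipeline_inputs


inputs = build_pipeline_inputs(
    {"query": "Who was Nikola Tesla?"},
    {"query": ["retriever.query", "prompt_builder.query"]},
)
print(inputs)
```

Here the single `query` argument ends up under both the `retriever` and the `prompt_builder` component, so the LLM only has to supply it once.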

#### from_dict

```python
from_dict(data: dict[str, Any]) -> PipelineTool
```

Deserializes the PipelineTool from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary representation of PipelineTool.

**Returns:**

- <code>PipelineTool</code> – The deserialized PipelineTool instance.

## searchable_toolset

### SearchableToolset

Bases: <code>Toolset</code>

Dynamic tool discovery from large catalogs using BM25 search.

This Toolset enables LLMs to discover and use tools from large catalogs through
BM25-based search. Instead of exposing all tools at once (which can overwhelm the
LLM context), it provides a `search_tools` bootstrap tool that allows the LLM to
find and load specific tools as needed.

For very small catalogs (below `search_threshold`), it acts as a simple passthrough,
exposing all tools directly without any discovery mechanism.

### Usage Example

```python
from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool, SearchableToolset

# Create a catalog of tools
catalog = [
    Tool(name="get_weather", description="Get weather for a city",
         parameters={}, function=lambda: None),
    Tool(name="search_web", description="Search the web",
         parameters={}, function=lambda: None),
    # ... 100s more tools
]
toolset = SearchableToolset(catalog=catalog)

agent = Agent(chat_generator=OpenAIChatGenerator(), tools=toolset)

# The agent is initially provided only with the search_tools tool and will use it to find relevant tools.
result = agent.run(messages=[ChatMessage.from_user("What's the weather in Milan?")])
```

#### __init__

```python
__init__(
    catalog: ToolsType,
    *,
    top_k: int = 3,
    search_threshold: int = 8,
    search_tool_name: str = "search_tools",
    search_tool_description: str | None = None,
    search_tool_parameters_description: dict[str, str] | None = None
) -> None
```

Initialize the SearchableToolset.

**Parameters:**

- **catalog** (<code>ToolsType</code>) – Source of tools - a list of Tools, list of Toolsets, or a single Toolset.
- **top_k** (<code>int</code>) – Default number of results for search_tools.
- **search_threshold** (<code>int</code>) – Minimum catalog size to activate search.
  If catalog has fewer tools, acts as passthrough (all tools visible).
  Default is 8.
- **search_tool_name** (<code>str</code>) – Custom name for the bootstrap search tool. Default is "search_tools".
- **search_tool_description** (<code>str | None</code>) – Custom description for the bootstrap search tool.
  If not provided, uses a default description.
- **search_tool_parameters_description** (<code>dict\[str, str\] | None</code>) – Custom descriptions for the bootstrap search tool's parameters.
  Keys must be a subset of `{"tool_keywords", "k"}`.
  Example: `{"tool_keywords": "Keywords to find tools, e.g. 'email send'"}`

#### add

```python
add(tool: Tool | Toolset) -> None
```

Adding new tools after initialization is not supported for SearchableToolset.

#### warm_up

```python
warm_up() -> None
```

Prepare the toolset for use.

Warms up child toolsets first (so lazy toolsets like MCPToolset can connect),
then flattens the catalog, indexes it, and creates the search_tools bootstrap tool.
In passthrough mode, it warms up all catalog tools directly.
Must be called before using the toolset with an Agent.
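
To give an intuition for the BM25-based lookup that `search_tools` relies on, here is a minimal, self-contained Okapi BM25 ranking over tool names and descriptions. It is an illustrative sketch only: `tokenize`, `bm25_rank`, the `k1`/`b` defaults, and the choice to index the name together with the description are assumptions, not the indexing that SearchableToolset actually performs.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Lowercase, split on whitespace, and treat underscores as word separators.
    return text.lower().replace("_", " ").split()


def bm25_rank(
    query: str, docs: dict[str, str], top_k: int = 3, k1: float = 1.5, b: float = 0.75
) -> list[str]:
    """Return up to `top_k` document names with a positive Okapi BM25 score."""
    if not docs:
        return []
    tokens = {name: tokenize(text) for name, text in docs.items()}
    avg_len = sum(len(t) for t in tokens.values()) / len(tokens)
    # Document frequency: the number of documents in which each term occurs.
    df = Counter(term for t in tokens.values() for term in set(t))
    scores: dict[str, float] = {}
    for name, doc_tokens in tokens.items():
        tf = Counter(doc_tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (len(tokens) - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc_tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[name] = score
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [name for name in ranked if scores[name] > 0][:top_k]


# Each entry indexes the tool name together with its description (an assumption).
catalog_index = {
    "get_weather": "get_weather Get weather for a city",
    "search_web": "search_web Search the web",
    "send_email": "send_email Send an email to a recipient",
}
matches = bm25_rank("weather city", catalog_index, top_k=2)
print(matches)
```

Only tools that share at least one term with the query are returned, which is why keyword-rich tool names and descriptions matter for discovery.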

#### clear

```python
clear() -> None
```

Clear all discovered tools.

This method allows resetting the toolset's discovered tools between agent runs
when the same toolset instance is reused. This can be useful for long-running
applications to control memory usage or to start fresh searches.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serialize the toolset to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary representation of the toolset.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> SearchableToolset
```

Deserialize a toolset from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the toolset.

**Returns:**

- <code>SearchableToolset</code> – New SearchableToolset instance.

## tool

### Tool

Data class representing a Tool that Language Models can prepare a call for.

Accurate definitions of the textual attributes such as `name` and `description`
are important for the Language Model to correctly prepare the call.

For resource-intensive operations like establishing connections to remote services or
loading models, override the `warm_up()` method. This method is called before the Tool
is used and should be idempotent, as it may be called multiple times during
pipeline/agent setup.

**Parameters:**

- **name** (<code>str</code>) – Name of the Tool.
- **description** (<code>str</code>) – Description of the Tool.
- **parameters** (<code>dict\[str, Any\]</code>) – A JSON schema defining the parameters expected by the Tool.
- **function** (<code>Callable</code>) – The function that will be invoked when the Tool is called.
  Must be a synchronous function; async functions are not supported.
- **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
       tool result is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
       if provided. This is intended for tools that return images. In this mode, the Tool function or the
       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
       Generators.

  2. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.

- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

**Raises:**

- <code>ValueError</code> – If `function` is async, if `parameters` is not a valid JSON schema, or if the
  `outputs_to_state`, `outputs_to_string`, or `inputs_from_state` configurations are invalid.
- <code>TypeError</code> – If any configuration value in `outputs_to_state`, `outputs_to_string`, or
  `inputs_from_state` has the wrong type.

#### tool_spec

```python
tool_spec: dict[str, Any]
```

Return the Tool specification to be used by the Language Model.

#### warm_up

```python
warm_up() -> None
```

Prepare the Tool for use.

Override this method to establish connections to remote services, load models,
or perform other resource-intensive initialization. This method should be idempotent,
as it may be called multiple times.

#### invoke

```python
invoke(**kwargs: Any) -> Any
```

Invoke the Tool with the provided keyword arguments.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the Tool to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary with serialized data.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> Tool
```

Deserializes the Tool from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.

**Returns:**

- <code>Tool</code> – Deserialized Tool.
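
The idempotency requirement on `warm_up()` is easiest to see in code. The following sketch uses a plain Python class rather than a real `Tool` subclass so that it runs without Haystack installed; `LazyResourceTool`, `_client`, and `connect_calls` are hypothetical names, and the `object()` placeholder stands in for a real connection or model.

```python
class LazyResourceTool:
    """Hypothetical stand-in for a Tool subclass with an expensive setup step."""

    def __init__(self) -> None:
        self._client = None  # created lazily in warm_up()
        self.connect_calls = 0  # counts how often the expensive setup really ran

    def warm_up(self) -> None:
        # Guard clause: repeated calls return early, which makes warm_up idempotent.
        if self._client is not None:
            return
        self.connect_calls += 1
        self._client = object()  # placeholder for a real connection or loaded model


lazy_tool = LazyResourceTool()
lazy_tool.warm_up()
lazy_tool.warm_up()  # second call is a no-op
print(lazy_tool.connect_calls)
```

With a guard like this, it does not matter how many times pipeline or agent setup calls `warm_up()`; the expensive step runs once.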

## toolset

### Toolset

A collection of related Tools that can be used and managed as a cohesive unit.

Toolset serves two main purposes:

1. Group related tools together:
   Toolset allows you to organize related tools into a single collection, making it easier
   to manage and use them as a unit in Haystack pipelines.

   Example:

   ```python
   from haystack.tools import Tool, Toolset
   from haystack.components.tools import ToolInvoker

   # Define math functions
   def add_numbers(a: int, b: int) -> int:
       return a + b

   def subtract_numbers(a: int, b: int) -> int:
       return a - b

   # Create tools with proper schemas
   add_tool = Tool(
       name="add",
       description="Add two numbers",
       parameters={
           "type": "object",
           "properties": {
               "a": {"type": "integer"},
               "b": {"type": "integer"}
           },
           "required": ["a", "b"]
       },
       function=add_numbers
   )

   subtract_tool = Tool(
       name="subtract",
       description="Subtract b from a",
       parameters={
           "type": "object",
           "properties": {
               "a": {"type": "integer"},
               "b": {"type": "integer"}
           },
           "required": ["a", "b"]
       },
       function=subtract_numbers
   )

   # Create a toolset with the math tools
   math_toolset = Toolset([add_tool, subtract_tool])

   # Use the toolset with a ToolInvoker or ChatGenerator component
   invoker = ToolInvoker(tools=math_toolset)
   ```

2. Base class for dynamic tool loading:
   By subclassing Toolset, you can create implementations that dynamically load tools
   from external sources like OpenAPI URLs, MCP servers, or other resources.

   Example:

   ```python
   from haystack.core.serialization import generate_qualified_class_name
   from haystack.tools import Tool, Toolset
   from haystack.components.tools import ToolInvoker

   class CalculatorToolset(Toolset):
       '''A toolset for calculator operations.'''

       def __init__(self) -> None:
           tools = self._create_tools()
           super().__init__(tools)

       def _create_tools(self):
           # These Tool instances are obviously defined statically and for illustration purposes only.
           # In a real-world scenario, you would dynamically load tools from an external source here.
           tools = []
           add_tool = Tool(
               name="add",
               description="Add two numbers",
               parameters={
                   "type": "object",
                   "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
                   "required": ["a", "b"],
               },
               function=lambda a, b: a + b,
           )

           multiply_tool = Tool(
               name="multiply",
               description="Multiply two numbers",
               parameters={
                   "type": "object",
                   "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
                   "required": ["a", "b"],
               },
               function=lambda a, b: a * b,
           )

           tools.append(add_tool)
           tools.append(multiply_tool)

           return tools

       def to_dict(self):
           return {
               "type": generate_qualified_class_name(type(self)),
               "data": {},  # no data to serialize as we define the tools dynamically
           }

       @classmethod
       def from_dict(cls, data):
           return cls()  # Recreate the tools dynamically during deserialization

   # Create the dynamic toolset and use it with ToolInvoker
   calculator_toolset = CalculatorToolset()
   invoker = ToolInvoker(tools=calculator_toolset)
   ```

Toolset implements the collection interface (`__iter__`, `__contains__`, `__len__`, `__getitem__`),
making it behave like a list of Tools.
This makes it compatible with components that expect
iterable tools, such as ToolInvoker or Haystack chat generators.

When implementing a custom Toolset subclass for dynamic tool loading:

- Perform the dynamic loading in the `__init__` method
- Override `to_dict()` and `from_dict()` methods if your tools are defined dynamically
- Serialize endpoint descriptors rather than tool instances if your tools
  are loaded from external sources

#### warm_up

```python
warm_up() -> None
```

Prepare the Toolset for use.

By default, this method iterates through and warms up all tools in the Toolset.
Subclasses can override this method to customize initialization behavior, such as:

- Setting up shared resources (database connections, HTTP sessions) instead of
  warming individual tools
- Implementing custom initialization logic for dynamically loaded tools
- Controlling when and how tools are initialized

For example, a Toolset that manages tools from an external service (like MCPToolset)
might override this to initialize a shared connection rather than warming up
individual tools:

```python
class MCPToolset(Toolset):
    def warm_up(self) -> None:
        # Only warm up the shared MCP connection, not individual tools
        self.mcp_connection = establish_connection(self.server_url)
```

This method should be idempotent, as it may be called multiple times.

#### add

```python
add(tool: Union[Tool, Toolset]) -> None
```

Add a new Tool or merge another Toolset.

**Parameters:**

- **tool** (<code>Union\[Tool, Toolset\]</code>) – A Tool instance or another Toolset to add.

**Raises:**

- <code>ValueError</code> – If adding the tool would result in duplicate tool names.
- <code>TypeError</code> – If the provided object is not a Tool or Toolset.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serialize the Toolset to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary representation of the Toolset.

Note for subclass implementers:
The default implementation is ideal for scenarios where Tool resolution is static. However, if your subclass
of Toolset dynamically resolves Tool instances from external sources (such as an MCP server, an OpenAPI URL, or
a local OpenAPI specification), you should consider serializing the endpoint descriptor instead of the Tool
instances themselves. This strategy preserves the dynamic nature of your Toolset and minimizes the overhead
associated with serializing potentially large collections of Tool objects. Moreover, by serializing the
descriptor, you ensure that the deserialization process can accurately reconstruct the Tool instances, even
if they have been modified or removed since the last serialization. Failing to serialize the descriptor may
lead to issues where outdated or incorrect Tool configurations are loaded, potentially causing errors or
unexpected behavior.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> Toolset
```

Deserialize a Toolset from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the Toolset.

**Returns:**

- <code>Toolset</code> – A new Toolset instance.
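
The note for subclass implementers recommends serializing an endpoint descriptor rather than the resolved tools. The following is a minimal plain-Python sketch of that idea and does not use Haystack classes: `RemoteToolsetSketch`, `FAKE_SERVER_TOOLS`, `load_tool_names`, and the URL are all hypothetical stand-ins. A real subclass would inherit from `Toolset` and would more likely use `generate_qualified_class_name` for the `"type"` field, as in the `CalculatorToolset` example.

```python
from __future__ import annotations

from typing import Any

# Hypothetical stand-in for an external source (for example, an MCP server).
FAKE_SERVER_TOOLS: dict[str, list[str]] = {
    "https://example.invalid/math": ["add", "multiply"],
}


def load_tool_names(server_url: str) -> list[str]:
    """Pretend to fetch the tool names the server currently exposes."""
    return list(FAKE_SERVER_TOOLS[server_url])


class RemoteToolsetSketch:
    """Plain-Python stand-in for a Toolset subclass with dynamically loaded tools."""

    def __init__(self, server_url: str) -> None:
        self.server_url = server_url
        # Dynamic loading happens in __init__, as the guidelines suggest.
        self.tool_names = load_tool_names(server_url)

    def to_dict(self) -> dict[str, Any]:
        # Serialize only the descriptor (the URL), not the loaded tools.
        return {"type": type(self).__name__, "data": {"server_url": self.server_url}}

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> RemoteToolsetSketch:
        # Re-resolve the tools from the source at deserialization time.
        return cls(**data["data"])


original = RemoteToolsetSketch("https://example.invalid/math")
serialized = original.to_dict()

# The external source changes after serialization.
FAKE_SERVER_TOOLS["https://example.invalid/math"].append("divide")

# The restored toolset reflects the current state of the source.
restored = RemoteToolsetSketch.from_dict(serialized)
```

Because only the descriptor is stored, `restored` picks up the tool added after serialization, while `original` still holds the list it loaded at construction time.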