---
title: "Tools"
id: tools-api
description: "Unified abstractions to represent tools across the framework."
slug: "/tools-api"
---
   7  
   8  
   9  ## component_tool
  10  
  11  ### ComponentTool
  12  
  13  Bases: <code>Tool</code>
  14  
  15  A Tool that wraps Haystack components, allowing them to be used as tools by LLMs.
  16  
  17  ComponentTool automatically generates LLM-compatible tool schemas from component input sockets,
  18  which are derived from the component's `run` method signature and type hints.
  19  
  20  Key features:
  21  
  22  - Automatic LLM tool calling schema generation from component input sockets
  23  - Type conversion and validation for component inputs
  24  - Support for types:
  25    - Dataclasses
  26    - Lists of dataclasses
  27    - Basic types (str, int, float, bool, dict)
  28    - Lists of basic types
  29  - Automatic name generation from component class name
  30  - Description extraction from component docstrings
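The default tool name quoted in the usage example (`serper_dev_web_search` for `SerperDevWebSearch`) suggests a CamelCase to snake_case conversion. The following is a minimal sketch of that idea, assuming a simple regex is sufficient. `to_snake_case` is a hypothetical helper, not part of Haystack, and the library's own conversion may treat acronyms or digits differently.

```python
import re


def to_snake_case(class_name: str) -> str:
    """Insert an underscore before every capital letter except the first, then lowercase."""
    # Hypothetical helper for illustration only; not Haystack's implementation.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


default_name = to_snake_case("SerperDevWebSearch")
print(default_name)
```

Passing `name=` explicitly, as the usage example does with `"web_search"`, sidesteps any surprises from automatic naming.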
  31  
  32  To use ComponentTool, you first need a Haystack component - either an existing one or a new one you create.
  33  You can create a ComponentTool from the component by passing the component to the ComponentTool constructor.
  34  Below is an example of creating a ComponentTool from an existing SerperDevWebSearch component.
  35  
### Usage example
  37  
  38  <!-- test-ignore -->
  39  
```python
from haystack import Pipeline
from haystack.tools import ComponentTool
from haystack.components.websearch import SerperDevWebSearch
from haystack.utils import Secret
from haystack.components.tools.tool_invoker import ToolInvoker
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

# Create a SerperDev search component
search = SerperDevWebSearch(api_key=Secret.from_env_var("SERPERDEV_API_KEY"), top_k=3)

# Create a tool from the component
tool = ComponentTool(
    component=search,
    name="web_search",  # Optional: defaults to "serper_dev_web_search"
    description="Search the web for current information on any topic"  # Optional: defaults to component docstring
)

# Create pipeline with OpenAIChatGenerator and ToolInvoker
pipeline = Pipeline()
pipeline.add_component("llm", OpenAIChatGenerator(tools=[tool]))
pipeline.add_component("tool_invoker", ToolInvoker(tools=[tool]))

# Connect components
pipeline.connect("llm.replies", "tool_invoker.messages")

message = ChatMessage.from_user("Use the web search tool to find information about Nikola Tesla")

# Run pipeline
result = pipeline.run({"llm": {"messages": [message]}})

print(result)
```
  74  
  75  #### __init__
  76  
  77  ```python
  78  __init__(
  79      component: Component,
  80      name: str | None = None,
  81      description: str | None = None,
  82      parameters: dict[str, Any] | None = None,
  83      *,
  84      outputs_to_string: dict[str, str | Callable[[Any], str]] | None = None,
  85      inputs_from_state: dict[str, str] | None = None,
  86      outputs_to_state: dict[str, dict[str, str | Callable]] | None = None
  87  ) -> None
  88  ```
  89  
  90  Create a Tool instance from a Haystack component.
  91  
  92  **Parameters:**
  93  
  94  - **component** (<code>Component</code>) – The Haystack component to wrap as a tool.
  95  - **name** (<code>str | None</code>) – Optional name for the tool (defaults to snake_case of component class name).
  96  - **description** (<code>str | None</code>) – Optional description (defaults to component's docstring).
  97  - **parameters** (<code>dict\[str, Any\] | None</code>) – A JSON schema defining the parameters expected by the Tool.
  98    Will fall back to the parameters defined in the component's run method signature if not provided.
- **outputs_to_string** (<code>dict\[str, str | Callable\[\[Any\], str\]\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
 100    If not provided, the tool result is converted to a string using a default handler.
 101  
 102  `outputs_to_string` supports two formats:
 103  
 104  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 105  
 106     ```python
 107     {
 108         "source": "docs", "handler": format_documents, "raw_result": False
 109     }
 110     ```
 111  
 112     - `source`: If provided, only the specified output key is sent to the handler.
 113     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 114       final result.
 115     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the
 116       `handler` if provided. This is intended for tools that return images. In this mode, the Tool
 117       function or the `handler` function must return a list of `TextContent`/`ImageContent` objects to
 118       ensure compatibility with Chat Generators.
 119  
 120  1. Multiple output format - map keys to individual configurations:
 121  
 122     ```python
 123     {
 124         "formatted_docs": {"source": "docs", "handler": format_documents},
 125         "summary": {"source": "summary_text", "handler": str.upper}
 126     }
 127     ```
 128  
 129     Each key maps to a dictionary that can contain "source" and/or "handler".
 130     Note that `raw_result` is not supported in the multiple output format.
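To make the single output format concrete, here is a minimal sketch of how a `source`/`handler`/`raw_result` configuration could be applied to a tool result. Both `format_documents` (its body) and `render_single_output` are hypothetical and written only for illustration. Documents are modelled as plain dictionaries, and the order of operations is an assumption based on the description above, not Haystack's actual code.

```python
from typing import Any


def format_documents(docs: list) -> str:
    """Hypothetical handler: turn a list of document dicts into a numbered list."""
    return "\n".join(f"{i}. {doc['content']}" for i, doc in enumerate(docs, start=1))


def render_single_output(result: dict, config: dict) -> Any:
    """Illustrative only: pick `source`, apply `handler`, stringify unless `raw_result` is set."""
    value = result[config["source"]] if "source" in config else result
    handler = config.get("handler")
    if handler is not None:
        value = handler(value)
    return value if config.get("raw_result", False) else str(value)


tool_result = {"docs": [{"content": "Tesla was an inventor."}, {"content": "He worked on AC."}]}
config = {"source": "docs", "handler": format_documents, "raw_result": False}

text = render_single_output(tool_result, config)
print(text)
```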
 131  
 132  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 133    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 134  - **outputs_to_state** (<code>dict\[str, dict\[str, str | Callable\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided, only the specified output key is sent to the handler.
 136    Example:
 137  
 138  ```python
 139  {
 140      "documents": {"source": "docs", "handler": custom_handler}
 141  }
 142  ```
 143  
If the source is omitted, the whole tool result is sent to the handler.
 145  Example:
 146  
 147  ```python
 148  {
 149      "documents": {"handler": custom_handler}
 150  }
 151  ```
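The two state mappings can be pictured with a plain dictionary standing in for the state object. The sketch below uses hypothetical helpers (`inject_inputs`, `write_outputs`, `fake_tool`) and a `custom_handler` body invented for this example. It also assumes that arguments supplied by the LLM take precedence over state values, which the text above does not specify.

```python
def inject_inputs(state: dict, llm_args: dict, inputs_from_state: dict) -> dict:
    """Fill tool parameters from state: state key -> tool parameter name."""
    args = dict(llm_args)
    for state_key, param_name in inputs_from_state.items():
        # Assumption: only fill parameters the LLM did not provide itself.
        if param_name not in args and state_key in state:
            args[param_name] = state[state_key]
    return args


def write_outputs(state: dict, result: dict, outputs_to_state: dict) -> None:
    """Store tool outputs in state, optionally via a `source` key and a `handler`."""
    for state_key, cfg in outputs_to_state.items():
        value = result[cfg["source"]] if "source" in cfg else result
        handler = cfg.get("handler")
        state[state_key] = handler(value) if handler else value


def custom_handler(docs: list) -> list:
    """Hypothetical handler: normalize documents before storing them."""
    return [doc.upper() for doc in docs]


def fake_tool(query: str, repo: str) -> dict:
    """Hypothetical tool function returning a dict with a `docs` key."""
    return {"docs": [f"{repo}: {query}"]}


state = {"repository": "example-org/example-repo"}
args = inject_inputs(state, {"query": "open issues"}, {"repository": "repo"})
write_outputs(state, fake_tool(**args), {"documents": {"source": "docs", "handler": custom_handler}})
print(args)
print(state["documents"])
```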
 152  
 153  **Raises:**
 154  
 155  - <code>TypeError</code> – If the object passed is not a Haystack Component instance.
 156  - <code>ValueError</code> – If the component has already been added to a pipeline, or if schema generation fails.
 157  
 158  #### warm_up
 159  
 160  ```python
 161  warm_up() -> None
 162  ```
 163  
 164  Prepare the ComponentTool for use.
 165  
 166  #### to_dict
 167  
 168  ```python
 169  to_dict() -> dict[str, Any]
 170  ```
 171  
 172  Serializes the ComponentTool to a dictionary.
 173  
 174  #### from_dict
 175  
 176  ```python
 177  from_dict(data: dict[str, Any]) -> ComponentTool
 178  ```
 179  
 180  Deserializes the ComponentTool from a dictionary.
 181  
 182  ## from_function
 183  
 184  ### create_tool_from_function
 185  
 186  ```python
 187  create_tool_from_function(
 188      function: Callable,
 189      name: str | None = None,
 190      description: str | None = None,
 191      inputs_from_state: dict[str, str] | None = None,
 192      outputs_to_state: dict[str, dict[str, Any]] | None = None,
 193      outputs_to_string: dict[str, Any] | None = None,
 194  ) -> Tool
 195  ```
 196  
 197  Create a Tool instance from a function.
 198  
 199  Allows customizing the Tool name and description.
 200  For simpler use cases, consider using the `@tool` decorator.
 201  
 202  ### Usage example
 203  
```python
from typing import Annotated, Literal
from haystack.tools import create_tool_from_function

def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Munich",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius"):
    '''A simple function to get the current weather for a location.'''
    return f"Weather report for {city}: 20 {unit}, sunny"

tool = create_tool_from_function(get_weather)

print(tool)
# >> Tool(name='get_weather', description='A simple function to get the current weather for a location.',
# >> parameters={
# >> 'type': 'object',
# >> 'properties': {
# >>     'city': {'type': 'string', 'description': 'the city for which to get the weather', 'default': 'Munich'},
# >>     'unit': {
# >>         'type': 'string',
# >>         'enum': ['Celsius', 'Fahrenheit'],
# >>         'description': 'the unit for the temperature',
# >>         'default': 'Celsius',
# >>     },
# >>     }
# >> },
# >> function=<function get_weather at 0x7f7b3a8a9b80>)
```
 232  
 233  **Parameters:**
 234  
 235  - **function** (<code>Callable</code>) – The function to be converted into a Tool.
 236    The function must include type hints for all parameters.
 237    The function is expected to have basic python input types (str, int, float, bool, list, dict, tuple).
 238    Other input types may work but are not guaranteed.
 239    If a parameter is annotated using `typing.Annotated`, its metadata will be used as parameter description.
 240  - **name** (<code>str | None</code>) – The name of the Tool. If not provided, the name of the function will be used.
 241  - **description** (<code>str | None</code>) – The description of the Tool. If not provided, the docstring of the function will be used.
 242    To intentionally leave the description empty, pass an empty string.
 243  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 244    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 245  - **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided, only the specified output key is sent to the handler.
 247    Example:
 248  
 249  ```python
 250  {
 251      "documents": {"source": "docs", "handler": custom_handler}
 252  }
 253  ```
 254  
If the source is omitted, the whole tool result is sent to the handler.
 256  Example:
 257  
 258  ```python
 259  {
 260      "documents": {"handler": custom_handler}
 261  }
 262  ```
 263  
 264  - **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
 265    If not provided, the tool result is converted to a string using a default handler.
 266  
 267  `outputs_to_string` supports two formats:
 268  
 269  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 270  
 271     ```python
 272     {
 273         "source": "docs", "handler": format_documents, "raw_result": False
 274     }
 275     ```
 276  
 277     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
 278       tool result is sent to the handler.
 279     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 280       final result.
 281     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
 282       if provided. This is intended for tools that return images. In this mode, the Tool function or the
 283       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
 284       Generators.
 285  
 286  1. Multiple output format - map keys to individual configurations:
 287  
 288     ```python
 289     {
 290         "formatted_docs": {"source": "docs", "handler": format_documents},
 291         "summary": {"source": "summary_text", "handler": str.upper}
 292     }
 293     ```
 294  
 295     Each key maps to a dictionary that can contain "source" and/or "handler".
 296     Note that `raw_result` is not supported in the multiple output format.
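The multiple output format can be read as "one small configuration per output key". The sketch below illustrates that reading with a hypothetical `render_multiple_outputs` helper and a stand-in `format_documents` handler. Returning a dictionary of strings is an assumption made for this example, and the way Haystack assembles the final tool message may differ.

```python
def format_documents(docs: list) -> str:
    """Hypothetical handler: join document texts into one line."""
    return "; ".join(docs)


def render_multiple_outputs(result: dict, config: dict) -> dict:
    """Illustrative only: apply each key's `source` and `handler`, then stringify."""
    rendered = {}
    for key, cfg in config.items():
        value = result[cfg["source"]] if "source" in cfg else result
        handler = cfg.get("handler")
        rendered[key] = str(handler(value) if handler else value)
    return rendered


tool_result = {"docs": ["doc a", "doc b"], "summary_text": "two documents found"}
config = {
    "formatted_docs": {"source": "docs", "handler": format_documents},
    "summary": {"source": "summary_text", "handler": str.upper},
}

rendered = render_multiple_outputs(tool_result, config)
print(rendered)
```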
 297  
 298  **Returns:**
 299  
 300  - <code>Tool</code> – The Tool created from the function.
 301  
 302  **Raises:**
 303  
 304  - <code>ValueError</code> – If any parameter of the function lacks a type hint.
 305  - <code>SchemaGenerationError</code> – If there is an error generating the JSON schema for the Tool.
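The relationship between type hints, `typing.Annotated` metadata, and the generated `parameters` schema can be sketched with the standard `typing` and `inspect` modules. `sketch_schema` below is a hypothetical, heavily simplified illustration: it assumes string-valued `Literal` types and only a handful of basic types. It is not the schema generator Haystack uses.

```python
import inspect
from typing import Annotated, Any, Literal, get_args, get_origin, get_type_hints

# Assumed mapping from basic Python types to JSON schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean", list: "array", dict: "object"}


def sketch_schema(function) -> dict:
    """Illustrative only: derive a JSON-schema-like dict from a function signature."""
    hints = get_type_hints(function, include_extras=True)
    properties: dict = {}
    required: list = []
    for name, param in inspect.signature(function).parameters.items():
        if name not in hints:
            raise ValueError(f"Parameter '{name}' has no type hint")
        hint = hints[name]
        prop: dict = {}
        if get_origin(hint) is Annotated:
            # Annotated[T, "text"]: keep T as the type, use the metadata as description.
            hint, *metadata = get_args(hint)
            prop["description"] = str(metadata[0])
        if get_origin(hint) is Literal:
            prop["type"] = "string"  # assumption: literals are strings
            prop["enum"] = list(get_args(hint))
        else:
            prop["type"] = JSON_TYPES.get(get_origin(hint) or hint, "object")
        if param.default is inspect.Parameter.empty:
            required.append(name)
        else:
            prop["default"] = param.default
        properties[name] = prop
    schema: dict = {"type": "object", "properties": properties}
    if required:
        schema["required"] = required
    return schema


def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Munich",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius",
):
    """A simple function to get the current weather for a location."""
    return f"Weather report for {city}: 20 {unit}, sunny"


schema = sketch_schema(get_weather)
print(schema)
```

A function with an unannotated parameter makes this sketch raise `ValueError`, mirroring the documented behavior for missing type hints.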
 306  
 307  ### tool
 308  
 309  ```python
 310  tool(
 311      function: Callable | None = None,
 312      *,
 313      name: str | None = None,
 314      description: str | None = None,
 315      inputs_from_state: dict[str, str] | None = None,
 316      outputs_to_state: dict[str, dict[str, Any]] | None = None,
 317      outputs_to_string: dict[str, Any] | None = None
 318  ) -> Tool | Callable[[Callable], Tool]
 319  ```
 320  
 321  Decorator to convert a function into a Tool.
 322  
Can be used with or without parameters:

- Without parameters: `@tool` placed directly above `def my_function(): ...`
- With parameters: `@tool(name="custom_name")` placed directly above `def my_function(): ...`
 329  
 330  ### Usage example
 331  
```python
from typing import Annotated, Literal
from haystack.tools import tool

@tool
def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Munich",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius"):
    '''A simple function to get the current weather for a location.'''
    return f"Weather report for {city}: 20 {unit}, sunny"

print(get_weather)
# >> Tool(name='get_weather', description='A simple function to get the current weather for a location.',
# >> parameters={
# >> 'type': 'object',
# >> 'properties': {
# >>     'city': {'type': 'string', 'description': 'the city for which to get the weather', 'default': 'Munich'},
# >>     'unit': {
# >>         'type': 'string',
# >>         'enum': ['Celsius', 'Fahrenheit'],
# >>         'description': 'the unit for the temperature',
# >>         'default': 'Celsius',
# >>     },
# >>     }
# >> },
# >> function=<function get_weather at 0x7f7b3a8a9b80>)
```
 359  
 360  **Parameters:**
 361  
- **function** (<code>Callable | None</code>) – The function to decorate (when used without parameters).
- **name** (<code>str | None</code>) – Optional custom name for the tool.
- **description** (<code>str | None</code>) – Optional custom description.
 365  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 366    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 367  - **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided, only the specified output key is sent to the handler.
 369    Example:
 370  
 371  ```python
 372  {
 373      "documents": {"source": "docs", "handler": custom_handler}
 374  }
 375  ```
 376  
If the source is omitted, the whole tool result is sent to the handler.
 378  Example:
 379  
 380  ```python
 381  {
 382      "documents": {"handler": custom_handler}
 383  }
 384  ```
 385  
 386  - **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
 387    If not provided, the tool result is converted to a string using a default handler.
 388  
 389  `outputs_to_string` supports two formats:
 390  
 391  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 392  
 393     ```python
 394     {
 395         "source": "docs", "handler": format_documents, "raw_result": False
 396     }
 397     ```
 398  
 399     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
 400       tool result is sent to the handler.
 401     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 402       final result.
 403     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
 404       if provided. This is intended for tools that return images. In this mode, the Tool function or the
 405       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
 406       Generators.
 407  
 408  1. Multiple output format - map keys to individual configurations:
 409  
 410     ```python
 411     {
 412         "formatted_docs": {"source": "docs", "handler": format_documents},
 413         "summary": {"source": "summary_text", "handler": str.upper}
 414     }
 415     ```
 416  
 417     Each key maps to a dictionary that can contain "source" and/or "handler".
 418     Note that `raw_result` is not supported in the multiple output format.
 419  
 420  **Returns:**
 421  
- <code>Tool | Callable\[\[Callable\], Tool\]</code> – Either a Tool instance or a decorator function that will create one.
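The dual return type follows from a common Python pattern for decorators that work both bare and with keyword arguments. The sketch below shows the pattern with `MiniTool` and `mini_tool`, hypothetical stand-ins for `Tool` and `tool` that carry only a name, a description, and the wrapped function.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MiniTool:
    """Hypothetical stand-in for haystack.tools.Tool, used only in this sketch."""
    name: str
    description: str
    function: Callable


def mini_tool(function: Optional[Callable] = None, *, name: Optional[str] = None):
    """Work both as `@mini_tool` and as `@mini_tool(name=...)`."""

    def decorator(func: Callable) -> MiniTool:
        return MiniTool(name=name or func.__name__, description=func.__doc__ or "", function=func)

    # Bare `@mini_tool`: Python passes the function directly, so wrap it right away.
    if function is not None:
        return decorator(function)
    # `@mini_tool(name=...)`: return the decorator, which Python then applies to the function.
    return decorator


@mini_tool
def ping():
    """Reply with pong."""
    return "pong"


@mini_tool(name="custom_name")
def echo(text: str) -> str:
    """Return the text unchanged."""
    return text


print(ping.name, echo.name)
```

Making every option keyword-only (the `*` in the signature) is what lets the decorator tell the two call styles apart: a positional argument can only be the decorated function.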
 423  
 424  ## pipeline_tool
 425  
 426  ### PipelineTool
 427  
 428  Bases: <code>ComponentTool</code>
 429  
 430  A Tool that wraps Haystack Pipelines, allowing them to be used as tools by LLMs.
 431  
 432  PipelineTool automatically generates LLM-compatible tool schemas from pipeline input sockets,
 433  which are derived from the underlying components in the pipeline.
 434  
 435  Key features:
 436  
 437  - Automatic LLM tool calling schema generation from pipeline inputs
 438  - Description extraction of pipeline inputs based on the underlying component docstrings
 439  
To use PipelineTool, you first need a Haystack pipeline.
Below is an example of creating a PipelineTool.
 442  
### Usage example
 444  
```python
from haystack import Document, Pipeline
from haystack.dataclasses import ChatMessage
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.embedders.sentence_transformers_text_embedder import SentenceTransformersTextEmbedder
from haystack.components.embedders.sentence_transformers_document_embedder import (
    SentenceTransformersDocumentEmbedder
)
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.retrievers import InMemoryEmbeddingRetriever
from haystack.components.agents import Agent
from haystack.tools import PipelineTool

# Initialize a document store and add some documents
document_store = InMemoryDocumentStore()
document_embedder = SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
documents = [
    Document(content="Nikola Tesla was a Serbian-American inventor and electrical engineer."),
    Document(
        content="He is best known for his contributions to the design of the modern alternating current (AC) "
                "electricity supply system."
    ),
]
docs_with_embeddings = document_embedder.run(documents=documents)["documents"]
document_store.write_documents(docs_with_embeddings)

# Build a simple retrieval pipeline
retrieval_pipeline = Pipeline()
retrieval_pipeline.add_component(
    "embedder", SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
)
retrieval_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))

retrieval_pipeline.connect("embedder.embedding", "retriever.query_embedding")

# Wrap the pipeline as a tool
retriever_tool = PipelineTool(
    pipeline=retrieval_pipeline,
    input_mapping={"query": ["embedder.text"]},
    output_mapping={"retriever.documents": "documents"},
    name="document_retriever",
    description="For any questions about Nikola Tesla, always use this tool",
)

# Create an Agent with the tool
agent = Agent(
    chat_generator=OpenAIChatGenerator(model="gpt-4.1-mini"),
    tools=[retriever_tool]
)

# Let the Agent handle a query
result = agent.run([ChatMessage.from_user("Who was Nikola Tesla?")])

# Print result of the tool call
print("Tool Call Result:")
print(result["messages"][2].tool_call_result.result)
print("")

# Print answer
print("Answer:")
print(result["messages"][-1].text)
```
 507  
 508  #### __init__
 509  
 510  ```python
 511  __init__(
 512      pipeline: Pipeline | AsyncPipeline,
 513      *,
 514      name: str,
 515      description: str,
 516      input_mapping: dict[str, list[str]] | None = None,
 517      output_mapping: dict[str, str] | None = None,
 518      parameters: dict[str, Any] | None = None,
 519      outputs_to_string: dict[str, str | Callable[[Any], str]] | None = None,
 520      inputs_from_state: dict[str, str] | None = None,
 521      outputs_to_state: dict[str, dict[str, str | Callable]] | None = None
 522  ) -> None
 523  ```
 524  
 525  Create a Tool instance from a Haystack pipeline.
 526  
 527  **Parameters:**
 528  
 529  - **pipeline** (<code>Pipeline | AsyncPipeline</code>) – The Haystack pipeline to wrap as a tool.
 530  - **name** (<code>str</code>) – Name of the tool.
 531  - **description** (<code>str</code>) – Description of the tool.
 532  - **input_mapping** (<code>dict\[str, list\[str\]\] | None</code>) – A dictionary mapping component input names to pipeline input socket paths.
 533    If not provided, a default input mapping will be created based on all pipeline inputs.
 534    Example:
 535  
 536  ```python
 537  input_mapping={
 538      "query": ["retriever.query", "prompt_builder.query"],
 539  }
 540  ```
 541  
 542  - **output_mapping** (<code>dict\[str, str\] | None</code>) – A dictionary mapping pipeline output socket paths to component output names.
 543    If not provided, a default output mapping will be created based on all pipeline outputs.
 544    Example:
 545  
 546  ```python
 547  output_mapping={
 548      "retriever.documents": "documents",
 549      "generator.replies": "replies",
 550  }
 551  ```
 552  
 553  - **parameters** (<code>dict\[str, Any\] | None</code>) – A JSON schema defining the parameters expected by the Tool.
 554    Will fall back to the parameters defined in the component's run method signature if not provided.
- **outputs_to_string** (<code>dict\[str, str | Callable\[\[Any\], str\]\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
 556    If not provided, the tool result is converted to a string using a default handler.
 557  
 558  `outputs_to_string` supports two formats:
 559  
 560  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 561  
 562     ```python
 563     {
 564         "source": "docs", "handler": format_documents, "raw_result": False
 565     }
 566     ```
 567  
 568     - `source`: If provided, only the specified output key is sent to the handler.
 569     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 570       final result.
 571     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the
 572       `handler` if provided. This is intended for tools that return images. In this mode, the Tool
 573       function or the `handler` function must return a list of `TextContent`/`ImageContent` objects to
 574       ensure compatibility with Chat Generators.
 575  
 576  1. Multiple output format - map keys to individual configurations:
 577  
 578     ```python
 579     {
 580         "formatted_docs": {"source": "docs", "handler": format_documents},
 581         "summary": {"source": "summary_text", "handler": str.upper}
 582     }
 583     ```
 584  
 585     Each key maps to a dictionary that can contain "source" and/or "handler".
 586     Note that `raw_result` is not supported in the multiple output format.
 587  
 588  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 589    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 590  - **outputs_to_state** (<code>dict\[str, dict\[str, str | Callable\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided, only the specified output key is sent to the handler.
 592    Example:
 593  
 594  ```python
 595  {
 596      "documents": {"source": "docs", "handler": custom_handler}
 597  }
 598  ```
 599  
If the source is omitted, the whole tool result is sent to the handler.
 601  Example:
 602  
 603  ```python
 604  {
 605      "documents": {"handler": custom_handler}
 606  }
 607  ```
 608  
 609  **Raises:**
 610  
 611  - <code>ValueError</code> – If the provided pipeline is not a valid Haystack Pipeline instance.
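The `input_mapping` and `output_mapping` examples above can be read as a fan-out and a rename step around the pipeline run. The sketch below illustrates that reading with two hypothetical helpers, `fan_out_inputs` and `collect_outputs`, operating on plain dictionaries. It assumes socket paths have the form `"component.socket"` and is not the code PipelineTool runs internally.

```python
def fan_out_inputs(tool_args: dict, input_mapping: dict) -> dict:
    """Turn flat tool arguments into nested pipeline inputs: {"component": {"socket": value}}."""
    pipeline_inputs: dict = {}
    for arg_name, socket_paths in input_mapping.items():
        if arg_name not in tool_args:
            continue
        for path in socket_paths:
            component, socket = path.split(".", 1)
            pipeline_inputs.setdefault(component, {})[socket] = tool_args[arg_name]
    return pipeline_inputs


def collect_outputs(pipeline_outputs: dict, output_mapping: dict) -> dict:
    """Pick the mapped sockets from the pipeline result and rename them for the tool result."""
    tool_result = {}
    for path, output_name in output_mapping.items():
        component, socket = path.split(".", 1)
        if socket in pipeline_outputs.get(component, {}):
            tool_result[output_name] = pipeline_outputs[component][socket]
    return tool_result


inputs = fan_out_inputs(
    {"query": "Who was Nikola Tesla?"},
    {"query": ["retriever.query", "prompt_builder.query"]},
)
outputs = collect_outputs(
    {"retriever": {"documents": ["doc 1"]}, "generator": {"replies": ["reply 1"]}},
    {"retriever.documents": "documents", "generator.replies": "replies"},
)
print(inputs)
print(outputs)
```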
 612  
 613  #### to_dict
 614  
 615  ```python
 616  to_dict() -> dict[str, Any]
 617  ```
 618  
 619  Serializes the PipelineTool to a dictionary.
 620  
 621  **Returns:**
 622  
 623  - <code>dict\[str, Any\]</code> – The serialized dictionary representation of PipelineTool.
 624  
 625  #### from_dict
 626  
 627  ```python
 628  from_dict(data: dict[str, Any]) -> PipelineTool
 629  ```
 630  
 631  Deserializes the PipelineTool from a dictionary.
 632  
 633  **Parameters:**
 634  
 635  - **data** (<code>dict\[str, Any\]</code>) – The dictionary representation of PipelineTool.
 636  
 637  **Returns:**
 638  
 639  - <code>PipelineTool</code> – The deserialized PipelineTool instance.
 640  
 641  ## searchable_toolset
 642  
 643  ### SearchableToolset
 644  
 645  Bases: <code>Toolset</code>
 646  
 647  Dynamic tool discovery from large catalogs using BM25 search.
 648  
 649  This Toolset enables LLMs to discover and use tools from large catalogs through
 650  BM25-based search. Instead of exposing all tools at once (which can overwhelm the
 651  LLM context), it provides a `search_tools` bootstrap tool that allows the LLM to
 652  find and load specific tools as needed.
 653  
For small catalogs (fewer tools than `search_threshold`), it acts as a simple passthrough,
exposing all tools directly without any discovery mechanism.
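For readers unfamiliar with BM25, the following is a minimal sketch of keyword ranking over tool names and descriptions using the Okapi BM25 formula. Everything in it is an assumption made for illustration: the whitespace tokenizer, the `k1` and `b` values, indexing name plus description, and the hypothetical `bm25_rank` helper. SearchableToolset's own index may be built differently.

```python
import math
from collections import Counter


def tokenize(text: str) -> list:
    """Very simple tokenizer: lowercase, split on whitespace and underscores."""
    return text.lower().replace("_", " ").split()


def bm25_rank(query: str, catalog: dict, top_k: int = 3, k1: float = 1.5, b: float = 0.75) -> list:
    """Return up to `top_k` tool names with a positive BM25 score, best first."""
    tokenized = {name: tokenize(f"{name} {description}") for name, description in catalog.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs
    # Document frequency: in how many tool entries each term occurs.
    doc_freq = Counter(term for tokens in tokenized.values() for term in set(tokens))

    scores = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            tf = counts[term]
            if tf == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(tokens) / avg_len))
        scores[name] = score

    ranked = sorted(scores, key=scores.get, reverse=True)
    return [name for name in ranked if scores[name] > 0][:top_k]


catalog = {
    "get_weather": "Get weather for a city",
    "search_web": "Search the web",
    "send_email": "Send an email to a recipient",
}
matches = bm25_rank("weather forecast city", catalog)
print(matches)
```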
 656  
 657  ### Usage Example
 658  
```python
from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool, SearchableToolset

# Create a catalog of tools
catalog = [
    Tool(name="get_weather", description="Get weather for a city",
         parameters={}, function=lambda: None),
    Tool(name="search_web", description="Search the web",
         parameters={}, function=lambda: None),
    # ... 100s more tools
]
toolset = SearchableToolset(catalog=catalog)

agent = Agent(chat_generator=OpenAIChatGenerator(), tools=toolset)

# The agent is initially provided only with the search_tools tool and will use it to find relevant tools.
result = agent.run(messages=[ChatMessage.from_user("What's the weather in Milan?")])
```
 680  
 681  #### __init__
 682  
 683  ```python
 684  __init__(
 685      catalog: ToolsType,
 686      *,
 687      top_k: int = 3,
 688      search_threshold: int = 8,
 689      search_tool_name: str = "search_tools",
 690      search_tool_description: str | None = None,
 691      search_tool_parameters_description: dict[str, str] | None = None
 692  ) -> None
 693  ```
 694  
 695  Initialize the SearchableToolset.
 696  
 697  **Parameters:**
 698  
 699  - **catalog** (<code>ToolsType</code>) – Source of tools - a list of Tools, list of Toolsets, or a single Toolset.
 700  - **top_k** (<code>int</code>) – Default number of results for search_tools.
- **search_threshold** (<code>int</code>) – Minimum catalog size to activate search.
  If the catalog has fewer tools, the toolset acts as a passthrough (all tools visible).
  Default is 8.
 704  - **search_tool_name** (<code>str</code>) – Custom name for the bootstrap search tool. Default is "search_tools".
 705  - **search_tool_description** (<code>str | None</code>) – Custom description for the bootstrap search tool.
 706    If not provided, uses a default description.
 707  - **search_tool_parameters_description** (<code>dict\[str, str\] | None</code>) – Custom descriptions for the bootstrap search tool's parameters.
 708    Keys must be a subset of `{"tool_keywords", "k"}`.
 709    Example: `{"tool_keywords": "Keywords to find tools, e.g. 'email send'"}`
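The effect of `search_threshold` on what the LLM initially sees can be summarized in a few lines. `exposed_tool_names` is a hypothetical helper written for this example. It assumes that a catalog whose size equals the threshold already activates search, which follows from "minimum catalog size" above but is not verified against the implementation.

```python
def exposed_tool_names(catalog_names: list, search_threshold: int = 8, search_tool_name: str = "search_tools") -> list:
    """Illustrative only: tool names visible to the LLM before any search has run."""
    if len(catalog_names) < search_threshold:
        # Passthrough mode: the whole catalog is exposed directly.
        return list(catalog_names)
    # Search mode: only the bootstrap search tool is exposed at first.
    return [search_tool_name]


small_catalog = ["get_weather", "search_web"]
large_catalog = [f"tool_{i}" for i in range(8)]

print(exposed_tool_names(small_catalog))
print(exposed_tool_names(large_catalog))
```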
 710  
 711  #### add
 712  
 713  ```python
 714  add(tool: Tool | Toolset) -> None
 715  ```
 716  
 717  Adding new tools after initialization is not supported for SearchableToolset.
 718  
 719  #### warm_up
 720  
 721  ```python
 722  warm_up() -> None
 723  ```
 724  
 725  Prepare the toolset for use.
 726  
 727  Warms up child toolsets first (so lazy toolsets like MCPToolset can connect),
 728  then flattens the catalog, indexes it, and creates the search_tools bootstrap tool.
 729  In passthrough mode, it warms up all catalog tools directly.
 730  Must be called before using the toolset with an Agent.
 731  
 732  #### clear
 733  
 734  ```python
 735  clear() -> None
 736  ```
 737  
 738  Clear all discovered tools.
 739  
 740  This method allows resetting the toolset's discovered tools between agent runs
 741  when the same toolset instance is reused. This can be useful for long-running
 742  applications to control memory usage or to start fresh searches.
 743  
 744  #### to_dict
 745  
 746  ```python
 747  to_dict() -> dict[str, Any]
 748  ```
 749  
 750  Serialize the toolset to a dictionary.
 751  
 752  **Returns:**
 753  
 754  - <code>dict\[str, Any\]</code> – Dictionary representation of the toolset.
 755  
 756  #### from_dict
 757  
 758  ```python
 759  from_dict(data: dict[str, Any]) -> SearchableToolset
 760  ```
 761  
 762  Deserialize a toolset from a dictionary.
 763  
 764  **Parameters:**
 765  
 766  - **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the toolset.
 767  
 768  **Returns:**
 769  
 770  - <code>SearchableToolset</code> – New SearchableToolset instance.
 771  
 772  ## tool
 773  
 774  ### Tool
 775  
 776  Data class representing a Tool that Language Models can prepare a call for.
 777  
 778  Accurate definitions of the textual attributes such as `name` and `description`
 779  are important for the Language Model to correctly prepare the call.
 780  
 781  For resource-intensive operations like establishing connections to remote services or
 782  loading models, override the `warm_up()` method. This method is called before the Tool
 783  is used and should be idempotent, as it may be called multiple times during
 784  pipeline/agent setup.
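An idempotent `warm_up()` usually comes down to guarding the expensive step with a flag or a cached resource. The sketch below shows that guard on `LazyConnectionTool`, a hypothetical plain class used so the example runs without Haystack installed. A real tool would subclass `Tool` and override `warm_up()` in the same way.

```python
class LazyConnectionTool:
    """Hypothetical stand-in for a Tool subclass that needs an expensive setup step."""

    def __init__(self) -> None:
        self._client = None
        self.connect_calls = 0  # counts how often the expensive step actually ran

    def _connect(self) -> dict:
        """Placeholder for opening a remote connection or loading a model."""
        self.connect_calls += 1
        return {"connected": True}

    def warm_up(self) -> None:
        """Safe to call repeatedly: the connection is only created on the first call."""
        if self._client is None:
            self._client = self._connect()


lazy_tool = LazyConnectionTool()
lazy_tool.warm_up()
lazy_tool.warm_up()
print(lazy_tool.connect_calls)
```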
 785  
 786  **Parameters:**
 787  
 788  - **name** (<code>str</code>) – Name of the Tool.
 789  - **description** (<code>str</code>) – Description of the Tool.
 790  - **parameters** (<code>dict\[str, Any\]</code>) – A JSON schema defining the parameters expected by the Tool.
 791  - **function** (<code>Callable</code>) – The function that will be invoked when the Tool is called.
 792    Must be a synchronous function; async functions are not supported.
 793  - **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
 794    If not provided, the tool result is converted to a string using a default handler.

  `outputs_to_string` supports two formats:

  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:

     ```python
     {
         "source": "docs", "handler": format_documents, "raw_result": False
     }
     ```

     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
       tool result is sent to the handler.
     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
       final result.
     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
       if provided. This is intended for tools that return images. In this mode, the Tool function or the
       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
       Generators.

  2. Multiple output format - map keys to individual configurations:

     ```python
     {
         "formatted_docs": {"source": "docs", "handler": format_documents},
         "summary": {"source": "summary_text", "handler": str.upper}
     }
     ```

     Each key maps to a dictionary that can contain "source" and/or "handler".
     Note that `raw_result` is not supported in the multiple output format.

- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided, only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted, the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

**Raises:**

- <code>ValueError</code> – If `function` is async, if `parameters` is not a valid JSON schema, or if the
  `outputs_to_state`, `outputs_to_string`, or `inputs_from_state` configurations are invalid.
- <code>TypeError</code> – If any configuration value in `outputs_to_state`, `outputs_to_string`, or
  `inputs_from_state` has the wrong type.
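
The `format_documents` handler used in the `outputs_to_string` examples is not defined in this reference. The sketch below shows one hypothetical handler and reads the single output format as described above: the `source` key is extracted first, then the `handler` is applied. The helper `apply_single_output_config`, the sample result, and the fallback to `str` are assumptions made for illustration; they do not reproduce Haystack's internal default handler.

```python
from typing import Any, Callable


def format_documents(docs: list[dict[str, Any]]) -> str:
    """Hypothetical handler: render each document as one numbered line."""
    return "\n".join(f"{i}. {doc['content']}" for i, doc in enumerate(docs, start=1))


def apply_single_output_config(result: Any, config: dict[str, Any]) -> Any:
    """Illustrative helper (not part of Haystack): apply `source`, then `handler`."""
    # With "source", only that output key is passed on; otherwise the whole result is.
    value = result[config["source"]] if "source" in config else result
    # Falling back to `str` is an assumption standing in for the default handler.
    handler: Callable[[Any], Any] = config.get("handler", str)
    return handler(value)


# Sample tool result with two output keys; only "docs" reaches the handler.
tool_result = {
    "docs": [{"content": "Haystack is a framework."}, {"content": "Tools wrap functions."}],
    "summary_text": "two documents",
}
config = {"source": "docs", "handler": format_documents}

formatted = apply_single_output_config(tool_result, config)
print(formatted)
```

Keeping the handler a plain single-argument function makes it easy to test separately from the Tool that uses it.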

#### tool_spec

```python
tool_spec: dict[str, Any]
```

Return the Tool specification to be used by the Language Model.

#### warm_up

```python
warm_up() -> None
```

Prepare the Tool for use.

Override this method to establish connections to remote services, load models,
or perform other resource-intensive initialization. This method should be idempotent,
as it may be called multiple times.
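
One common way to keep `warm_up()` idempotent is to guard the expensive step so that repeated calls return early. The sketch below uses a plain stand-in class instead of a real `Tool` subclass so that it runs without Haystack installed; `LazyResourceTool`, `_load_resource`, and `load_count` are hypothetical names.

```python
class LazyResourceTool:
    """Stand-in for a Tool subclass; only the warm-up guard is of interest here."""

    def __init__(self) -> None:
        self._resource = None
        self.load_count = 0  # counts how often the expensive step actually runs

    def _load_resource(self) -> dict[str, str]:
        # Placeholder for opening a connection or loading a model.
        self.load_count += 1
        return {"status": "ready"}

    def warm_up(self) -> None:
        # Return early when already initialized, so repeated calls are harmless.
        if self._resource is not None:
            return
        self._resource = self._load_resource()


tool = LazyResourceTool()
tool.warm_up()
tool.warm_up()  # the second call does not reload the resource
print(tool.load_count)
```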

#### invoke

```python
invoke(**kwargs: Any) -> Any
```

Invoke the Tool with the provided keyword arguments.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the Tool to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary with serialized data.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> Tool
```

Deserializes the Tool from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.

**Returns:**

- <code>Tool</code> – Deserialized Tool.

## toolset

### Toolset

A collection of related Tools that can be used and managed as a cohesive unit.

Toolset serves two main purposes:

1. Group related tools together:
   Toolset allows you to organize related tools into a single collection, making it easier
   to manage and use them as a unit in Haystack pipelines.

   Example:

```python
from haystack.tools import Tool, Toolset
from haystack.components.tools import ToolInvoker

# Define math functions
def add_numbers(a: int, b: int) -> int:
    return a + b

def subtract_numbers(a: int, b: int) -> int:
    return a - b

# Create tools with proper schemas
add_tool = Tool(
    name="add",
    description="Add two numbers",
    parameters={
        "type": "object",
        "properties": {
            "a": {"type": "integer"},
            "b": {"type": "integer"}
        },
        "required": ["a", "b"]
    },
    function=add_numbers
)

subtract_tool = Tool(
    name="subtract",
    description="Subtract b from a",
    parameters={
        "type": "object",
        "properties": {
            "a": {"type": "integer"},
            "b": {"type": "integer"}
        },
        "required": ["a", "b"]
    },
    function=subtract_numbers
)

# Create a toolset with the math tools
math_toolset = Toolset([add_tool, subtract_tool])

# Use the toolset with a ToolInvoker or ChatGenerator component
invoker = ToolInvoker(tools=math_toolset)
```

2. Base class for dynamic tool loading:
   By subclassing Toolset, you can create implementations that dynamically load tools
   from external sources like OpenAPI URLs, MCP servers, or other resources.

   Example:

```python
from haystack.core.serialization import generate_qualified_class_name
from haystack.tools import Tool, Toolset
from haystack.components.tools import ToolInvoker

class CalculatorToolset(Toolset):
    '''A toolset for calculator operations.'''

    def __init__(self) -> None:
        tools = self._create_tools()
        super().__init__(tools)

    def _create_tools(self):
        # These Tool instances are defined statically for illustration purposes only.
        # In a real-world scenario, you would dynamically load tools from an external source here.
        tools = []
        add_tool = Tool(
            name="add",
            description="Add two numbers",
            parameters={
                "type": "object",
                "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
                "required": ["a", "b"],
            },
            function=lambda a, b: a + b,
        )

        multiply_tool = Tool(
            name="multiply",
            description="Multiply two numbers",
            parameters={
                "type": "object",
                "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
                "required": ["a", "b"],
            },
            function=lambda a, b: a * b,
        )

        tools.append(add_tool)
        tools.append(multiply_tool)

        return tools

    def to_dict(self):
        return {
            "type": generate_qualified_class_name(type(self)),
            "data": {},  # no data to serialize because the tools are recreated in __init__
        }

    @classmethod
    def from_dict(cls, data):
        return cls()  # Recreate the tools dynamically during deserialization

# Create the dynamic toolset and use it with ToolInvoker
calculator_toolset = CalculatorToolset()
invoker = ToolInvoker(tools=calculator_toolset)
```

Toolset implements the collection interface (`__iter__`, `__contains__`, `__len__`, `__getitem__`),
making it behave like a list of Tools. This makes it compatible with components that expect
iterable tools, such as ToolInvoker or Haystack chat generators.

When implementing a custom Toolset subclass for dynamic tool loading:

- Perform the dynamic loading in the `__init__` method
- Override the `to_dict()` and `from_dict()` methods if your tools are defined dynamically
- Serialize endpoint descriptors rather than tool instances if your tools
  are loaded from external sources

#### warm_up

```python
warm_up() -> None
```

Prepare the Toolset for use.

By default, this method iterates through and warms up all tools in the Toolset.
Subclasses can override this method to customize initialization behavior, such as:

- Setting up shared resources (database connections, HTTP sessions) instead of
  warming individual tools
- Implementing custom initialization logic for dynamically loaded tools
- Controlling when and how tools are initialized

For example, a Toolset that manages tools from an external service (like MCPToolset)
might override this to initialize a shared connection rather than warming up
individual tools:

```python
class MCPToolset(Toolset):
    def warm_up(self) -> None:
        # Only warm up the shared MCP connection, not individual tools.
        # `establish_connection` and `server_url` are illustrative placeholders.
        if getattr(self, "mcp_connection", None) is None:  # keeps warm_up idempotent
            self.mcp_connection = establish_connection(self.server_url)
```

This method should be idempotent, as it may be called multiple times.

#### add

```python
add(tool: Union[Tool, Toolset]) -> None
```

Add a new Tool or merge another Toolset.

**Parameters:**

- **tool** (<code>Union\[Tool, Toolset\]</code>) – A Tool instance or another Toolset to add

**Raises:**

- <code>ValueError</code> – If adding the tool would result in duplicate tool names
- <code>TypeError</code> – If the provided object is not a Tool or Toolset

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serialize the Toolset to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary representation of the Toolset

Note for subclass implementers:
The default implementation suits Toolsets whose Tools are defined statically. If your subclass resolves Tool
instances dynamically from an external source, such as an MCP server, an OpenAPI URL, or a local OpenAPI
specification, consider serializing the endpoint descriptor instead of the Tool instances themselves. This
preserves the dynamic nature of the Toolset and avoids the overhead of serializing potentially large
collections of Tool objects. It also lets deserialization rebuild the Tool instances from the source, so
Tools that were modified or removed since the last serialization are not restored from stale data. Without
the descriptor, outdated or incorrect Tool configurations may be loaded, causing errors or
unexpected behavior.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> Toolset
```

Deserialize a Toolset from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the Toolset

**Returns:**

- <code>Toolset</code> – A new Toolset instance