   1  ---
   2  title: "Tools"
   3  id: tools-api
   4  description: "Unified abstractions to represent tools across the framework."
   5  slug: "/tools-api"
   6  ---
   7  
   8  
   9  ## component_tool
  10  
  11  ### ComponentTool
  12  
  13  Bases: <code>Tool</code>
  14  
  15  A Tool that wraps Haystack components, allowing them to be used as tools by LLMs.
  16  
  17  ComponentTool automatically generates LLM-compatible tool schemas from component input sockets,
  18  which are derived from the component's `run` method signature and type hints.
  19  
  20  Key features:
  21  
  22  - Automatic LLM tool calling schema generation from component input sockets
  23  - Type conversion and validation for component inputs
  24  - Support for types:
  25    - Dataclasses
  26    - Lists of dataclasses
  27    - Basic types (str, int, float, bool, dict)
  28    - Lists of basic types
  29  - Automatic name generation from component class name
  30  - Description extraction from component docstrings
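To make the schema generation step concrete, here is a minimal sketch of deriving a JSON-schema-like dictionary from a `run` method's signature and type hints. It is an illustration only, not Haystack's implementation: `build_schema`, `PY_TO_JSON`, and the `Greeter` class are hypothetical names, and only basic types are handled.

```python
import inspect
from typing import Any, get_type_hints

# Hypothetical mapping from basic Python types to JSON schema type names.
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean", dict: "object"}


def build_schema(func) -> dict[str, Any]:
    """Build a JSON-schema-like dict from a callable's signature and type hints."""
    hints = get_type_hints(func)
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(func).parameters.items():
        if name == "self":
            continue
        properties[name] = {"type": PY_TO_JSON[hints[name]]}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value, so the caller must supply it
        else:
            properties[name]["default"] = param.default
    return {"type": "object", "properties": properties, "required": required}


class Greeter:
    """Stand-in for a component; only its `run` signature matters here."""

    def run(self, name: str, times: int = 1) -> dict[str, str]:
        return {"greeting": ", ".join([f"Hello {name}"] * times)}


schema = build_schema(Greeter.run)
print(schema)
```

Parameters without a default end up in `required`, while parameters with a default carry it in the schema.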
  31  
  32  To use ComponentTool, you first need a Haystack component - either an existing one or a new one you create.
  33  You can create a ComponentTool from the component by passing the component to the ComponentTool constructor.
  34  Below is an example of creating a ComponentTool from an existing SerperDevWebSearch component.
  35  
### Usage example
  37  
  38  ```python
  39  from haystack import component, Pipeline
  40  from haystack.tools import ComponentTool
  41  from haystack.components.websearch import SerperDevWebSearch
  42  from haystack.utils import Secret
  43  from haystack.components.tools.tool_invoker import ToolInvoker
  44  from haystack.components.generators.chat import OpenAIChatGenerator
  45  from haystack.dataclasses import ChatMessage
  46  
  47  # Create a SerperDev search component
  48  search = SerperDevWebSearch(api_key=Secret.from_env_var("SERPERDEV_API_KEY"), top_k=3)
  49  
  50  # Create a tool from the component
  51  tool = ComponentTool(
  52      component=search,
  53      name="web_search",  # Optional: defaults to "serper_dev_web_search"
  54      description="Search the web for current information on any topic"  # Optional: defaults to component docstring
  55  )
  56  
  57  # Create pipeline with OpenAIChatGenerator and ToolInvoker
  58  pipeline = Pipeline()
  59  pipeline.add_component("llm", OpenAIChatGenerator(tools=[tool]))
  60  pipeline.add_component("tool_invoker", ToolInvoker(tools=[tool]))
  61  
  62  # Connect components
  63  pipeline.connect("llm.replies", "tool_invoker.messages")
  64  
  65  message = ChatMessage.from_user("Use the web search tool to find information about Nikola Tesla")
  66  
  67  # Run pipeline
  68  result = pipeline.run({"llm": {"messages": [message]}})
  69  
  70  print(result)
  71  ```
  72  
  73  #### __init__
  74  
  75  ```python
  76  __init__(
  77      component: Component,
  78      name: str | None = None,
  79      description: str | None = None,
  80      parameters: dict[str, Any] | None = None,
  81      *,
  82      outputs_to_string: dict[str, str | Callable[[Any], str]] | None = None,
  83      inputs_from_state: dict[str, str] | None = None,
  84      outputs_to_state: dict[str, dict[str, str | Callable]] | None = None
  85  ) -> None
  86  ```
  87  
  88  Create a Tool instance from a Haystack component.
  89  
  90  **Parameters:**
  91  
  92  - **component** (<code>Component</code>) – The Haystack component to wrap as a tool.
  93  - **name** (<code>str | None</code>) – Optional name for the tool (defaults to snake_case of component class name).
  94  - **description** (<code>str | None</code>) – Optional description (defaults to component's docstring).
  95  - **parameters** (<code>dict\[str, Any\] | None</code>) – A JSON schema defining the parameters expected by the Tool.
  96    Will fall back to the parameters defined in the component's run method signature if not provided.
- **outputs_to_string** (<code>dict\[str, str | Callable\[\[Any\], str\]\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  98    If not provided, the tool result is converted to a string using a default handler.
  99  
 100  `outputs_to_string` supports two formats:
 101  
 102  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 103  
 104     ```python
 105     {
 106         "source": "docs", "handler": format_documents, "raw_result": False
 107     }
 108     ```
 109  
   - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
     tool result is sent to the handler.
 111     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 112       final result.
 113     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the
 114       `handler` if provided. This is intended for tools that return images. In this mode, the Tool
 115       function or the `handler` function must return a list of `TextContent`/`ImageContent` objects to
 116       ensure compatibility with Chat Generators.
 117  
 118  1. Multiple output format - map keys to individual configurations:
 119  
 120     ```python
 121     {
 122         "formatted_docs": {"source": "docs", "handler": format_documents},
 123         "summary": {"source": "summary_text", "handler": str.upper}
 124     }
 125     ```
 126  
 127     Each key maps to a dictionary that can contain "source" and/or "handler".
 128     Note that `raw_result` is not supported in the multiple output format.
 129  
 130  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 131    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 132  - **outputs_to_state** (<code>dict\[str, dict\[str, str | Callable\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If `source` is provided, only the specified output key is sent to the handler.
 134    Example:
 135  
 136  ```python
 137  {
 138      "documents": {"source": "docs", "handler": custom_handler}
 139  }
 140  ```
 141  
If `source` is omitted, the whole tool result is sent to the handler.
 143  Example:
 144  
 145  ```python
 146  {
 147      "documents": {"handler": custom_handler}
 148  }
 149  ```
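As a rough sketch of the `outputs_to_state` semantics described above, the following plain-Python function writes tool outputs into a state dictionary. It is not Haystack's internal code: `update_state` and `append_documents` are hypothetical, state is modeled as a plain `dict`, and the handler signature `(current_value, new_value)` is an assumption of this sketch.

```python
from typing import Any, Callable, Optional


def append_documents(current: Optional[list], new: list) -> list:
    """Hypothetical handler: append new documents to those already in state."""
    return (current or []) + new


def update_state(state: dict[str, Any], result: dict[str, Any], outputs_to_state: dict[str, dict]) -> None:
    """Write tool outputs into `state` following an `outputs_to_state` mapping."""
    for state_key, config in outputs_to_state.items():
        # With "source", only that output key is used; otherwise the whole result.
        value = result[config["source"]] if "source" in config else result
        handler: Optional[Callable] = config.get("handler")
        # Assumption: the handler merges the current state value with the new one.
        state[state_key] = handler(state.get(state_key), value) if handler else value


state: dict[str, Any] = {}
mapping = {"documents": {"source": "docs", "handler": append_documents}}
update_state(state, {"docs": ["first"], "count": 1}, mapping)
update_state(state, {"docs": ["second"], "count": 1}, mapping)
print(state)
```

Because `source` is set to `"docs"`, the `count` output never reaches the handler or the state.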
 150  
 151  **Raises:**
 152  
 153  - <code>TypeError</code> – If the object passed is not a Haystack Component instance.
 154  - <code>ValueError</code> – If the component has already been added to a pipeline, or if schema generation fails.
 155  
 156  #### warm_up
 157  
 158  ```python
 159  warm_up()
 160  ```
 161  
 162  Prepare the ComponentTool for use.
 163  
 164  #### to_dict
 165  
 166  ```python
 167  to_dict() -> dict[str, Any]
 168  ```
 169  
 170  Serializes the ComponentTool to a dictionary.
 171  
 172  #### from_dict
 173  
 174  ```python
 175  from_dict(data: dict[str, Any]) -> ComponentTool
 176  ```
 177  
 178  Deserializes the ComponentTool from a dictionary.
 179  
 180  ## from_function
 181  
 182  ### create_tool_from_function
 183  
 184  ```python
 185  create_tool_from_function(
 186      function: Callable,
 187      name: str | None = None,
 188      description: str | None = None,
 189      inputs_from_state: dict[str, str] | None = None,
 190      outputs_to_state: dict[str, dict[str, Any]] | None = None,
 191      outputs_to_string: dict[str, Any] | None = None,
 192  ) -> Tool
 193  ```
 194  
 195  Create a Tool instance from a function.
 196  
 197  Allows customizing the Tool name and description.
 198  For simpler use cases, consider using the `@tool` decorator.
 199  
 200  ### Usage example
 201  
 202  ```python
 203  from typing import Annotated, Literal
 204  from haystack.tools import create_tool_from_function
 205  
 206  def get_weather(
 207      city: Annotated[str, "the city for which to get the weather"] = "Munich",
 208      unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius"):
 209      '''A simple function to get the current weather for a location.'''
 210      return f"Weather report for {city}: 20 {unit}, sunny"
 211  
 212  tool = create_tool_from_function(get_weather)
 213  
print(tool)
# Tool(name='get_weather', description='A simple function to get the current weather for a location.',
# parameters={
# 'type': 'object',
# 'properties': {
#     'city': {'type': 'string', 'description': 'the city for which to get the weather', 'default': 'Munich'},
#     'unit': {
#         'type': 'string',
#         'enum': ['Celsius', 'Fahrenheit'],
#         'description': 'the unit for the temperature',
#         'default': 'Celsius',
#     },
#     }
# },
# function=<function get_weather at 0x7f7b3a8a9b80>)
 229  ```
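The descriptions and `enum` values in the schema above come from the `Annotated` and `Literal` type hints. As a minimal sketch of how such metadata can be read with the standard library (an illustration, not Haystack's implementation; `describe_parameters` is a hypothetical helper):

```python
from typing import Annotated, Literal, get_args, get_origin, get_type_hints


def describe_parameters(func) -> dict[str, dict]:
    """Collect the description and allowed values of each annotated parameter."""
    described = {}
    # include_extras=True keeps the Annotated wrapper instead of stripping it.
    for name, hint in get_type_hints(func, include_extras=True).items():
        if name == "return":
            continue
        info: dict = {}
        if get_origin(hint) is Annotated:
            base, *metadata = get_args(hint)
            info["description"] = metadata[0]
            hint = base
        if get_origin(hint) is Literal:
            info["enum"] = list(get_args(hint))
        described[name] = info
    return described


def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Munich",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius",
):
    return f"Weather report for {city}: 20 {unit}, sunny"


parameter_info = describe_parameters(get_weather)
print(parameter_info)
```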
 230  
 231  **Parameters:**
 232  
 233  - **function** (<code>Callable</code>) – The function to be converted into a Tool.
 234    The function must include type hints for all parameters.
 235    The function is expected to have basic python input types (str, int, float, bool, list, dict, tuple).
 236    Other input types may work but are not guaranteed.
 237    If a parameter is annotated using `typing.Annotated`, its metadata will be used as parameter description.
 238  - **name** (<code>str | None</code>) – The name of the Tool. If not provided, the name of the function will be used.
 239  - **description** (<code>str | None</code>) – The description of the Tool. If not provided, the docstring of the function will be used.
 240    To intentionally leave the description empty, pass an empty string.
 241  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 242    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 243  - **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If `source` is provided, only the specified output key is sent to the handler.
 245    Example:
 246  
 247  ```python
 248  {
 249      "documents": {"source": "docs", "handler": custom_handler}
 250  }
 251  ```
 252  
If `source` is omitted, the whole tool result is sent to the handler.
 254  Example:
 255  
 256  ```python
 257  {
 258      "documents": {"handler": custom_handler}
 259  }
 260  ```
 261  
 262  - **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
 263    If not provided, the tool result is converted to a string using a default handler.
 264  
 265  `outputs_to_string` supports two formats:
 266  
 267  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 268  
 269     ```python
 270     {
 271         "source": "docs", "handler": format_documents, "raw_result": False
 272     }
 273     ```
 274  
 275     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
 276       tool result is sent to the handler.
 277     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 278       final result.
 279     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
 280       if provided. This is intended for tools that return images. In this mode, the Tool function or the
 281       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
 282       Generators.
 283  
 284  1. Multiple output format - map keys to individual configurations:
 285  
 286     ```python
 287     {
 288         "formatted_docs": {"source": "docs", "handler": format_documents},
 289         "summary": {"source": "summary_text", "handler": str.upper}
 290     }
 291     ```
 292  
 293     Each key maps to a dictionary that can contain "source" and/or "handler".
 294     Note that `raw_result` is not supported in the multiple output format.
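As a minimal sketch of the two `outputs_to_string` formats, the following plain-Python function mimics the semantics described above. It is not Haystack's internal code: `convert_outputs` and `format_documents` are hypothetical, `raw_result` is left out, and since this reference does not say how multiple outputs are combined, the sketch simply returns a dictionary of strings.

```python
from typing import Any, Callable


def format_documents(docs: list[str]) -> str:
    """Hypothetical handler: join documents into a bulleted string."""
    return "\n".join(f"- {doc}" for doc in docs)


def _apply(result: dict[str, Any], config: dict[str, Any]) -> str:
    """Apply one {"source": ..., "handler": ...} configuration to a tool result."""
    value = result[config["source"]] if "source" in config else result
    handler: Callable[[Any], Any] = config.get("handler", str)
    return handler(value)


def convert_outputs(result: dict[str, Any], outputs_to_string: dict[str, Any]) -> Any:
    """Mimic the single and multiple output formats of `outputs_to_string`."""
    single_keys = {"source", "handler", "raw_result"}
    if single_keys.intersection(outputs_to_string):
        return _apply(result, outputs_to_string)  # single output format
    # Multiple output format: each key has its own configuration.
    return {key: _apply(result, cfg) for key, cfg in outputs_to_string.items()}


tool_result = {"docs": ["first", "second"], "summary_text": "two documents"}

single = convert_outputs(tool_result, {"source": "docs", "handler": format_documents})
multiple = convert_outputs(
    tool_result,
    {
        "formatted_docs": {"source": "docs", "handler": format_documents},
        "summary": {"source": "summary_text", "handler": str.upper},
    },
)
print(single)
print(multiple)
```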
 295  
 296  **Returns:**
 297  
 298  - <code>Tool</code> – The Tool created from the function.
 299  
 300  **Raises:**
 301  
 302  - <code>ValueError</code> – If any parameter of the function lacks a type hint.
 303  - <code>SchemaGenerationError</code> – If there is an error generating the JSON schema for the Tool.
 304  
 305  ### tool
 306  
 307  ```python
 308  tool(
 309      function: Callable | None = None,
 310      *,
 311      name: str | None = None,
 312      description: str | None = None,
 313      inputs_from_state: dict[str, str] | None = None,
 314      outputs_to_state: dict[str, dict[str, Any]] | None = None,
 315      outputs_to_string: dict[str, Any] | None = None
 316  ) -> Tool | Callable[[Callable], Tool]
 317  ```
 318  
 319  Decorator to convert a function into a Tool.
 320  
Can be used with or without parameters: apply it bare as `@tool`, or call it with keyword arguments,
for example `@tool(name="custom_name")`.
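The dual calling convention follows a common Python pattern: the decorator checks whether it received the function directly or only keyword arguments. The sketch below illustrates that pattern with hypothetical stand-ins (`SimpleTool`, `simple_tool`); it is not Haystack's implementation of `@tool`.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a Tool: just a name, a description, and a function."""

    name: str
    description: str
    function: Callable


def simple_tool(
    function: Optional[Callable] = None, *, name: Optional[str] = None, description: Optional[str] = None
) -> Union[SimpleTool, Callable[[Callable], SimpleTool]]:
    """Decorator usable both bare (`@simple_tool`) and with keyword arguments."""

    def wrap(func: Callable) -> SimpleTool:
        return SimpleTool(
            name=name or func.__name__,
            description=description if description is not None else (func.__doc__ or ""),
            function=func,
        )

    if function is not None:
        return wrap(function)  # bare usage: Python passes the function directly
    return wrap  # parameterized usage: return the real decorator


@simple_tool
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


@simple_tool(name="custom_name")
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b


print(add.name, multiply.name)
```

Making every option keyword-only (the `*` in the signature) is what lets the decorator tell the two usages apart.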
 327  
 328  ### Usage example
 329  
 330  ```python
 331  from typing import Annotated, Literal
 332  from haystack.tools import tool
 333  
 334  @tool
 335  def get_weather(
 336      city: Annotated[str, "the city for which to get the weather"] = "Munich",
 337      unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius"):
 338      '''A simple function to get the current weather for a location.'''
 339      return f"Weather report for {city}: 20 {unit}, sunny"
 340  
print(get_weather)
# Tool(name='get_weather', description='A simple function to get the current weather for a location.',
# parameters={
# 'type': 'object',
# 'properties': {
#     'city': {'type': 'string', 'description': 'the city for which to get the weather', 'default': 'Munich'},
#     'unit': {
#         'type': 'string',
#         'enum': ['Celsius', 'Fahrenheit'],
#         'description': 'the unit for the temperature',
#         'default': 'Celsius',
#     },
#     }
# },
# function=<function get_weather at 0x7f7b3a8a9b80>)
 356  ```
 357  
 358  **Parameters:**
 359  
 360  - **function** (<code>Callable | None</code>) – The function to decorate (when used without parameters)
 361  - **name** (<code>str | None</code>) – Optional custom name for the tool
 362  - **description** (<code>str | None</code>) – Optional custom description
 363  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 364    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 365  - **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If `source` is provided, only the specified output key is sent to the handler.
 367    Example:
 368  
 369  ```python
 370  {
 371      "documents": {"source": "docs", "handler": custom_handler}
 372  }
 373  ```
 374  
If `source` is omitted, the whole tool result is sent to the handler.
 376  Example:
 377  
 378  ```python
 379  {
 380      "documents": {"handler": custom_handler}
 381  }
 382  ```
 383  
 384  - **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
 385    If not provided, the tool result is converted to a string using a default handler.
 386  
 387  `outputs_to_string` supports two formats:
 388  
 389  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 390  
 391     ```python
 392     {
 393         "source": "docs", "handler": format_documents, "raw_result": False
 394     }
 395     ```
 396  
 397     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
 398       tool result is sent to the handler.
 399     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 400       final result.
 401     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
 402       if provided. This is intended for tools that return images. In this mode, the Tool function or the
 403       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
 404       Generators.
 405  
 406  1. Multiple output format - map keys to individual configurations:
 407  
 408     ```python
 409     {
 410         "formatted_docs": {"source": "docs", "handler": format_documents},
 411         "summary": {"source": "summary_text", "handler": str.upper}
 412     }
 413     ```
 414  
 415     Each key maps to a dictionary that can contain "source" and/or "handler".
 416     Note that `raw_result` is not supported in the multiple output format.
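The `inputs_from_state` mapping described above can be pictured as a small merge step before the tool is invoked. The sketch below uses a plain `dict` as state and a hypothetical helper, `merge_state_inputs`. That arguments supplied by the LLM take precedence over state values is an assumption of this sketch, not something this reference states.

```python
from typing import Any


def merge_state_inputs(
    llm_arguments: dict[str, Any], state: dict[str, Any], inputs_from_state: dict[str, str]
) -> dict[str, Any]:
    """Fill tool parameters from state using a {state_key: parameter_name} mapping."""
    arguments = dict(llm_arguments)
    for state_key, parameter_name in inputs_from_state.items():
        # Assumption of this sketch: arguments supplied by the LLM take precedence.
        if state_key in state and parameter_name not in arguments:
            arguments[parameter_name] = state[state_key]
    return arguments


state = {"repository": "my-org/my-repo"}
arguments = merge_state_inputs({"issue": 42}, state, {"repository": "repo"})
print(arguments)
```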
 417  
 418  **Returns:**
 419  
- <code>Tool | Callable\[\[Callable\], Tool\]</code> – Either a Tool instance or a decorator function that will create one.
 421  
 422  ## pipeline_tool
 423  
 424  ### PipelineTool
 425  
 426  Bases: <code>ComponentTool</code>
 427  
 428  A Tool that wraps Haystack Pipelines, allowing them to be used as tools by LLMs.
 429  
 430  PipelineTool automatically generates LLM-compatible tool schemas from pipeline input sockets,
 431  which are derived from the underlying components in the pipeline.
 432  
 433  Key features:
 434  
 435  - Automatic LLM tool calling schema generation from pipeline inputs
 436  - Description extraction of pipeline inputs based on the underlying component docstrings
 437  
To use PipelineTool, you first need a Haystack pipeline.
Below is an example of creating a PipelineTool from a retrieval pipeline and using it with an Agent.
 440  
### Usage example
 442  
 443  ```python
 444  from haystack import Document, Pipeline
 445  from haystack.dataclasses import ChatMessage
 446  from haystack.document_stores.in_memory import InMemoryDocumentStore
 447  from haystack.components.embedders.sentence_transformers_text_embedder import SentenceTransformersTextEmbedder
 448  from haystack.components.embedders.sentence_transformers_document_embedder import (
 449      SentenceTransformersDocumentEmbedder
 450  )
 451  from haystack.components.generators.chat import OpenAIChatGenerator
 452  from haystack.components.retrievers import InMemoryEmbeddingRetriever
 453  from haystack.components.agents import Agent
 454  from haystack.tools import PipelineTool
 455  
 456  # Initialize a document store and add some documents
 457  document_store = InMemoryDocumentStore()
 458  document_embedder = SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
 459  documents = [
 460      Document(content="Nikola Tesla was a Serbian-American inventor and electrical engineer."),
 461      Document(
 462          content="He is best known for his contributions to the design of the modern alternating current (AC) "
 463                  "electricity supply system."
 464      ),
 465  ]
 466  docs_with_embeddings = document_embedder.run(documents=documents)["documents"]
 467  document_store.write_documents(docs_with_embeddings)
 468  
 469  # Build a simple retrieval pipeline
 470  retrieval_pipeline = Pipeline()
 471  retrieval_pipeline.add_component(
 472      "embedder", SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
 473  )
 474  retrieval_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
 475  
 476  retrieval_pipeline.connect("embedder.embedding", "retriever.query_embedding")
 477  
 478  # Wrap the pipeline as a tool
 479  retriever_tool = PipelineTool(
 480      pipeline=retrieval_pipeline,
 481      input_mapping={"query": ["embedder.text"]},
 482      output_mapping={"retriever.documents": "documents"},
 483      name="document_retriever",
 484      description="For any questions about Nikola Tesla, always use this tool",
 485  )
 486  
 487  # Create an Agent with the tool
 488  agent = Agent(
 489      chat_generator=OpenAIChatGenerator(model="gpt-4.1-mini"),
 490      tools=[retriever_tool]
 491  )
 492  
 493  # Let the Agent handle a query
 494  result = agent.run([ChatMessage.from_user("Who was Nikola Tesla?")])
 495  
 496  # Print result of the tool call
 497  print("Tool Call Result:")
 498  print(result["messages"][2].tool_call_result.result)
 499  print("")
 500  
 501  # Print answer
 502  print("Answer:")
 503  print(result["messages"][-1].text)
 504  ```
 505  
 506  #### __init__
 507  
 508  ```python
 509  __init__(
 510      pipeline: Pipeline | AsyncPipeline,
 511      *,
 512      name: str,
 513      description: str,
 514      input_mapping: dict[str, list[str]] | None = None,
 515      output_mapping: dict[str, str] | None = None,
 516      parameters: dict[str, Any] | None = None,
 517      outputs_to_string: dict[str, str | Callable[[Any], str]] | None = None,
 518      inputs_from_state: dict[str, str] | None = None,
 519      outputs_to_state: dict[str, dict[str, str | Callable]] | None = None
 520  ) -> None
 521  ```
 522  
 523  Create a Tool instance from a Haystack pipeline.
 524  
 525  **Parameters:**
 526  
 527  - **pipeline** (<code>Pipeline | AsyncPipeline</code>) – The Haystack pipeline to wrap as a tool.
 528  - **name** (<code>str</code>) – Name of the tool.
 529  - **description** (<code>str</code>) – Description of the tool.
 530  - **input_mapping** (<code>dict\[str, list\[str\]\] | None</code>) – A dictionary mapping component input names to pipeline input socket paths.
 531    If not provided, a default input mapping will be created based on all pipeline inputs.
 532    Example:
 533  
 534  ```python
 535  input_mapping={
 536      "query": ["retriever.query", "prompt_builder.query"],
 537  }
 538  ```
 539  
 540  - **output_mapping** (<code>dict\[str, str\] | None</code>) – A dictionary mapping pipeline output socket paths to component output names.
 541    If not provided, a default output mapping will be created based on all pipeline outputs.
 542    Example:
 543  
 544  ```python
 545  output_mapping={
 546      "retriever.documents": "documents",
 547      "generator.replies": "replies",
 548  }
 549  ```
 550  
 551  - **parameters** (<code>dict\[str, Any\] | None</code>) – A JSON schema defining the parameters expected by the Tool.
 552    Will fall back to the parameters defined in the component's run method signature if not provided.
- **outputs_to_string** (<code>dict\[str, str | Callable\[\[Any\], str\]\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
 554    If not provided, the tool result is converted to a string using a default handler.
 555  
 556  `outputs_to_string` supports two formats:
 557  
 558  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 559  
 560     ```python
 561     {
 562         "source": "docs", "handler": format_documents, "raw_result": False
 563     }
 564     ```
 565  
   - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
     tool result is sent to the handler.
 567     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 568       final result.
 569     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the
 570       `handler` if provided. This is intended for tools that return images. In this mode, the Tool
 571       function or the `handler` function must return a list of `TextContent`/`ImageContent` objects to
 572       ensure compatibility with Chat Generators.
 573  
 574  1. Multiple output format - map keys to individual configurations:
 575  
 576     ```python
 577     {
 578         "formatted_docs": {"source": "docs", "handler": format_documents},
 579         "summary": {"source": "summary_text", "handler": str.upper}
 580     }
 581     ```
 582  
 583     Each key maps to a dictionary that can contain "source" and/or "handler".
 584     Note that `raw_result` is not supported in the multiple output format.
 585  
 586  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 587    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 588  - **outputs_to_state** (<code>dict\[str, dict\[str, str | Callable\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If `source` is provided, only the specified output key is sent to the handler.
 590    Example:
 591  
 592  ```python
 593  {
 594      "documents": {"source": "docs", "handler": custom_handler}
 595  }
 596  ```
 597  
If `source` is omitted, the whole tool result is sent to the handler.
 599  Example:
 600  
 601  ```python
 602  {
 603      "documents": {"handler": custom_handler}
 604  }
 605  ```
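To illustrate the `input_mapping` and `output_mapping` parameters described above, here is a minimal sketch of how one tool argument can fan out to several pipeline sockets, and how pipeline outputs can be renamed. `route_inputs` and `rename_outputs` are hypothetical helpers written for this illustration; they are not part of Haystack.

```python
from typing import Any


def route_inputs(tool_arguments: dict[str, Any], input_mapping: dict[str, list[str]]) -> dict[str, dict[str, Any]]:
    """Fan each tool argument out to every "component.socket" path it is mapped to."""
    pipeline_inputs: dict[str, dict[str, Any]] = {}
    for argument, socket_paths in input_mapping.items():
        if argument not in tool_arguments:
            continue
        for path in socket_paths:
            component, socket = path.split(".", 1)
            pipeline_inputs.setdefault(component, {})[socket] = tool_arguments[argument]
    return pipeline_inputs


def rename_outputs(pipeline_outputs: dict[str, dict[str, Any]], output_mapping: dict[str, str]) -> dict[str, Any]:
    """Pick "component.socket" outputs and expose them under the mapped names."""
    renamed = {}
    for path, name in output_mapping.items():
        component, socket = path.split(".", 1)
        renamed[name] = pipeline_outputs[component][socket]
    return renamed


inputs = route_inputs({"query": "Who was Tesla?"}, {"query": ["retriever.query", "prompt_builder.query"]})
outputs = rename_outputs({"retriever": {"documents": ["doc-1"]}}, {"retriever.documents": "documents"})
print(inputs)
print(outputs)
```

With this mapping, the LLM only has to supply a single `query` argument, even though two components consume it.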
 606  
 607  **Raises:**
 608  
 609  - <code>ValueError</code> – If the provided pipeline is not a valid Haystack Pipeline instance.
 610  
 611  #### to_dict
 612  
 613  ```python
 614  to_dict() -> dict[str, Any]
 615  ```
 616  
 617  Serializes the PipelineTool to a dictionary.
 618  
 619  **Returns:**
 620  
 621  - <code>dict\[str, Any\]</code> – The serialized dictionary representation of PipelineTool.
 622  
 623  #### from_dict
 624  
 625  ```python
 626  from_dict(data: dict[str, Any]) -> PipelineTool
 627  ```
 628  
 629  Deserializes the PipelineTool from a dictionary.
 630  
 631  **Parameters:**
 632  
 633  - **data** (<code>dict\[str, Any\]</code>) – The dictionary representation of PipelineTool.
 634  
 635  **Returns:**
 636  
 637  - <code>PipelineTool</code> – The deserialized PipelineTool instance.
 638  
 639  ## searchable_toolset
 640  
 641  ### SearchableToolset
 642  
 643  Bases: <code>Toolset</code>
 644  
 645  Dynamic tool discovery from large catalogs using BM25 search.
 646  
 647  This Toolset enables LLMs to discover and use tools from large catalogs through
 648  BM25-based search. Instead of exposing all tools at once (which can overwhelm the
 649  LLM context), it provides a `search_tools` bootstrap tool that allows the LLM to
 650  find and load specific tools as needed.
 651  
For very small catalogs (fewer tools than `search_threshold`), it acts as a simple passthrough,
exposing all tools directly without any discovery mechanism.
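For intuition about BM25-based discovery, the sketch below ranks tool descriptions against a query with a textbook Okapi BM25 formula. The tokenizer, the `k1` and `b` values, and the `bm25_rank` helper are assumptions made for this illustration; the indexing and scoring used by SearchableToolset may differ.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bm25_rank(query: str, documents: dict[str, str], top_k: int = 3, k1: float = 1.5, b: float = 0.75) -> list[str]:
    """Return the names of up to `top_k` documents with the highest positive BM25 score."""
    tokens = {name: tokenize(text) for name, text in documents.items()}
    avg_len = sum(len(t) for t in tokens.values()) / len(tokens)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for t in tokens.values() for term in set(t))
    scores = {}
    for name, doc_tokens in tokens.items():
        term_freq = Counter(doc_tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in term_freq:
                continue
            idf = math.log(1 + (len(tokens) - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc_tokens) / avg_len))
        scores[name] = score
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [name for name in ranked if scores[name] > 0][:top_k]


catalog = {
    "get_weather": "Get weather for a city",
    "search_web": "Search the web",
    "send_email": "Send an email to a recipient",
}
matches = bm25_rank("weather in a city", catalog)
print(matches)
```

Tools whose descriptions share no term with the query score zero and are dropped, which is why precise tool descriptions matter for discovery.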
 654  
 655  ### Usage Example
 656  
 657  ```python
 658  from haystack.components.agents import Agent
 659  from haystack.components.generators.chat import OpenAIChatGenerator
 660  from haystack.dataclasses import ChatMessage
 661  from haystack.tools import Tool, SearchableToolset
 662  
 663  # Create a catalog of tools
 664  catalog = [
 665      Tool(name="get_weather", description="Get weather for a city", ...),
 666      Tool(name="search_web", description="Search the web", ...),
 667      # ... 100s more tools
 668  ]
 669  toolset = SearchableToolset(catalog=catalog)
 670  
 671  agent = Agent(chat_generator=OpenAIChatGenerator(), tools=toolset)
 672  
 673  # The agent is initially provided only with the search_tools tool and will use it to find relevant tools.
 674  result = agent.run(messages=[ChatMessage.from_user("What's the weather in Milan?")])
 675  ```
 676  
 677  #### __init__
 678  
 679  ```python
 680  __init__(catalog: ToolsType, *, top_k: int = 3, search_threshold: int = 8)
 681  ```
 682  
 683  Initialize the SearchableToolset.
 684  
 685  **Parameters:**
 686  
 687  - **catalog** (<code>ToolsType</code>) – Source of tools - a list of Tools, list of Toolsets, or a single Toolset.
- **top_k** (<code>int</code>) – Default number of results returned by `search_tools`. Default is 3.
- **search_threshold** (<code>int</code>) – Minimum catalog size to activate search.
  If the catalog has fewer tools, the toolset acts as a passthrough (all tools visible).
  Default is 8.
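The effect of `search_threshold` can be summarized in a few lines. This is a sketch of the documented behavior using a hypothetical helper, `visible_tool_names`, not code taken from Haystack.

```python
def visible_tool_names(catalog_names: list[str], search_threshold: int = 8) -> list[str]:
    """Return the tool names initially exposed to the LLM."""
    if len(catalog_names) < search_threshold:
        return list(catalog_names)  # passthrough: every tool is visible
    return ["search_tools"]  # search mode: only the bootstrap tool is visible


small = visible_tool_names(["get_weather", "search_web"])
large = visible_tool_names([f"tool_{i}" for i in range(100)])
print(small, large)
```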
 692  
 693  #### add
 694  
 695  ```python
 696  add(tool: Tool | Toolset) -> None
 697  ```
 698  
 699  Adding new tools after initialization is not supported for SearchableToolset.
 700  
 701  #### warm_up
 702  
 703  ```python
 704  warm_up() -> None
 705  ```
 706  
 707  Prepare the toolset for use.
 708  
 709  Warms up child toolsets first (so lazy toolsets like MCPToolset can connect),
 710  then flattens the catalog, indexes it, and creates the search_tools bootstrap tool.
 711  In passthrough mode, it warms up all catalog tools directly.
 712  Must be called before using the toolset with an Agent.
 713  
 714  #### clear
 715  
 716  ```python
 717  clear() -> None
 718  ```
 719  
 720  Clear all discovered tools.
 721  
 722  This method allows resetting the toolset's discovered tools between agent runs
 723  when the same toolset instance is reused. This can be useful for long-running
 724  applications to control memory usage or to start fresh searches.
 725  
 726  #### to_dict
 727  
 728  ```python
 729  to_dict() -> dict[str, Any]
 730  ```
 731  
 732  Serialize the toolset to a dictionary.
 733  
 734  **Returns:**
 735  
 736  - <code>dict\[str, Any\]</code> – Dictionary representation of the toolset.
 737  
 738  #### from_dict
 739  
 740  ```python
 741  from_dict(data: dict[str, Any]) -> SearchableToolset
 742  ```
 743  
 744  Deserialize a toolset from a dictionary.
 745  
 746  **Parameters:**
 747  
 748  - **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the toolset.
 749  
 750  **Returns:**
 751  
 752  - <code>SearchableToolset</code> – New SearchableToolset instance.
 753  
 754  ## tool
 755  
 756  ### Tool
 757  
 758  Data class representing a Tool that Language Models can prepare a call for.
 759  
 760  Accurate definitions of the textual attributes such as `name` and `description`
 761  are important for the Language Model to correctly prepare the call.
 762  
 763  For resource-intensive operations like establishing connections to remote services or
 764  loading models, override the `warm_up()` method. This method is called before the Tool
 765  is used and should be idempotent, as it may be called multiple times during
 766  pipeline/agent setup.
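An idempotent `warm_up()` usually guards the expensive step with a check, so repeated calls do nothing. The class below is a hypothetical, dependency-free stand-in used only to show that guard; in practice you would put the same logic in a `Tool` subclass.

```python
class ExpensiveResourceTool:
    """Hypothetical tool-like class; in Haystack you would subclass `Tool` instead."""

    def __init__(self) -> None:
        self._client = None
        self.load_count = 0

    def warm_up(self) -> None:
        """Load the resource once; later calls are no-ops."""
        if self._client is not None:
            return
        self._client = object()  # placeholder for a real connection or model
        self.load_count += 1


tool = ExpensiveResourceTool()
tool.warm_up()
tool.warm_up()  # safe to call again: nothing is reloaded
print(tool.load_count)
```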
 767  
 768  **Parameters:**
 769  
 770  - **name** (<code>str</code>) – Name of the Tool.
 771  - **description** (<code>str</code>) – Description of the Tool.
 772  - **parameters** (<code>dict\[str, Any\]</code>) – A JSON schema defining the parameters expected by the Tool.
 773  - **function** (<code>Callable</code>) – The function that will be invoked when the Tool is called.
 774    Must be a synchronous function; async functions are not supported.
 775  - **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
 776    If not provided, the tool result is converted to a string using a default handler.
 777  
 778  `outputs_to_string` supports two formats:
 779  
 780  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 781  
 782     ```python
 783     {
 784         "source": "docs", "handler": format_documents, "raw_result": False
 785     }
 786     ```
 787  
 788     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
 789       tool result is sent to the handler.
 790     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 791       final result.
 792     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
 793       if provided. This is intended for tools that return images. In this mode, the Tool function or the
 794       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
 795       Generators.
 796  
 797  1. Multiple output format - map keys to individual configurations:
 798  
 799     ```python
 800     {
 801         "formatted_docs": {"source": "docs", "handler": format_documents},
 802         "summary": {"source": "summary_text", "handler": str.upper}
 803     }
 804     ```
 805  
 806     Each key maps to a dictionary that can contain "source" and/or "handler".
 807     Note that `raw_result` is not supported in the multiple output format.
 808  
- **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
  Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state, along with optional handlers.
  If the source is provided, only the specified output key is sent to the handler.
  Example:

  ```python
  {
      "documents": {"source": "docs", "handler": custom_handler}
  }
  ```

  If the source is omitted, the whole tool result is sent to the handler.
  Example:

  ```python
  {
      "documents": {"handler": custom_handler}
  }
  ```

**Raises:**

- <code>ValueError</code> – If `function` is async, if `parameters` is not a valid JSON schema, or if the
  `outputs_to_state`, `outputs_to_string`, or `inputs_from_state` configurations are invalid.
- <code>TypeError</code> – If any configuration value in `outputs_to_state`, `outputs_to_string`, or
  `inputs_from_state` has the wrong type.

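
As a rough illustration of the `source`/`handler` rules described for `outputs_to_string`, the plain-Python sketch below resolves both configuration formats against a tool result. It does not import Haystack and is not the library's implementation: `apply_entry`, `render_outputs`, and the `str` fallback are hypothetical names and assumptions made for this example. For the multiple output format it returns a dictionary and does not model how that mapping is finally turned into message text.

```python
from typing import Any

# Keys that mark the single output format when present at the root level.
SINGLE_FORMAT_KEYS = {"source", "handler", "raw_result"}


def apply_entry(result: Any, entry: dict[str, Any]) -> Any:
    """Extract `source` from the result if given, then apply `handler` if given."""
    value = result[entry["source"]] if "source" in entry else result
    handler = entry.get("handler")
    return handler(value) if handler is not None else value


def render_outputs(result: Any, config: dict[str, Any]) -> Any:
    """Resolve a single or multiple output configuration (illustrative only)."""
    if config.keys() & SINGLE_FORMAT_KEYS:
        value = apply_entry(result, config)
        # Assumption: plain `str` stands in for the default string conversion.
        return value if config.get("raw_result") else str(value)
    # Multiple output format: every key carries its own source/handler entry.
    return {key: str(apply_entry(result, entry)) for key, entry in config.items()}


tool_result = {"docs": ["first", "second"], "summary_text": "two documents"}

single = render_outputs(tool_result, {"source": "docs", "handler": ", ".join})
multiple = render_outputs(
    tool_result,
    {
        "formatted_docs": {"source": "docs", "handler": " | ".join},
        "summary": {"source": "summary_text", "handler": str.upper},
    },
)
print(single)    # first, second
print(multiple)  # {'formatted_docs': 'first | second', 'summary': 'TWO DOCUMENTS'}
```

With `raw_result` set to `True` in the single output format, the sketch skips the string conversion and returns the handled value unchanged.
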
#### tool_spec

```python
tool_spec: dict[str, Any]
```

Return the Tool specification to be used by the Language Model.

#### warm_up

```python
warm_up() -> None
```

Prepare the Tool for use.

Override this method to establish connections to remote services, load models,
or perform other resource-intensive initialization. This method should be idempotent,
as it may be called multiple times.

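
One common way to keep `warm_up` idempotent is to guard the expensive step behind a check. The sketch below shows that pattern on a plain stand-in class; `LazyModelTool`, `_load_model`, and `load_count` are hypothetical names for this example, and the class deliberately does not subclass `Tool`, so the snippet runs without Haystack installed.

```python
class LazyModelTool:
    """Stand-in for a Tool subclass; only the warm_up guard pattern matters here."""

    def __init__(self) -> None:
        self._model = None
        self.load_count = 0  # Counts how often the expensive load actually runs.

    def _load_model(self) -> dict:
        # Hypothetical placeholder for loading a model or opening a connection.
        self.load_count += 1
        return {"ready": True}

    def warm_up(self) -> None:
        # The guard turns repeated calls into cheap no-ops, keeping warm_up idempotent.
        if self._model is None:
            self._model = self._load_model()


tool = LazyModelTool()
tool.warm_up()
tool.warm_up()  # Second call does nothing because the model is already loaded.
print(tool.load_count)  # 1
```
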
#### invoke

```python
invoke(**kwargs: Any) -> Any
```

Invoke the Tool with the provided keyword arguments.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the Tool to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary with serialized data.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> Tool
```

Deserializes the Tool from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.

**Returns:**

- <code>Tool</code> – Deserialized Tool.

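
A dictionary can only hold data, so the `function` has to be stored as something other than the callable itself. The sketch below assumes it is recorded as an importable dotted path and resolved again on load. This reference does not spell out the mechanism, so treat `callable_to_path` and `path_to_callable` as hypothetical helpers written for illustration, not as Haystack's serialization API.

```python
import importlib
import math
from typing import Any, Callable


def callable_to_path(func: Callable[..., Any]) -> str:
    """Describe a module-level callable as 'module.qualname'."""
    if "<lambda>" in func.__qualname__ or "<locals>" in func.__qualname__:
        raise ValueError("Only importable, module-level callables can be stored by path.")
    return f"{func.__module__}.{func.__qualname__}"


def path_to_callable(path: str) -> Callable[..., Any]:
    """Import the module part of the path and look up the attribute."""
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


serialized = {
    "name": "sqrt",
    "description": "Square root",
    "function": callable_to_path(math.sqrt),
}
restored = path_to_callable(serialized["function"])
print(serialized["function"])  # math.sqrt
print(restored(9.0))           # 3.0
```

Under this assumption, lambdas and nested functions would not survive a round trip, which is one reason to prefer module-level functions for tools you plan to serialize.
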
## toolset

### Toolset

A collection of related Tools that can be used and managed as a cohesive unit.

Toolset serves two main purposes:

1. Group related tools together:
   Toolset allows you to organize related tools into a single collection, making it easier
   to manage and use them as a unit in Haystack pipelines.

   Example:

   ```python
   from haystack.tools import Tool, Toolset
   from haystack.components.tools import ToolInvoker

   # Define math functions
   def add_numbers(a: int, b: int) -> int:
       return a + b

   def subtract_numbers(a: int, b: int) -> int:
       return a - b

   # Create tools with proper schemas
   add_tool = Tool(
       name="add",
       description="Add two numbers",
       parameters={
           "type": "object",
           "properties": {
               "a": {"type": "integer"},
               "b": {"type": "integer"}
           },
           "required": ["a", "b"]
       },
       function=add_numbers
   )

   subtract_tool = Tool(
       name="subtract",
       description="Subtract b from a",
       parameters={
           "type": "object",
           "properties": {
               "a": {"type": "integer"},
               "b": {"type": "integer"}
           },
           "required": ["a", "b"]
       },
       function=subtract_numbers
   )

   # Create a toolset with the math tools
   math_toolset = Toolset([add_tool, subtract_tool])

   # Use the toolset with a ToolInvoker or ChatGenerator component
   invoker = ToolInvoker(tools=math_toolset)
   ```

2. Base class for dynamic tool loading:
   By subclassing Toolset, you can create implementations that dynamically load tools
   from external sources like OpenAPI URLs, MCP servers, or other resources.

   Example:

   ```python
   from haystack.core.serialization import generate_qualified_class_name
   from haystack.tools import Tool, Toolset
   from haystack.components.tools import ToolInvoker

   class CalculatorToolset(Toolset):
       '''A toolset for calculator operations.'''

       def __init__(self):
           tools = self._create_tools()
           super().__init__(tools)

       def _create_tools(self):
           # These Tool instances are defined statically for illustration purposes only.
           # In a real-world scenario, you would dynamically load tools from an external source here.
           tools = []
           add_tool = Tool(
               name="add",
               description="Add two numbers",
               parameters={
                   "type": "object",
                   "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
                   "required": ["a", "b"],
               },
               function=lambda a, b: a + b,
           )

           multiply_tool = Tool(
               name="multiply",
               description="Multiply two numbers",
               parameters={
                   "type": "object",
                   "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
                   "required": ["a", "b"],
               },
               function=lambda a, b: a * b,
           )

           tools.append(add_tool)
           tools.append(multiply_tool)

           return tools

       def to_dict(self):
           return {
               "type": generate_qualified_class_name(type(self)),
               "data": {},  # no data to serialize as we define the tools dynamically
           }

       @classmethod
       def from_dict(cls, data):
           return cls()  # Recreate the tools dynamically during deserialization

   # Create the dynamic toolset and use it with ToolInvoker
   calculator_toolset = CalculatorToolset()
   invoker = ToolInvoker(tools=calculator_toolset)
   ```

Toolset implements the collection interface (`__iter__`, `__contains__`, `__len__`, `__getitem__`),
making it behave like a list of Tools. This makes it compatible with components that expect
iterable tools, such as ToolInvoker or Haystack chat generators.

When implementing a custom Toolset subclass for dynamic tool loading:

- Perform the dynamic loading in the `__init__` method
- Override the `to_dict()` and `from_dict()` methods if your tools are defined dynamically
- Serialize endpoint descriptors rather than tool instances if your tools
  are loaded from external sources

#### warm_up

```python
warm_up() -> None
```

Prepare the Toolset for use.

By default, this method iterates through and warms up all tools in the Toolset.
Subclasses can override this method to customize initialization behavior, such as:

- Setting up shared resources (database connections, HTTP sessions) instead of
  warming individual tools
- Implementing custom initialization logic for dynamically loaded tools
- Controlling when and how tools are initialized

For example, a Toolset that manages tools from an external service (like MCPToolset)
might override this to initialize a shared connection rather than warming up
individual tools:

```python
class MCPToolset(Toolset):
    def warm_up(self) -> None:
        # Only warm up the shared MCP connection, not individual tools.
        # The guard keeps repeated calls from opening a second connection.
        if getattr(self, "mcp_connection", None) is None:
            self.mcp_connection = establish_connection(self.server_url)
```

This method should be idempotent, as it may be called multiple times.

#### add

```python
add(tool: Union[Tool, Toolset]) -> None
```

Add a new Tool or merge another Toolset.

**Parameters:**

- **tool** (<code>Union\[Tool, Toolset\]</code>) – A Tool instance or another Toolset to add.

**Raises:**

- <code>ValueError</code> – If adding the tool would result in duplicate tool names.
- <code>TypeError</code> – If the provided object is not a Tool or Toolset.

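
The rules above (accept a single tool or a whole toolset, reject duplicate names, reject other types) can be sketched with plain dataclasses. `SimpleTool` and `SimpleToolset` are hypothetical stand-ins created for this example, not Haystack classes, and the validate-before-mutate order is a design choice of the sketch rather than documented library behavior.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleTool:
    """Stand-in for Tool; only the name matters for this illustration."""

    name: str


@dataclass
class SimpleToolset:
    """Stand-in for Toolset that mimics the documented `add` behavior."""

    tools: list = field(default_factory=list)

    def add(self, tool) -> None:
        if isinstance(tool, SimpleTool):
            incoming = [tool]
        elif isinstance(tool, SimpleToolset):
            incoming = list(tool.tools)
        else:
            raise TypeError(f"Expected a tool or toolset, got {type(tool).__name__}")
        # Validate everything before mutating, so a failed add changes nothing here.
        names = [t.name for t in self.tools + incoming]
        duplicates = {name for name in names if names.count(name) > 1}
        if duplicates:
            raise ValueError(f"Duplicate tool names: {sorted(duplicates)}")
        self.tools.extend(incoming)


toolset = SimpleToolset([SimpleTool("add")])
toolset.add(SimpleToolset([SimpleTool("subtract"), SimpleTool("multiply")]))
print([t.name for t in toolset.tools])  # ['add', 'subtract', 'multiply']

try:
    toolset.add(SimpleTool("add"))
except ValueError as error:
    print(error)  # Duplicate tool names: ['add']
```
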
#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serialize the Toolset to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – A dictionary representation of the Toolset.

Note for subclass implementers:
The default implementation is ideal for scenarios where Tool resolution is static. However, if your subclass
of Toolset dynamically resolves Tool instances from external sources, such as an MCP server, an OpenAPI URL, or
a local OpenAPI specification, consider serializing the endpoint descriptor instead of the Tool
instances themselves. This strategy preserves the dynamic nature of your Toolset and minimizes the overhead
associated with serializing potentially large collections of Tool objects. Serializing the descriptor also
lets deserialization rebuild the Tool instances from the current state of the source, even
if they have been modified or removed since the last serialization. Failing to serialize the descriptor may
lead to issues where outdated or incorrect Tool configurations are loaded, potentially causing errors or
unexpected behavior.

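
The descriptor approach from the note can be sketched as follows. This is a minimal plain-Python illustration, not Haystack code: `DescriptorToolset`, `FAKE_REMOTE_CATALOG`, and the example URL are hypothetical, and the in-memory catalog stands in for a real MCP server or OpenAPI endpoint.

```python
from typing import Any

# Hypothetical stand-in for a remote catalog; a real toolset would query a server.
FAKE_REMOTE_CATALOG = {"https://example.com/openapi.json": ["get_weather", "get_forecast"]}


class DescriptorToolset:
    """Stand-in for a Toolset subclass that reloads its tools from a URL."""

    def __init__(self, url: str) -> None:
        self.url = url
        self.tool_names = list(FAKE_REMOTE_CATALOG[url])  # The "dynamic" loading step.

    def to_dict(self) -> dict[str, Any]:
        # Only the descriptor (the URL) is stored, not the loaded tools.
        qualified_name = f"{type(self).__module__}.{type(self).__qualname__}"
        return {"type": qualified_name, "data": {"url": self.url}}

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "DescriptorToolset":
        # Re-running __init__ fetches whatever the source currently offers.
        return cls(url=data["data"]["url"])


original = DescriptorToolset("https://example.com/openapi.json")
payload = original.to_dict()

# Simulate the remote source changing between serialization and deserialization.
FAKE_REMOTE_CATALOG["https://example.com/openapi.json"].append("get_alerts")

restored = DescriptorToolset.from_dict(payload)
print(payload["data"])      # {'url': 'https://example.com/openapi.json'}
print(restored.tool_names)  # ['get_weather', 'get_forecast', 'get_alerts']
```

Because only the URL is stored, the restored toolset reflects the tool added after serialization, while the serialized payload stays small.
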
#### from_dict

```python
from_dict(data: dict[str, Any]) -> Toolset
```

Deserialize a Toolset from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the Toolset.

**Returns:**

- <code>Toolset</code> – A new Toolset instance.