   1  ---
   2  title: "Tools"
   3  id: tools-api
   4  description: "Unified abstractions to represent tools across the framework."
   5  slug: "/tools-api"
   6  ---
   7  
   8  
   9  ## component_tool
  10  
  11  ### ComponentTool
  12  
  13  Bases: <code>Tool</code>
  14  
  15  A Tool that wraps Haystack components, allowing them to be used as tools by LLMs.
  16  
  17  ComponentTool automatically generates LLM-compatible tool schemas from component input sockets,
  18  which are derived from the component's `run` method signature and type hints.
  19  
  20  Key features:
  21  
  22  - Automatic LLM tool calling schema generation from component input sockets
  23  - Type conversion and validation for component inputs
  24  - Support for types:
  25    - Dataclasses
  26    - Lists of dataclasses
  27    - Basic types (str, int, float, bool, dict)
  28    - Lists of basic types
  29  - Automatic name generation from component class name
  30  - Description extraction from component docstrings
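
The automatic name generation can be pictured as a CamelCase-to-snake_case conversion of the component class name. The following is a minimal sketch of that idea. `to_snake_case` is a hypothetical helper written for illustration, not Haystack's actual implementation, and it may treat acronyms differently than the library does.

```python
import re


def to_snake_case(class_name: str) -> str:
    """Insert an underscore before each interior capital letter, then lowercase."""
    # Hypothetical helper for illustration only.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


default_name = to_snake_case("SerperDevWebSearch")
print(default_name)
```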
  31  
To use ComponentTool, you first need a Haystack component, either an existing one or one you create.
Then pass the component to the ComponentTool constructor.
  34  Below is an example of creating a ComponentTool from an existing SerperDevWebSearch component.
  35  
#### Usage example
  37  
```python
from haystack import component, Pipeline
from haystack.tools import ComponentTool
from haystack.components.websearch import SerperDevWebSearch
from haystack.utils import Secret
from haystack.components.tools.tool_invoker import ToolInvoker
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

# Create a SerperDev search component
search = SerperDevWebSearch(api_key=Secret.from_env_var("SERPERDEV_API_KEY"), top_k=3)

# Create a tool from the component
tool = ComponentTool(
    component=search,
    name="web_search",  # Optional: defaults to "serper_dev_web_search"
    description="Search the web for current information on any topic"  # Optional: defaults to component docstring
)

# Create pipeline with OpenAIChatGenerator and ToolInvoker
pipeline = Pipeline()
pipeline.add_component("llm", OpenAIChatGenerator(tools=[tool]))
pipeline.add_component("tool_invoker", ToolInvoker(tools=[tool]))

# Connect components
pipeline.connect("llm.replies", "tool_invoker.messages")

message = ChatMessage.from_user("Use the web search tool to find information about Nikola Tesla")

# Run pipeline
result = pipeline.run({"llm": {"messages": [message]}})

print(result)
```
  72  
  73  #### __init__
  74  
  75  ```python
  76  __init__(
  77      component: Component,
  78      name: str | None = None,
  79      description: str | None = None,
  80      parameters: dict[str, Any] | None = None,
  81      *,
  82      outputs_to_string: dict[str, str | Callable[[Any], str]] | None = None,
  83      inputs_from_state: dict[str, str] | None = None,
  84      outputs_to_state: dict[str, dict[str, str | Callable]] | None = None
  85  ) -> None
  86  ```
  87  
  88  Create a Tool instance from a Haystack component.
  89  
  90  **Parameters:**
  91  
  92  - **component** (<code>Component</code>) – The Haystack component to wrap as a tool.
  93  - **name** (<code>str | None</code>) – Optional name for the tool (defaults to snake_case of component class name).
  94  - **description** (<code>str | None</code>) – Optional description (defaults to component's docstring).
  95  - **parameters** (<code>dict\[str, Any\] | None</code>) – A JSON schema defining the parameters expected by the Tool.
  96    Will fall back to the parameters defined in the component's run method signature if not provided.
- **outputs_to_string** (<code>dict\[str, str | Callable\[\[Any\], str\]\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  98    If not provided, the tool result is converted to a string using a default handler.
  99  
 100  `outputs_to_string` supports two formats:
 101  
 102  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 103  
 104     ```python
 105     {
 106         "source": "docs", "handler": format_documents, "raw_result": False
 107     }
 108     ```
 109  
 110     - `source`: If provided, only the specified output key is sent to the handler.
 111     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 112       final result.
 113     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the
 114       `handler` if provided. This is intended for tools that return images. In this mode, the Tool
 115       function or the `handler` function must return a list of `TextContent`/`ImageContent` objects to
 116       ensure compatibility with Chat Generators.
 117  
 118  1. Multiple output format - map keys to individual configurations:
 119  
 120     ```python
 121     {
 122         "formatted_docs": {"source": "docs", "handler": format_documents},
 123         "summary": {"source": "summary_text", "handler": str.upper}
 124     }
 125     ```
 126  
 127     Each key maps to a dictionary that can contain "source" and/or "handler".
 128     Note that `raw_result` is not supported in the multiple output format.
 129  
 130  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 131    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 132  - **outputs_to_state** (<code>dict\[str, dict\[str, str | Callable\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
 133    If the source is provided only the specified output key is sent to the handler.
 134    Example:
 135  
 136  ```python
 137  {
 138      "documents": {"source": "docs", "handler": custom_handler}
 139  }
 140  ```
 141  
 142  If the source is omitted the whole tool result is sent to the handler.
 143  Example:
 144  
 145  ```python
 146  {
 147      "documents": {"handler": custom_handler}
 148  }
 149  ```
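
The interplay of `inputs_from_state` and `outputs_to_state` can be sketched in plain Python. This is an illustrative approximation only: `call_with_state`, `list_files`, and the plain-dict state are hypothetical stand-ins, and the real library may resolve these mappings differently.

```python
from typing import Any, Callable


def call_with_state(
    function: Callable[..., dict[str, Any]],
    llm_args: dict[str, Any],
    state: dict[str, Any],
    inputs_from_state: dict[str, str],
    outputs_to_state: dict[str, dict[str, Any]],
) -> dict[str, Any]:
    """Hypothetical helper: inject state values, call the tool, write outputs back."""
    args = dict(llm_args)
    # inputs_from_state maps a state key to a tool parameter name.
    for state_key, param_name in inputs_from_state.items():
        if state_key in state:
            args[param_name] = state[state_key]

    result = function(**args)

    # outputs_to_state maps a state key to {"source": ..., "handler": ...}.
    for state_key, config in outputs_to_state.items():
        source = config.get("source")
        handler = config.get("handler")
        # With "source", only that output key is used; without it, the whole result.
        value = result[source] if source is not None else result
        state[state_key] = handler(value) if handler is not None else value
    return result


def list_files(repo: str, pattern: str) -> dict[str, Any]:
    """Stand-in tool that pretends to search a repository."""
    return {"docs": [f"{repo}/{pattern}", f"{repo}/README.md"], "count": 2}


state = {"repository": "my-org/my-repo"}
result = call_with_state(
    list_files,
    llm_args={"pattern": "setup.py"},
    state=state,
    inputs_from_state={"repository": "repo"},
    outputs_to_state={"documents": {"source": "docs", "handler": sorted}},
)
print(state["documents"])
```

Here the LLM only supplies `pattern`; the `repo` argument is filled from the state's `repository` key, and the sorted `docs` output is stored back under `documents`.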
 150  
 151  **Raises:**
 152  
 153  - <code>TypeError</code> – If the object passed is not a Haystack Component instance.
 154  - <code>ValueError</code> – If the component has already been added to a pipeline, or if schema generation fails.
 155  
 156  #### warm_up
 157  
 158  ```python
 159  warm_up()
 160  ```
 161  
 162  Prepare the ComponentTool for use.
 163  
 164  #### to_dict
 165  
 166  ```python
 167  to_dict() -> dict[str, Any]
 168  ```
 169  
 170  Serializes the ComponentTool to a dictionary.
 171  
 172  #### from_dict
 173  
 174  ```python
 175  from_dict(data: dict[str, Any]) -> ComponentTool
 176  ```
 177  
 178  Deserializes the ComponentTool from a dictionary.
 179  
 180  ## from_function
 181  
 182  ### create_tool_from_function
 183  
 184  ```python
 185  create_tool_from_function(
 186      function: Callable,
 187      name: str | None = None,
 188      description: str | None = None,
 189      inputs_from_state: dict[str, str] | None = None,
 190      outputs_to_state: dict[str, dict[str, Any]] | None = None,
 191      outputs_to_string: dict[str, Any] | None = None,
 192  ) -> Tool
 193  ```
 194  
 195  Create a Tool instance from a function.
 196  
 197  Allows customizing the Tool name and description.
 198  For simpler use cases, consider using the `@tool` decorator.
 199  
 200  ### Usage example
 201  
```python
from typing import Annotated, Literal
from haystack.tools import create_tool_from_function

def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Munich",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius"):
    '''A simple function to get the current weather for a location.'''
    return f"Weather report for {city}: 20 {unit}, sunny"

tool = create_tool_from_function(get_weather)

print(tool)
# Prints (the function address varies between runs):
# Tool(name='get_weather', description='A simple function to get the current weather for a location.',
# parameters={
# 'type': 'object',
# 'properties': {
#     'city': {'type': 'string', 'description': 'the city for which to get the weather', 'default': 'Munich'},
#     'unit': {
#         'type': 'string',
#         'enum': ['Celsius', 'Fahrenheit'],
#         'description': 'the unit for the temperature',
#         'default': 'Celsius',
#     },
#     }
# },
# function=<function get_weather at 0x7f7b3a8a9b80>)
```
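
To make the mapping from type hints to the `parameters` schema more concrete, here is a rough sketch using only the standard library. `sketch_schema` and `JSON_TYPES` are hypothetical names, the sketch handles only plain basic types, `Annotated`, and string `Literal` values, and it is not the schema generator that Haystack uses.

```python
import inspect
from typing import Annotated, Any, Callable, Literal, get_args, get_origin, get_type_hints

# Only plain basic types are handled in this sketch.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean", list: "array", dict: "object"}


def sketch_schema(function: Callable[..., Any]) -> dict[str, Any]:
    """Hypothetical helper: build a JSON-schema-like dict from a function signature."""
    hints = get_type_hints(function, include_extras=True)
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(function).parameters.items():
        if name not in hints:
            raise ValueError(f"Parameter '{name}' has no type hint")
        hint = hints[name]
        prop: dict[str, Any] = {}
        if get_origin(hint) is Annotated:
            # Annotated[T, "text"]: the first metadata item becomes the description.
            hint, description, *_ = get_args(hint)
            prop["description"] = description
        if get_origin(hint) is Literal:
            # Assumption: all Literal values are strings.
            prop["type"] = "string"
            prop["enum"] = list(get_args(hint))
        else:
            prop["type"] = JSON_TYPES[hint]
        if param.default is inspect.Parameter.empty:
            required.append(name)
        else:
            prop["default"] = param.default
        properties[name] = prop
    schema: dict[str, Any] = {"type": "object", "properties": properties}
    if required:
        schema["required"] = required
    return schema


def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Munich",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius",
) -> str:
    """A simple function to get the current weather for a location."""
    return f"Weather report for {city}: 20 {unit}, sunny"


schema = sketch_schema(get_weather)
print(schema)
```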
 230  
 231  **Parameters:**
 232  
 233  - **function** (<code>Callable</code>) – The function to be converted into a Tool.
 234    The function must include type hints for all parameters.
 235    The function is expected to have basic python input types (str, int, float, bool, list, dict, tuple).
 236    Other input types may work but are not guaranteed.
 237    If a parameter is annotated using `typing.Annotated`, its metadata will be used as parameter description.
 238  - **name** (<code>str | None</code>) – The name of the Tool. If not provided, the name of the function will be used.
 239  - **description** (<code>str | None</code>) – The description of the Tool. If not provided, the docstring of the function will be used.
 240    To intentionally leave the description empty, pass an empty string.
 241  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 242    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 243  - **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
 244    If the source is provided only the specified output key is sent to the handler.
 245    Example:
 246  
 247  ```python
 248  {
 249      "documents": {"source": "docs", "handler": custom_handler}
 250  }
 251  ```
 252  
 253  If the source is omitted the whole tool result is sent to the handler.
 254  Example:
 255  
 256  ```python
 257  {
 258      "documents": {"handler": custom_handler}
 259  }
 260  ```
 261  
 262  - **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
 263    If not provided, the tool result is converted to a string using a default handler.
 264  
 265  `outputs_to_string` supports two formats:
 266  
 267  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 268  
 269     ```python
 270     {
 271         "source": "docs", "handler": format_documents, "raw_result": False
 272     }
 273     ```
 274  
 275     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
 276       tool result is sent to the handler.
 277     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 278       final result.
 279     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
 280       if provided. This is intended for tools that return images. In this mode, the Tool function or the
 281       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
 282       Generators.
 283  
 284  1. Multiple output format - map keys to individual configurations:
 285  
 286     ```python
 287     {
 288         "formatted_docs": {"source": "docs", "handler": format_documents},
 289         "summary": {"source": "summary_text", "handler": str.upper}
 290     }
 291     ```
 292  
 293     Each key maps to a dictionary that can contain "source" and/or "handler".
 294     Note that `raw_result` is not supported in the multiple output format.
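
The two `outputs_to_string` formats can be illustrated with a small plain-Python sketch. `render_outputs` is a hypothetical helper. Detecting the single output format by its root-level keys and returning a dictionary for the multiple output format are assumptions made for this example, not documented library behavior.

```python
from typing import Any


def render_outputs(result: dict[str, Any], outputs_to_string: dict[str, Any]) -> Any:
    """Hypothetical helper that applies an outputs_to_string-style configuration."""
    single_keys = {"source", "handler", "raw_result"}
    if outputs_to_string.keys() & single_keys:
        # Single output format: options sit at the root level.
        source = outputs_to_string.get("source")
        handler = outputs_to_string.get("handler")
        value = result[source] if source is not None else result
        if handler is not None:
            value = handler(value)
        # raw_result=True skips the string conversion.
        return value if outputs_to_string.get("raw_result", False) else str(value)

    # Multiple output format: each key has its own {"source", "handler"} config.
    rendered = {}
    for key, config in outputs_to_string.items():
        value = result[config["source"]] if "source" in config else result
        handler = config.get("handler", str)
        rendered[key] = handler(value)
    return rendered


def format_documents(docs: list[str]) -> str:
    return "\n".join(f"- {doc}" for doc in docs)


tool_result = {"docs": ["first", "second"], "summary_text": "two documents"}

single = render_outputs(tool_result, {"source": "docs", "handler": format_documents})
multiple = render_outputs(
    tool_result,
    {
        "formatted_docs": {"source": "docs", "handler": format_documents},
        "summary": {"source": "summary_text", "handler": str.upper},
    },
)
print(single)
print(multiple)
```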
 295  
 296  **Returns:**
 297  
 298  - <code>Tool</code> – The Tool created from the function.
 299  
 300  **Raises:**
 301  
 302  - <code>ValueError</code> – If any parameter of the function lacks a type hint.
 303  - <code>SchemaGenerationError</code> – If there is an error generating the JSON schema for the Tool.
 304  
 305  ### tool
 306  
 307  ```python
 308  tool(
 309      function: Callable | None = None,
 310      *,
 311      name: str | None = None,
 312      description: str | None = None,
 313      inputs_from_state: dict[str, str] | None = None,
 314      outputs_to_state: dict[str, dict[str, Any]] | None = None,
 315      outputs_to_string: dict[str, Any] | None = None
 316  ) -> Tool | Callable[[Callable], Tool]
 317  ```
 318  
 319  Decorator to convert a function into a Tool.
 320  
Can be used with or without parameters:

- Without parameters: `@tool` directly above `def my_function(): ...`
- With parameters: `@tool(name="custom_name")` directly above `def my_function(): ...`
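
A decorator that works both bare and with keyword arguments is commonly written as a single function with an optional first argument. The sketch below shows that general pattern. `SketchTool` and `sketch_tool` are hypothetical stand-ins, not the library's `Tool` class or `tool` decorator.

```python
from typing import Any, Callable, Optional, Union


class SketchTool:
    """Hypothetical stand-in for a Tool: stores a name and the wrapped function."""

    def __init__(self, name: str, function: Callable[..., Any]) -> None:
        self.name = name
        self.function = function


def sketch_tool(
    function: Optional[Callable[..., Any]] = None, *, name: Optional[str] = None
) -> Union[SketchTool, Callable[[Callable[..., Any]], SketchTool]]:
    """Hypothetical decorator usable as @sketch_tool or @sketch_tool(name=...)."""

    def decorator(func: Callable[..., Any]) -> SketchTool:
        return SketchTool(name=name or func.__name__, function=func)

    # Bare usage: Python passes the function directly, so wrap it immediately.
    if function is not None:
        return decorator(function)
    # Parameterized usage: return the decorator for Python to apply next.
    return decorator


@sketch_tool
def first_function() -> str:
    return "first"


@sketch_tool(name="custom_name")
def second_function() -> str:
    return "second"


print(first_function.name, second_function.name)
```

Making `name` keyword-only keeps the two call styles unambiguous: a positional argument can only ever be the decorated function.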
 327  
 328  ### Usage example
 329  
```python
from typing import Annotated, Literal
from haystack.tools import tool

@tool
def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Munich",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius"):
    '''A simple function to get the current weather for a location.'''
    return f"Weather report for {city}: 20 {unit}, sunny"

print(get_weather)
# Prints (the function address varies between runs):
# Tool(name='get_weather', description='A simple function to get the current weather for a location.',
# parameters={
# 'type': 'object',
# 'properties': {
#     'city': {'type': 'string', 'description': 'the city for which to get the weather', 'default': 'Munich'},
#     'unit': {
#         'type': 'string',
#         'enum': ['Celsius', 'Fahrenheit'],
#         'description': 'the unit for the temperature',
#         'default': 'Celsius',
#     },
#     }
# },
# function=<function get_weather at 0x7f7b3a8a9b80>)
```
 357  
 358  **Parameters:**
 359  
- **function** (<code>Callable | None</code>) – The function to decorate (when used without parameters).
- **name** (<code>str | None</code>) – Optional custom name for the tool.
- **description** (<code>str | None</code>) – Optional custom description.
 363  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 364    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 365  - **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
 366    If the source is provided only the specified output key is sent to the handler.
 367    Example:
 368  
 369  ```python
 370  {
 371      "documents": {"source": "docs", "handler": custom_handler}
 372  }
 373  ```
 374  
 375  If the source is omitted the whole tool result is sent to the handler.
 376  Example:
 377  
 378  ```python
 379  {
 380      "documents": {"handler": custom_handler}
 381  }
 382  ```
 383  
 384  - **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
 385    If not provided, the tool result is converted to a string using a default handler.
 386  
 387  `outputs_to_string` supports two formats:
 388  
 389  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 390  
 391     ```python
 392     {
 393         "source": "docs", "handler": format_documents, "raw_result": False
 394     }
 395     ```
 396  
 397     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
 398       tool result is sent to the handler.
 399     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 400       final result.
 401     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
 402       if provided. This is intended for tools that return images. In this mode, the Tool function or the
 403       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
 404       Generators.
 405  
 406  1. Multiple output format - map keys to individual configurations:
 407  
 408     ```python
 409     {
 410         "formatted_docs": {"source": "docs", "handler": format_documents},
 411         "summary": {"source": "summary_text", "handler": str.upper}
 412     }
 413     ```
 414  
 415     Each key maps to a dictionary that can contain "source" and/or "handler".
 416     Note that `raw_result` is not supported in the multiple output format.
 417  
 418  **Returns:**
 419  
- <code>Tool | Callable\[\[Callable\], Tool\]</code> – Either a Tool instance or a decorator function that will create one.
 421  
 422  ## pipeline_tool
 423  
 424  ### PipelineTool
 425  
 426  Bases: <code>ComponentTool</code>
 427  
 428  A Tool that wraps Haystack Pipelines, allowing them to be used as tools by LLMs.
 429  
 430  PipelineTool automatically generates LLM-compatible tool schemas from pipeline input sockets,
 431  which are derived from the underlying components in the pipeline.
 432  
 433  Key features:
 434  
 435  - Automatic LLM tool calling schema generation from pipeline inputs
 436  - Description extraction of pipeline inputs based on the underlying component docstrings
 437  
To use PipelineTool, you first need a Haystack pipeline.
Below is an example of creating a PipelineTool.
 440  
#### Usage example
 442  
 443  ```python
 444  from haystack import Document, Pipeline
 445  from haystack.dataclasses import ChatMessage
 446  from haystack.document_stores.in_memory import InMemoryDocumentStore
 447  from haystack.components.embedders.sentence_transformers_text_embedder import SentenceTransformersTextEmbedder
 448  from haystack.components.embedders.sentence_transformers_document_embedder import (
 449      SentenceTransformersDocumentEmbedder
 450  )
 451  from haystack.components.generators.chat import OpenAIChatGenerator
 452  from haystack.components.retrievers import InMemoryEmbeddingRetriever
 453  from haystack.components.agents import Agent
 454  from haystack.tools import PipelineTool
 455  
 456  # Initialize a document store and add some documents
 457  document_store = InMemoryDocumentStore()
 458  document_embedder = SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
 459  documents = [
 460      Document(content="Nikola Tesla was a Serbian-American inventor and electrical engineer."),
 461      Document(
 462          content="He is best known for his contributions to the design of the modern alternating current (AC) "
 463                  "electricity supply system."
 464      ),
 465  ]
 466  docs_with_embeddings = document_embedder.run(documents=documents)["documents"]
 467  document_store.write_documents(docs_with_embeddings)
 468  
 469  # Build a simple retrieval pipeline
 470  retrieval_pipeline = Pipeline()
 471  retrieval_pipeline.add_component(
 472      "embedder", SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
 473  )
 474  retrieval_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
 475  
 476  retrieval_pipeline.connect("embedder.embedding", "retriever.query_embedding")
 477  
 478  # Wrap the pipeline as a tool
 479  retriever_tool = PipelineTool(
 480      pipeline=retrieval_pipeline,
 481      input_mapping={"query": ["embedder.text"]},
 482      output_mapping={"retriever.documents": "documents"},
 483      name="document_retriever",
 484      description="For any questions about Nikola Tesla, always use this tool",
 485  )
 486  
 487  # Create an Agent with the tool
 488  agent = Agent(
 489      chat_generator=OpenAIChatGenerator(model="gpt-4.1-mini"),
 490      tools=[retriever_tool]
 491  )
 492  
 493  # Let the Agent handle a query
 494  result = agent.run([ChatMessage.from_user("Who was Nikola Tesla?")])
 495  
 496  # Print result of the tool call
 497  print("Tool Call Result:")
 498  print(result["messages"][2].tool_call_result.result)
 499  print("")
 500  
 501  # Print answer
 502  print("Answer:")
 503  print(result["messages"][-1].text)
 504  ```
 505  
 506  #### __init__
 507  
 508  ```python
 509  __init__(
 510      pipeline: Pipeline | AsyncPipeline,
 511      *,
 512      name: str,
 513      description: str,
 514      input_mapping: dict[str, list[str]] | None = None,
 515      output_mapping: dict[str, str] | None = None,
 516      parameters: dict[str, Any] | None = None,
 517      outputs_to_string: dict[str, str | Callable[[Any], str]] | None = None,
 518      inputs_from_state: dict[str, str] | None = None,
 519      outputs_to_state: dict[str, dict[str, str | Callable]] | None = None
 520  ) -> None
 521  ```
 522  
 523  Create a Tool instance from a Haystack pipeline.
 524  
 525  **Parameters:**
 526  
 527  - **pipeline** (<code>Pipeline | AsyncPipeline</code>) – The Haystack pipeline to wrap as a tool.
 528  - **name** (<code>str</code>) – Name of the tool.
 529  - **description** (<code>str</code>) – Description of the tool.
 530  - **input_mapping** (<code>dict\[str, list\[str\]\] | None</code>) – A dictionary mapping component input names to pipeline input socket paths.
 531    If not provided, a default input mapping will be created based on all pipeline inputs.
 532    Example:
 533  
 534  ```python
 535  input_mapping={
 536      "query": ["retriever.query", "prompt_builder.query"],
 537  }
 538  ```
 539  
 540  - **output_mapping** (<code>dict\[str, str\] | None</code>) – A dictionary mapping pipeline output socket paths to component output names.
 541    If not provided, a default output mapping will be created based on all pipeline outputs.
 542    Example:
 543  
 544  ```python
 545  output_mapping={
 546      "retriever.documents": "documents",
 547      "generator.replies": "replies",
 548  }
 549  ```
 550  
 551  - **parameters** (<code>dict\[str, Any\] | None</code>) – A JSON schema defining the parameters expected by the Tool.
 552    Will fall back to the parameters defined in the component's run method signature if not provided.
- **outputs_to_string** (<code>dict\[str, str | Callable\[\[Any\], str\]\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
 554    If not provided, the tool result is converted to a string using a default handler.
 555  
 556  `outputs_to_string` supports two formats:
 557  
 558  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 559  
 560     ```python
 561     {
 562         "source": "docs", "handler": format_documents, "raw_result": False
 563     }
 564     ```
 565  
 566     - `source`: If provided, only the specified output key is sent to the handler.
 567     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 568       final result.
 569     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the
 570       `handler` if provided. This is intended for tools that return images. In this mode, the Tool
 571       function or the `handler` function must return a list of `TextContent`/`ImageContent` objects to
 572       ensure compatibility with Chat Generators.
 573  
 574  1. Multiple output format - map keys to individual configurations:
 575  
 576     ```python
 577     {
 578         "formatted_docs": {"source": "docs", "handler": format_documents},
 579         "summary": {"source": "summary_text", "handler": str.upper}
 580     }
 581     ```
 582  
 583     Each key maps to a dictionary that can contain "source" and/or "handler".
 584     Note that `raw_result` is not supported in the multiple output format.
 585  
 586  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 587    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 588  - **outputs_to_state** (<code>dict\[str, dict\[str, str | Callable\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
 589    If the source is provided only the specified output key is sent to the handler.
 590    Example:
 591  
 592  ```python
 593  {
 594      "documents": {"source": "docs", "handler": custom_handler}
 595  }
 596  ```
 597  
 598  If the source is omitted the whole tool result is sent to the handler.
 599  Example:
 600  
 601  ```python
 602  {
 603      "documents": {"handler": custom_handler}
 604  }
 605  ```
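
The roles of `input_mapping` and `output_mapping` can be pictured as two small routing steps around the pipeline run. The following sketch is illustrative only: `route_inputs` and `rename_outputs` are hypothetical helpers, and the pipeline outputs are hard-coded sample data rather than the result of a real run.

```python
from typing import Any


def route_inputs(tool_args: dict[str, Any], input_mapping: dict[str, list[str]]) -> dict[str, dict[str, Any]]:
    """Hypothetical helper: fan tool arguments out to 'component.socket' paths."""
    pipeline_inputs: dict[str, dict[str, Any]] = {}
    for arg_name, value in tool_args.items():
        for socket_path in input_mapping[arg_name]:
            component_name, socket_name = socket_path.split(".", 1)
            pipeline_inputs.setdefault(component_name, {})[socket_name] = value
    return pipeline_inputs


def rename_outputs(pipeline_outputs: dict[str, dict[str, Any]], output_mapping: dict[str, str]) -> dict[str, Any]:
    """Hypothetical helper: expose selected pipeline outputs under tool-level names."""
    tool_outputs: dict[str, Any] = {}
    for socket_path, output_name in output_mapping.items():
        component_name, socket_name = socket_path.split(".", 1)
        tool_outputs[output_name] = pipeline_outputs[component_name][socket_name]
    return tool_outputs


# One tool argument feeds two component sockets.
inputs = route_inputs(
    {"query": "Who was Nikola Tesla?"},
    {"query": ["retriever.query", "prompt_builder.query"]},
)
# Sample pipeline outputs, renamed to flat tool outputs.
outputs = rename_outputs(
    {"retriever": {"documents": ["doc-1"]}, "generator": {"replies": ["An inventor."]}},
    {"retriever.documents": "documents", "generator.replies": "replies"},
)
print(inputs)
print(outputs)
```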
 606  
 607  **Raises:**
 608  
 609  - <code>ValueError</code> – If the provided pipeline is not a valid Haystack Pipeline instance.
 610  
 611  #### to_dict
 612  
 613  ```python
 614  to_dict() -> dict[str, Any]
 615  ```
 616  
 617  Serializes the PipelineTool to a dictionary.
 618  
 619  **Returns:**
 620  
 621  - <code>dict\[str, Any\]</code> – The serialized dictionary representation of PipelineTool.
 622  
 623  #### from_dict
 624  
 625  ```python
 626  from_dict(data: dict[str, Any]) -> PipelineTool
 627  ```
 628  
 629  Deserializes the PipelineTool from a dictionary.
 630  
 631  **Parameters:**
 632  
 633  - **data** (<code>dict\[str, Any\]</code>) – The dictionary representation of PipelineTool.
 634  
 635  **Returns:**
 636  
 637  - <code>PipelineTool</code> – The deserialized PipelineTool instance.
 638  
 639  ## searchable_toolset
 640  
 641  ### SearchableToolset
 642  
 643  Bases: <code>Toolset</code>
 644  
 645  Dynamic tool discovery from large catalogs using BM25 search.
 646  
 647  This Toolset enables LLMs to discover and use tools from large catalogs through
 648  BM25-based search. Instead of exposing all tools at once (which can overwhelm the
 649  LLM context), it provides a `search_tools` bootstrap tool that allows the LLM to
 650  find and load specific tools as needed.
 651  
For very small catalogs (fewer tools than `search_threshold`), it acts as a simple passthrough,
exposing all tools directly without any discovery mechanism.
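
The discovery idea can be sketched with a basic Okapi BM25 scorer over tool descriptions plus a size check for passthrough mode. This is a simplified illustration under stated assumptions: `bm25_rank` and `visible_tools` are hypothetical helpers, the catalog is a plain name-to-description dictionary, and the tokenization and scoring details are not taken from the library.

```python
import math
from collections import Counter


def bm25_rank(query: str, documents: dict[str, str], top_k: int = 3, k1: float = 1.5, b: float = 0.75) -> list[str]:
    """Rank document names by a basic Okapi BM25 score against the query."""
    tokenized = {name: text.lower().split() for name, text in documents.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs
    # Number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized.values() for term in set(tokens))

    scores = {}
    for name, tokens in tokenized.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(tokens) / avg_len))
        scores[name] = score
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Keep only tools that matched at least one query term.
    return [name for name in ranked[:top_k] if scores[name] > 0]


def visible_tools(catalog: dict[str, str], query: str, search_threshold: int = 8, top_k: int = 3) -> list[str]:
    """Hypothetical helper: passthrough for small catalogs, search otherwise."""
    if len(catalog) < search_threshold:
        return list(catalog)
    return bm25_rank(query, catalog, top_k=top_k)


catalog = {
    "get_weather": "Get weather for a city",
    "search_web": "Search the web",
    "send_email": "Send an email message",
}
passthrough = visible_tools(catalog, "weather")
searched = visible_tools(catalog, "weather", search_threshold=2)
print(passthrough)
print(searched)
```

With the default threshold the three-tool catalog is exposed as is; lowering the threshold forces the search path, which returns only the tool whose description matches the query.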
 654  
 655  ### Usage Example
 656  
 657  ```python
 658  from haystack.components.agents import Agent
 659  from haystack.components.generators.chat import OpenAIChatGenerator
 660  from haystack.dataclasses import ChatMessage
 661  from haystack.tools import Tool, SearchableToolset
 662  
 663  # Create a catalog of tools
 664  catalog = [
 665      Tool(name="get_weather", description="Get weather for a city", ...),
 666      Tool(name="search_web", description="Search the web", ...),
 667      # ... 100s more tools
 668  ]
 669  toolset = SearchableToolset(catalog=catalog)
 670  
 671  agent = Agent(chat_generator=OpenAIChatGenerator(), tools=toolset)
 672  
 673  # The agent is initially provided only with the search_tools tool and will use it to find relevant tools.
 674  result = agent.run(messages=[ChatMessage.from_user("What's the weather in Milan?")])
 675  ```
 676  
 677  #### __init__
 678  
 679  ```python
 680  __init__(
 681      catalog: ToolsType,
 682      *,
 683      top_k: int = 3,
 684      search_threshold: int = 8,
 685      search_tool_name: str = "search_tools",
 686      search_tool_description: str | None = None,
 687      search_tool_parameters_description: dict[str, str] | None = None
 688  )
 689  ```
 690  
 691  Initialize the SearchableToolset.
 692  
 693  **Parameters:**
 694  
 695  - **catalog** (<code>ToolsType</code>) – Source of tools - a list of Tools, list of Toolsets, or a single Toolset.
 696  - **top_k** (<code>int</code>) – Default number of results for search_tools.
- **search_threshold** (<code>int</code>) – Minimum catalog size to activate search.
  If the catalog has fewer tools, the toolset acts as a passthrough (all tools visible).
  Default is 8.
 700  - **search_tool_name** (<code>str</code>) – Custom name for the bootstrap search tool. Default is "search_tools".
 701  - **search_tool_description** (<code>str | None</code>) – Custom description for the bootstrap search tool.
 702    If not provided, uses a default description.
 703  - **search_tool_parameters_description** (<code>dict\[str, str\] | None</code>) – Custom descriptions for the bootstrap search tool's parameters.
 704    Keys must be a subset of `{"tool_keywords", "k"}`.
 705    Example: `{"tool_keywords": "Keywords to find tools, e.g. 'email send'"}`
 706  
 707  #### add
 708  
 709  ```python
 710  add(tool: Tool | Toolset) -> None
 711  ```
 712  
 713  Adding new tools after initialization is not supported for SearchableToolset.
 714  
 715  #### warm_up
 716  
 717  ```python
 718  warm_up() -> None
 719  ```
 720  
 721  Prepare the toolset for use.
 722  
 723  Warms up child toolsets first (so lazy toolsets like MCPToolset can connect),
 724  then flattens the catalog, indexes it, and creates the search_tools bootstrap tool.
 725  In passthrough mode, it warms up all catalog tools directly.
 726  Must be called before using the toolset with an Agent.
 727  
 728  #### clear
 729  
 730  ```python
 731  clear() -> None
 732  ```
 733  
 734  Clear all discovered tools.
 735  
 736  This method allows resetting the toolset's discovered tools between agent runs
 737  when the same toolset instance is reused. This can be useful for long-running
 738  applications to control memory usage or to start fresh searches.
 739  
 740  #### to_dict
 741  
 742  ```python
 743  to_dict() -> dict[str, Any]
 744  ```
 745  
 746  Serialize the toolset to a dictionary.
 747  
 748  **Returns:**
 749  
 750  - <code>dict\[str, Any\]</code> – Dictionary representation of the toolset.
 751  
 752  #### from_dict
 753  
 754  ```python
 755  from_dict(data: dict[str, Any]) -> SearchableToolset
 756  ```
 757  
 758  Deserialize a toolset from a dictionary.
 759  
 760  **Parameters:**
 761  
 762  - **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the toolset.
 763  
 764  **Returns:**
 765  
 766  - <code>SearchableToolset</code> – New SearchableToolset instance.
 767  
 768  ## tool
 769  
 770  ### Tool
 771  
 772  Data class representing a Tool that Language Models can prepare a call for.
 773  
 774  Accurate definitions of the textual attributes such as `name` and `description`
 775  are important for the Language Model to correctly prepare the call.
 776  
 777  For resource-intensive operations like establishing connections to remote services or
 778  loading models, override the `warm_up()` method. This method is called before the Tool
 779  is used and should be idempotent, as it may be called multiple times during
 780  pipeline/agent setup.
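
One common way to keep `warm_up()` idempotent is to guard the expensive setup with a check on the resource it creates. The class below is a minimal sketch of that pattern. `LazyClientTool` is a hypothetical, standalone class (it does not subclass `Tool`), and the dictionary stands in for a real connection or model.

```python
from typing import Any


class LazyClientTool:
    """Hypothetical tool-like class whose warm_up() is safe to call repeatedly."""

    def __init__(self) -> None:
        self._client: Any = None
        self.connections_opened = 0

    def warm_up(self) -> None:
        # Idempotent: skip the expensive setup if it already ran.
        if self._client is not None:
            return
        self.connections_opened += 1
        self._client = {"connected": True}  # stand-in for a real connection or model

    def invoke(self, text: str) -> str:
        if self._client is None:
            raise RuntimeError("call warm_up() before invoking the tool")
        return text.upper()


lazy_tool = LazyClientTool()
lazy_tool.warm_up()
lazy_tool.warm_up()  # the second call does nothing
print(lazy_tool.connections_opened)
```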
 781  
 782  **Parameters:**
 783  
 784  - **name** (<code>str</code>) – Name of the Tool.
 785  - **description** (<code>str</code>) – Description of the Tool.
 786  - **parameters** (<code>dict\[str, Any\]</code>) – A JSON schema defining the parameters expected by the Tool.
 787  - **function** (<code>Callable</code>) – The function that will be invoked when the Tool is called.
 788    Must be a synchronous function; async functions are not supported.
 789  - **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
 790    If not provided, the tool result is converted to a string using a default handler.
 791  
 792  `outputs_to_string` supports two formats:
 793  
 794  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 795  
 796     ```python
 797     {
 798         "source": "docs", "handler": format_documents, "raw_result": False
 799     }
 800     ```
 801  
 802     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
 803       tool result is sent to the handler.
 804     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 805       final result.
   - `raw_result`: If `True`, the result is returned without string conversion; the `handler` is still applied
     if provided. This is intended for tools that return images. In this mode, the Tool function or the
     `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
     Generators.
 810  
 811  1. Multiple output format - map keys to individual configurations:
 812  
 813     ```python
 814     {
 815         "formatted_docs": {"source": "docs", "handler": format_documents},
 816         "summary": {"source": "summary_text", "handler": str.upper}
 817     }
 818     ```
 819  
 820     Each key maps to a dictionary that can contain "source" and/or "handler".
 821     Note that `raw_result` is not supported in the multiple output format.
 822  
 823  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 824    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 825  - **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
  If the source is provided, only the specified output key is sent to the handler.
 827    Example:
 828  
 829  ```python
 830  {
 831      "documents": {"source": "docs", "handler": custom_handler}
 832  }
 833  ```
 834  
If the source is omitted, the whole tool result is sent to the handler.
 836  Example:
 837  
 838  ```python
 839  {
 840      "documents": {"handler": custom_handler}
 841  }
 842  ```
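
The `source` / `handler` / `raw_result` rules described above for the single output format can be sketched in plain Python. This is an illustration of the documented behavior only, not Haystack's implementation: `render_output` and `format_documents` are hypothetical names, and `str()` stands in for the default handler.

```python
from typing import Any, Optional


def render_output(result: Any, config: Optional[dict] = None) -> Any:
    """Illustrative stand-in: apply a single-output `outputs_to_string` config to a tool result."""
    config = config or {}
    # With "source", only that key of the result reaches the handler;
    # without it, the whole result does.
    value = result[config["source"]] if "source" in config else result
    handler = config.get("handler")
    if handler is not None:
        value = handler(value)
    # With raw_result=True the value is returned without string conversion.
    if config.get("raw_result", False):
        return value
    # str() is an assumption standing in for the default conversion.
    return value if isinstance(value, str) else str(value)


def format_documents(docs: list) -> str:
    """Hypothetical handler: render each document on its own bullet line."""
    return "\n".join(f"- {doc}" for doc in docs)


result = {"docs": ["a", "b"], "summary_text": "two docs"}
print(render_output(result, {"source": "docs", "handler": format_documents}))
```

Keeping the handler a plain function of one argument makes it easy to test in isolation before wiring it into a Tool.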
 843  
 844  **Raises:**
 845  
 846  - <code>ValueError</code> – If `function` is async, if `parameters` is not a valid JSON schema, or if the
 847    `outputs_to_state`, `outputs_to_string`, or `inputs_from_state` configurations are invalid.
 848  - <code>TypeError</code> – If any configuration value in `outputs_to_state`, `outputs_to_string`, or
 849    `inputs_from_state` has the wrong type.
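
Since an async `function` is rejected with a `ValueError`, calling code may want to catch that case before constructing a Tool. A minimal sketch using only the standard library follows; `ensure_sync` is a hypothetical helper and not part of Haystack.

```python
import inspect
from typing import Any, Callable


def ensure_sync(function: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical helper: reject coroutine functions before they reach a Tool."""
    if inspect.iscoroutinefunction(function):
        raise ValueError(f"{function.__name__} is async; wrap it in a synchronous function first")
    return function


def add(a: int, b: int) -> int:
    return a + b


async def fetch(url: str) -> str:
    return url


ensure_sync(add)  # a synchronous function passes through unchanged

try:
    ensure_sync(fetch)
except ValueError as err:
    print(err)
```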
 850  
 851  #### tool_spec
 852  
 853  ```python
 854  tool_spec: dict[str, Any]
 855  ```
 856  
 857  Return the Tool specification to be used by the Language Model.
 858  
 859  #### warm_up
 860  
 861  ```python
 862  warm_up() -> None
 863  ```
 864  
 865  Prepare the Tool for use.
 866  
 867  Override this method to establish connections to remote services, load models,
 868  or perform other resource-intensive initialization. This method should be idempotent,
 869  as it may be called multiple times.
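
One common way to keep `warm_up()` idempotent is to return early once the resource exists. The sketch below uses a plain stand-in class rather than a real `Tool` subclass; `EmbeddingLookup`, `_load_model`, and `load_count` are hypothetical names used only to make the pattern observable.

```python
class EmbeddingLookup:
    """Stand-in for a Tool subclass; only the warm_up pattern matters here."""

    def __init__(self) -> None:
        self._model = None
        self.load_count = 0

    def _load_model(self) -> dict:
        # Placeholder for an expensive step such as loading weights or opening a connection.
        self.load_count += 1
        return {"ready": True}

    def warm_up(self) -> None:
        # Return early when the resource already exists, so repeated calls are no-ops.
        if self._model is not None:
            return
        self._model = self._load_model()


lookup = EmbeddingLookup()
lookup.warm_up()
lookup.warm_up()  # second call does not reload
print(lookup.load_count)  # 1
```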
 870  
 871  #### invoke
 872  
 873  ```python
 874  invoke(**kwargs: Any) -> Any
 875  ```
 876  
 877  Invoke the Tool with the provided keyword arguments.
 878  
 879  #### to_dict
 880  
 881  ```python
 882  to_dict() -> dict[str, Any]
 883  ```
 884  
 885  Serializes the Tool to a dictionary.
 886  
 887  **Returns:**
 888  
 889  - <code>dict\[str, Any\]</code> – Dictionary with serialized data.
 890  
 891  #### from_dict
 892  
 893  ```python
 894  from_dict(data: dict[str, Any]) -> Tool
 895  ```
 896  
 897  Deserializes the Tool from a dictionary.
 898  
 899  **Parameters:**
 900  
 901  - **data** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.
 902  
 903  **Returns:**
 904  
 905  - <code>Tool</code> – Deserialized Tool.
 906  
 907  ## toolset
 908  
 909  ### Toolset
 910  
 911  A collection of related Tools that can be used and managed as a cohesive unit.
 912  
 913  Toolset serves two main purposes:
 914  
 915  1. Group related tools together:
 916     Toolset allows you to organize related tools into a single collection, making it easier
 917     to manage and use them as a unit in Haystack pipelines.
 918  
 919     Example:
 920  
 921     ```python
 922     from haystack.tools import Tool, Toolset
 923     from haystack.components.tools import ToolInvoker
 924  
 925     # Define math functions
 926     def add_numbers(a: int, b: int) -> int:
 927         return a + b
 928  
 929     def subtract_numbers(a: int, b: int) -> int:
 930         return a - b
 931  
 932     # Create tools with proper schemas
 933     add_tool = Tool(
 934         name="add",
 935         description="Add two numbers",
 936         parameters={
 937             "type": "object",
 938             "properties": {
 939                 "a": {"type": "integer"},
 940                 "b": {"type": "integer"}
 941             },
 942             "required": ["a", "b"]
 943         },
 944         function=add_numbers
 945     )
 946  
 947     subtract_tool = Tool(
 948         name="subtract",
 949         description="Subtract b from a",
 950         parameters={
 951             "type": "object",
 952             "properties": {
 953                 "a": {"type": "integer"},
 954                 "b": {"type": "integer"}
 955             },
 956             "required": ["a", "b"]
 957         },
 958         function=subtract_numbers
 959     )
 960  
 961     # Create a toolset with the math tools
 962     math_toolset = Toolset([add_tool, subtract_tool])
 963  
 964     # Use the toolset with a ToolInvoker or ChatGenerator component
 965     invoker = ToolInvoker(tools=math_toolset)
 966     ```
 967  
 968  1. Base class for dynamic tool loading:
 969     By subclassing Toolset, you can create implementations that dynamically load tools
 970     from external sources like OpenAPI URLs, MCP servers, or other resources.
 971  
 972     Example:
 973  
 974     ```python
 975     from haystack.core.serialization import generate_qualified_class_name
 976     from haystack.tools import Tool, Toolset
 977     from haystack.components.tools import ToolInvoker
 978  
 979     class CalculatorToolset(Toolset):
 980         '''A toolset for calculator operations.'''
 981  
 982         def __init__(self):
 983             tools = self._create_tools()
 984             super().__init__(tools)
 985  
 986         def _create_tools(self):
 987             # These Tool instances are obviously defined statically and for illustration purposes only.
 988             # In a real-world scenario, you would dynamically load tools from an external source here.
 989             tools = []
 990             add_tool = Tool(
 991                 name="add",
 992                 description="Add two numbers",
 993                 parameters={
 994                     "type": "object",
 995                     "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
 996                     "required": ["a", "b"],
 997                 },
 998                 function=lambda a, b: a + b,
 999             )
1000  
1001             multiply_tool = Tool(
1002                 name="multiply",
1003                 description="Multiply two numbers",
1004                 parameters={
1005                     "type": "object",
1006                     "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
1007                     "required": ["a", "b"],
1008                 },
1009                 function=lambda a, b: a * b,
1010             )
1011  
1012             tools.append(add_tool)
1013             tools.append(multiply_tool)
1014  
1015             return tools
1016  
1017         def to_dict(self):
1018             return {
1019                 "type": generate_qualified_class_name(type(self)),
1020                 "data": {},  # no data to serialize as we define the tools dynamically
1021             }
1022  
1023         @classmethod
1024         def from_dict(cls, data):
1025             return cls()  # Recreate the tools dynamically during deserialization
1026  
1027     # Create the dynamic toolset and use it with ToolInvoker
1028     calculator_toolset = CalculatorToolset()
1029     invoker = ToolInvoker(tools=calculator_toolset)
1030     ```
1031  
Toolset implements the collection interface (`__iter__`, `__contains__`, `__len__`, `__getitem__`),
1033  making it behave like a list of Tools. This makes it compatible with components that expect
1034  iterable tools, such as ToolInvoker or Haystack chat generators.
1035  
1036  When implementing a custom Toolset subclass for dynamic tool loading:
1037  
- Perform the dynamic loading in the `__init__` method
- Override the `to_dict()` and `from_dict()` methods if your tools are defined dynamically
1040  - Serialize endpoint descriptors rather than tool instances if your tools
1041    are loaded from external sources
1042  
1043  #### warm_up
1044  
1045  ```python
1046  warm_up() -> None
1047  ```
1048  
1049  Prepare the Toolset for use.
1050  
1051  By default, this method iterates through and warms up all tools in the Toolset.
1052  Subclasses can override this method to customize initialization behavior, such as:
1053  
1054  - Setting up shared resources (database connections, HTTP sessions) instead of
1055    warming individual tools
1056  - Implementing custom initialization logic for dynamically loaded tools
1057  - Controlling when and how tools are initialized
1058  
1059  For example, a Toolset that manages tools from an external service (like MCPToolset)
1060  might override this to initialize a shared connection rather than warming up
1061  individual tools:
1062  
1063  ```python
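class MCPToolset(Toolset):
    def warm_up(self) -> None:
        # Only warm up the shared MCP connection, not individual tools.
        # Skip the work if a connection already exists, so repeated calls are safe.
        if getattr(self, "mcp_connection", None) is None:
            self.mcp_connection = establish_connection(self.server_url)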
1068  ```
1069  
1070  This method should be idempotent, as it may be called multiple times.
1071  
1072  #### add
1073  
1074  ```python
1075  add(tool: Union[Tool, Toolset]) -> None
1076  ```
1077  
1078  Add a new Tool or merge another Toolset.
1079  
1080  **Parameters:**
1081  
1082  - **tool** (<code>Union\[Tool, Toolset\]</code>) – A Tool instance or another Toolset to add
1083  
1084  **Raises:**
1085  
1086  - <code>ValueError</code> – If adding the tool would result in duplicate tool names
1087  - <code>TypeError</code> – If the provided object is not a Tool or Toolset
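
Because `add` raises `ValueError` on duplicate tool names, it can help to check for name collisions before merging two collections. A small sketch, assuming you have the tool names available as plain strings; `duplicate_names` is a hypothetical helper, not a Haystack function.

```python
from collections import Counter


def duplicate_names(names: list) -> list:
    """Hypothetical helper: return every tool name that occurs more than once."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


existing = ["add", "subtract"]
incoming = ["multiply", "add"]
print(duplicate_names(existing + incoming))  # ['add']
```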
1088  
1089  #### to_dict
1090  
1091  ```python
1092  to_dict() -> dict[str, Any]
1093  ```
1094  
1095  Serialize the Toolset to a dictionary.
1096  
1097  **Returns:**
1098  
1099  - <code>dict\[str, Any\]</code> – A dictionary representation of the Toolset
1100  
Note for subclass implementers:
The default implementation suits Toolsets whose Tools are resolved statically. If your subclass resolves Tool
instances dynamically from an external source, such as an MCP server, an OpenAPI URL, or a local OpenAPI
specification, consider serializing the endpoint descriptor instead of the Tool instances themselves. This keeps
the Toolset dynamic and avoids the overhead of serializing a potentially large collection of Tool objects.
Deserialization then rebuilds the Tools from the source's current state, so Tools that were modified or removed
after serialization are not restored from stale data. Serializing the Tool instances directly risks loading
outdated or incorrect Tool configurations, which can cause errors or unexpected behavior.
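
The descriptor-based approach can be sketched without Haystack. Everything below is illustrative: `RemoteToolsetSketch` is a stand-in for a Toolset subclass, `FAKE_REMOTE` and `load_tool_names` are hypothetical placeholders for a remote source and its loader, and the URL is made up.

```python
from typing import Any

# Hypothetical registry standing in for a remote source (MCP server, OpenAPI URL, ...).
FAKE_REMOTE = {"https://example.com/openapi.json": ["add", "multiply"]}


def load_tool_names(url: str) -> list:
    """Placeholder loader; a real one would fetch and parse the remote specification."""
    return list(FAKE_REMOTE[url])


class RemoteToolsetSketch:
    """Stand-in for a Toolset subclass that resolves its tools from a URL."""

    def __init__(self, url: str) -> None:
        self.url = url
        self.tool_names = load_tool_names(url)

    def to_dict(self) -> dict:
        # Only the descriptor is stored; the tools themselves are not serialized.
        return {"type": f"{type(self).__module__}.{type(self).__name__}", "data": {"url": self.url}}

    @classmethod
    def from_dict(cls, data: dict) -> "RemoteToolsetSketch":
        # Re-resolving from the descriptor picks up whatever the source exposes now.
        return cls(url=data["data"]["url"])


toolset = RemoteToolsetSketch("https://example.com/openapi.json")
payload = toolset.to_dict()

# The remote source gains a tool after serialization.
FAKE_REMOTE["https://example.com/openapi.json"].append("divide")

restored = RemoteToolsetSketch.from_dict(payload)
print(restored.tool_names)  # ['add', 'multiply', 'divide']
```

The serialized payload stays small and the restored instance reflects the source as it is at load time, which is the trade-off the note above describes.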
1111  
1112  #### from_dict
1113  
1114  ```python
1115  from_dict(data: dict[str, Any]) -> Toolset
1116  ```
1117  
1118  Deserialize a Toolset from a dictionary.
1119  
1120  **Parameters:**
1121  
1122  - **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the Toolset
1123  
1124  **Returns:**
1125  
1126  - <code>Toolset</code> – A new Toolset instance