   1  ---
   2  title: "Tools"
   3  id: tools-api
   4  description: "Unified abstractions to represent tools across the framework."
   5  slug: "/tools-api"
   6  ---
   7  
   8  
   9  ## component_tool
  10  
  11  ### ComponentTool
  12  
  13  Bases: <code>Tool</code>
  14  
  15  A Tool that wraps Haystack components, allowing them to be used as tools by LLMs.
  16  
  17  ComponentTool automatically generates LLM-compatible tool schemas from component input sockets,
  18  which are derived from the component's `run` method signature and type hints.
  19  
  20  Key features:
  21  
  22  - Automatic LLM tool calling schema generation from component input sockets
  23  - Type conversion and validation for component inputs
  24  - Support for types:
  25    - Dataclasses
  26    - Lists of dataclasses
  27    - Basic types (str, int, float, bool, dict)
  28    - Lists of basic types
  29  - Automatic name generation from component class name
  30  - Description extraction from component docstrings
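The automatic name generation listed above can be pictured with a small CamelCase-to-snake_case helper. This is a minimal sketch: `to_snake_case` is a hypothetical function written for illustration, and Haystack's own conversion may treat edge cases differently.

```python
import re


def to_snake_case(class_name: str) -> str:
    """Convert a CamelCase class name to snake_case (illustrative helper, not Haystack's own)."""
    # Insert "_" at lowercase/digit -> uppercase boundaries, and before the
    # last capital of an acronym run that is followed by a lowercase letter.
    with_underscores = re.sub(
        r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])", "_", class_name
    )
    return with_underscores.lower()


print(to_snake_case("SerperDevWebSearch"))  # serper_dev_web_search
```

The result matches the default name quoted for `SerperDevWebSearch` in the usage example in this section.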
  31  
  32  To use ComponentTool, you first need a Haystack component - either an existing one or a new one you create.
  33  You can create a ComponentTool from the component by passing the component to the ComponentTool constructor.
  34  Below is an example of creating a ComponentTool from an existing SerperDevWebSearch component.
  35  
### Usage example
  37  
  38  ```python
  39  from haystack import component, Pipeline
  40  from haystack.tools import ComponentTool
  41  from haystack.components.websearch import SerperDevWebSearch
  42  from haystack.utils import Secret
  43  from haystack.components.tools.tool_invoker import ToolInvoker
  44  from haystack.components.generators.chat import OpenAIChatGenerator
  45  from haystack.dataclasses import ChatMessage
  46  
  47  # Create a SerperDev search component
  48  search = SerperDevWebSearch(api_key=Secret.from_env_var("SERPERDEV_API_KEY"), top_k=3)
  49  
  50  # Create a tool from the component
  51  tool = ComponentTool(
  52      component=search,
  53      name="web_search",  # Optional: defaults to "serper_dev_web_search"
  54      description="Search the web for current information on any topic"  # Optional: defaults to component docstring
  55  )
  56  
  57  # Create pipeline with OpenAIChatGenerator and ToolInvoker
  58  pipeline = Pipeline()
  59  pipeline.add_component("llm", OpenAIChatGenerator(tools=[tool]))
  60  pipeline.add_component("tool_invoker", ToolInvoker(tools=[tool]))
  61  
  62  # Connect components
  63  pipeline.connect("llm.replies", "tool_invoker.messages")
  64  
  65  message = ChatMessage.from_user("Use the web search tool to find information about Nikola Tesla")
  66  
  67  # Run pipeline
  68  result = pipeline.run({"llm": {"messages": [message]}})
  69  
  70  print(result)
  71  ```
  72  
  73  #### __init__
  74  
  75  ```python
  76  __init__(
  77      component: Component,
  78      name: str | None = None,
  79      description: str | None = None,
  80      parameters: dict[str, Any] | None = None,
  81      *,
  82      outputs_to_string: dict[str, str | Callable[[Any], str]] | None = None,
  83      inputs_from_state: dict[str, str] | None = None,
  84      outputs_to_state: dict[str, dict[str, str | Callable]] | None = None
  85  ) -> None
  86  ```
  87  
  88  Create a Tool instance from a Haystack component.
  89  
  90  **Parameters:**
  91  
  92  - **component** (<code>Component</code>) – The Haystack component to wrap as a tool.
  93  - **name** (<code>str | None</code>) – Optional name for the tool (defaults to snake_case of component class name).
  94  - **description** (<code>str | None</code>) – Optional description (defaults to component's docstring).
  95  - **parameters** (<code>dict\[str, Any\] | None</code>) – A JSON schema defining the parameters expected by the Tool.
  96    Will fall back to the parameters defined in the component's run method signature if not provided.
- **outputs_to_string** (<code>dict\[str, str | Callable\[\[Any\], str\]\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.
  99  
 100  `outputs_to_string` supports two formats:
 101  
 102  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 103  
 104     ```python
 105     {
 106         "source": "docs", "handler": format_documents, "raw_result": False
 107     }
 108     ```
 109  
 110     - `source`: If provided, only the specified output key is sent to the handler.
 111     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 112       final result.
 113     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the
 114       `handler` if provided. This is intended for tools that return images. In this mode, the Tool
 115       function or the `handler` function must return a list of `TextContent`/`ImageContent` objects to
 116       ensure compatibility with Chat Generators.
 117  
 118  1. Multiple output format - map keys to individual configurations:
 119  
 120     ```python
 121     {
 122         "formatted_docs": {"source": "docs", "handler": format_documents},
 123         "summary": {"source": "summary_text", "handler": str.upper}
 124     }
 125     ```
 126  
 127     Each key maps to a dictionary that can contain "source" and/or "handler".
 128     Note that `raw_result` is not supported in the multiple output format.
 129  
 130  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 131    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 132  - **outputs_to_state** (<code>dict\[str, dict\[str, str | Callable\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
 133    If the source is provided only the specified output key is sent to the handler.
 134    Example:
 135  
 136  ```python
 137  {
 138      "documents": {"source": "docs", "handler": custom_handler}
 139  }
 140  ```
 141  
 142  If the source is omitted the whole tool result is sent to the handler.
 143  Example:
 144  
 145  ```python
 146  {
 147      "documents": {"handler": custom_handler}
 148  }
 149  ```
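The two state mappings described above can be pictured with plain dictionaries. The sketch below is not Haystack code: `run_with_state`, `list_files`, and the body of `custom_handler` are hypothetical, state is modelled as a plain `dict`, and two behaviours are assumptions made for the sketch (arguments supplied by the LLM take precedence over state values, and new state values overwrite old ones).

```python
from typing import Any, Callable


def custom_handler(value: Any) -> Any:
    """Hypothetical handler: keep at most the first two items of a list."""
    return value[:2] if isinstance(value, list) else value


def run_with_state(
    function: Callable[..., dict[str, Any]],
    llm_arguments: dict[str, Any],
    state: dict[str, Any],
    inputs_from_state: dict[str, str],
    outputs_to_state: dict[str, dict[str, Any]],
) -> dict[str, Any]:
    """Illustrative stand-in for the state handling around one tool call."""
    # inputs_from_state: copy state values into parameters the LLM did not supply (assumed precedence).
    arguments = dict(llm_arguments)
    for state_key, parameter_name in inputs_from_state.items():
        if state_key in state and parameter_name not in arguments:
            arguments[parameter_name] = state[state_key]

    result = function(**arguments)

    # outputs_to_state: pick `source` if given, otherwise pass the whole result to the handler.
    new_state = dict(state)
    for state_key, config in outputs_to_state.items():
        value = result[config["source"]] if "source" in config else result
        handler = config.get("handler")
        new_state[state_key] = handler(value) if handler else value
    return new_state


def list_files(repo: str, pattern: str) -> dict[str, Any]:
    """Hypothetical tool function returning a dict of named outputs."""
    return {"docs": [f"{repo}/a{pattern}", f"{repo}/b{pattern}", f"{repo}/c{pattern}"], "count": 3}


state = {"repository": "deepset-ai/haystack"}
new_state = run_with_state(
    list_files,
    llm_arguments={"pattern": ".py"},
    state=state,
    inputs_from_state={"repository": "repo"},
    outputs_to_state={"documents": {"source": "docs", "handler": custom_handler}},
)
print(new_state["documents"])
# ['deepset-ai/haystack/a.py', 'deepset-ai/haystack/b.py']
```

Here the state's "repository" value reaches the function as `repo`, and only the `docs` output is passed through `custom_handler` before being stored under "documents".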
 150  
 151  **Raises:**
 152  
 153  - <code>TypeError</code> – If the object passed is not a Haystack Component instance.
 154  - <code>ValueError</code> – If the component has already been added to a pipeline, or if schema generation fails.
 155  
 156  #### warm_up
 157  
 158  ```python
 159  warm_up() -> None
 160  ```
 161  
 162  Prepare the ComponentTool for use.
 163  
 164  #### to_dict
 165  
 166  ```python
 167  to_dict() -> dict[str, Any]
 168  ```
 169  
 170  Serializes the ComponentTool to a dictionary.
 171  
 172  #### from_dict
 173  
 174  ```python
 175  from_dict(data: dict[str, Any]) -> ComponentTool
 176  ```
 177  
 178  Deserializes the ComponentTool from a dictionary.
 179  
 180  ## from_function
 181  
 182  ### create_tool_from_function
 183  
 184  ```python
 185  create_tool_from_function(
 186      function: Callable,
 187      name: str | None = None,
 188      description: str | None = None,
 189      inputs_from_state: dict[str, str] | None = None,
 190      outputs_to_state: dict[str, dict[str, Any]] | None = None,
 191      outputs_to_string: dict[str, Any] | None = None,
 192  ) -> Tool
 193  ```
 194  
 195  Create a Tool instance from a function.
 196  
 197  Allows customizing the Tool name and description.
 198  For simpler use cases, consider using the `@tool` decorator.
 199  
 200  ### Usage example
 201  
 202  ```python
 203  from typing import Annotated, Literal
 204  from haystack.tools import create_tool_from_function
 205  
 206  def get_weather(
 207      city: Annotated[str, "the city for which to get the weather"] = "Munich",
 208      unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius"):
 209      '''A simple function to get the current weather for a location.'''
 210      return f"Weather report for {city}: 20 {unit}, sunny"
 211  
 212  tool = create_tool_from_function(get_weather)
 213  
 214  print(tool)
 215  # >>> Tool(name='get_weather', description='A simple function to get the current weather for a location.',
 216  # >>> parameters={
 217  # >>> 'type': 'object',
 218  # >>> 'properties': {
 219  # >>>     'city': {'type': 'string', 'description': 'the city for which to get the weather', 'default': 'Munich'},
 220  # >>>     'unit': {
 221  # >>>         'type': 'string',
 222  # >>>         'enum': ['Celsius', 'Fahrenheit'],
 223  # >>>         'description': 'the unit for the temperature',
 224  # >>>         'default': 'Celsius',
 225  # >>>     },
 226  # >>>     }
 227  # >>> },
 228  # >>> function=<function get_weather at 0x7f7b3a8a9b80>)
 229  ```
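The schema printed above can be approximated with the standard library alone. The following is a rough sketch of the idea, not Haystack's schema generator: `sketch_schema` and `JSON_TYPES` are hypothetical names, and only basic types, `Literal`, and `Annotated` descriptions are handled.

```python
import inspect
from typing import Annotated, Any, Callable, Literal, get_args, get_origin, get_type_hints

# Mapping from basic Python types to JSON schema type names (assumed subset).
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean", list: "array", dict: "object"}


def sketch_schema(function: Callable) -> dict[str, Any]:
    """Build a JSON-schema-like dict from a function signature (illustrative only)."""
    hints = get_type_hints(function, include_extras=True)
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(function).parameters.items():
        if name not in hints:
            raise ValueError(f"Parameter '{name}' has no type hint")
        hint, description = hints[name], None
        if get_origin(hint) is Annotated:
            # Annotated[T, "text"]: the first metadata item becomes the description.
            hint, description, *_ = get_args(hint)
        prop: dict[str, Any] = {}
        if get_origin(hint) is Literal:
            choices = list(get_args(hint))
            prop["type"] = JSON_TYPES[type(choices[0])]
            prop["enum"] = choices
        else:
            # Raises KeyError for types outside the assumed subset.
            prop["type"] = JSON_TYPES[get_origin(hint) or hint]
        if description is not None:
            prop["description"] = description
        if param.default is inspect.Parameter.empty:
            required.append(name)
        else:
            prop["default"] = param.default
        properties[name] = prop
    schema: dict[str, Any] = {"type": "object", "properties": properties}
    if required:
        schema["required"] = required
    return schema


def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Munich",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius",
):
    """A simple function to get the current weather for a location."""
    return f"Weather report for {city}: 20 {unit}, sunny"


schema = sketch_schema(get_weather)
print(schema["properties"]["unit"]["enum"])  # ['Celsius', 'Fahrenheit']
```

The sketch also shows why type hints are mandatory: without one, there is nothing to map to a schema type.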
 230  
 231  **Parameters:**
 232  
 233  - **function** (<code>Callable</code>) – The function to be converted into a Tool.
 234    The function must include type hints for all parameters.
 235    The function is expected to have basic python input types (str, int, float, bool, list, dict, tuple).
 236    Other input types may work but are not guaranteed.
 237    If a parameter is annotated using `typing.Annotated`, its metadata will be used as parameter description.
 238  - **name** (<code>str | None</code>) – The name of the Tool. If not provided, the name of the function will be used.
 239  - **description** (<code>str | None</code>) – The description of the Tool. If not provided, the docstring of the function will be used.
 240    To intentionally leave the description empty, pass an empty string.
 241  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 242    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 243  - **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
 244    If the source is provided only the specified output key is sent to the handler.
 245    Example:
 246  
 247  ```python
 248  {
 249      "documents": {"source": "docs", "handler": custom_handler}
 250  }
 251  ```
 252  
 253  If the source is omitted the whole tool result is sent to the handler.
 254  Example:
 255  
 256  ```python
 257  {
 258      "documents": {"handler": custom_handler}
 259  }
 260  ```
 261  
 262  - **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
 263    If not provided, the tool result is converted to a string using a default handler.
 264  
 265  `outputs_to_string` supports two formats:
 266  
 267  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 268  
 269     ```python
 270     {
 271         "source": "docs", "handler": format_documents, "raw_result": False
 272     }
 273     ```
 274  
 275     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
 276       tool result is sent to the handler.
 277     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 278       final result.
 279     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
 280       if provided. This is intended for tools that return images. In this mode, the Tool function or the
 281       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
 282       Generators.
 283  
 284  1. Multiple output format - map keys to individual configurations:
 285  
 286     ```python
 287     {
 288         "formatted_docs": {"source": "docs", "handler": format_documents},
 289         "summary": {"source": "summary_text", "handler": str.upper}
 290     }
 291     ```
 292  
 293     Each key maps to a dictionary that can contain "source" and/or "handler".
 294     Note that `raw_result` is not supported in the multiple output format.
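As a rough illustration of the single output format, the sketch below applies a `source` and a `handler` to a tool result by hand. `render_single_output` is a hypothetical helper, documents are modelled as plain dicts, the `str` fallback stands in for the unspecified default handler, and `raw_result` is left out.

```python
from typing import Any, Callable


def format_documents(docs: list[dict[str, Any]]) -> str:
    """Hypothetical handler: render one bullet line per document."""
    return "\n".join(f"- {doc['content']}" for doc in docs)


def render_single_output(result: dict[str, Any], config: dict[str, Any]) -> Any:
    """Illustrative stand-in for the single output format; not Haystack's implementation."""
    # `source` narrows the result to one output key; without it the whole result is used.
    value = result[config["source"]] if "source" in config else result
    # `handler` turns the selected value into the final result; `str` is this sketch's default.
    handler: Callable[[Any], Any] = config.get("handler", str)
    return handler(value)


tool_result = {
    "docs": [{"content": "Tesla was an inventor."}, {"content": "He worked on AC power."}],
    "meta": {"took_ms": 12},
}
rendered = render_single_output(tool_result, {"source": "docs", "handler": format_documents})
print(rendered)
# - Tesla was an inventor.
# - He worked on AC power.
```

Because `source` is set to "docs", the "meta" entry never reaches the handler.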
 295  
 296  **Returns:**
 297  
 298  - <code>Tool</code> – The Tool created from the function.
 299  
 300  **Raises:**
 301  
 302  - <code>ValueError</code> – If any parameter of the function lacks a type hint.
 303  - <code>SchemaGenerationError</code> – If there is an error generating the JSON schema for the Tool.
 304  
 305  ### tool
 306  
 307  ```python
 308  tool(
 309      function: Callable | None = None,
 310      *,
 311      name: str | None = None,
 312      description: str | None = None,
 313      inputs_from_state: dict[str, str] | None = None,
 314      outputs_to_state: dict[str, dict[str, Any]] | None = None,
 315      outputs_to_string: dict[str, Any] | None = None
 316  ) -> Tool | Callable[[Callable], Tool]
 317  ```
 318  
 319  Decorator to convert a function into a Tool.
 320  
Can be used with or without parameters:

- Without parameters: place `@tool` directly above `def my_function(): ...`.
- With parameters: place `@tool(name="custom_name")` directly above `def my_function(): ...`.
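The return type `Tool | Callable[[Callable], Tool]` follows from supporting both forms. A minimal sketch of that decorator pattern, using a hypothetical `MiniTool` dataclass and `mini_tool` decorator rather than Haystack's own classes:

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass
class MiniTool:
    """Hypothetical stand-in for a Tool; holds only a name and the wrapped function."""

    name: str
    function: Callable


def mini_tool(
    function: Optional[Callable] = None, *, name: Optional[str] = None
) -> Union[MiniTool, Callable[[Callable], MiniTool]]:
    """Decorator usable as `@mini_tool` or `@mini_tool(name=...)`."""

    def decorator(func: Callable) -> MiniTool:
        return MiniTool(name=name or func.__name__, function=func)

    if function is None:
        # Called with parameters: return the decorator to be applied next.
        return decorator
    # Called without parameters: the function was passed in directly.
    return decorator(function)


@mini_tool
def get_weather() -> str:
    return "sunny"


@mini_tool(name="custom_name")
def get_time() -> str:
    return "12:00"


print(get_weather.name, get_time.name)  # get_weather custom_name
```

In both cases the decorated name ends up bound to the tool object, not to the original function.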
 327  
 328  ### Usage example
 329  
 330  ```python
 331  from typing import Annotated, Literal
 332  from haystack.tools import tool
 333  
 334  @tool
 335  def get_weather(
 336      city: Annotated[str, "the city for which to get the weather"] = "Munich",
 337      unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius"):
 338      '''A simple function to get the current weather for a location.'''
 339      return f"Weather report for {city}: 20 {unit}, sunny"
 340  
 341  print(get_weather)
 342  # >>> Tool(name='get_weather', description='A simple function to get the current weather for a location.',
 343  # >>> parameters={
 344  # >>> 'type': 'object',
 345  # >>> 'properties': {
 346  # >>>     'city': {'type': 'string', 'description': 'the city for which to get the weather', 'default': 'Munich'},
 347  # >>>     'unit': {
 348  # >>>         'type': 'string',
 349  # >>>         'enum': ['Celsius', 'Fahrenheit'],
 350  # >>>         'description': 'the unit for the temperature',
 351  # >>>         'default': 'Celsius',
 352  # >>>     },
 353  # >>>     }
 354  # >>> },
 355  # >>> function=<function get_weather at 0x7f7b3a8a9b80>)
 356  ```
 357  
 358  **Parameters:**
 359  
 360  - **function** (<code>Callable | None</code>) – The function to decorate (when used without parameters)
 361  - **name** (<code>str | None</code>) – Optional custom name for the tool
 362  - **description** (<code>str | None</code>) – Optional custom description
 363  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 364    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 365  - **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
 366    If the source is provided only the specified output key is sent to the handler.
 367    Example:
 368  
 369  ```python
 370  {
 371      "documents": {"source": "docs", "handler": custom_handler}
 372  }
 373  ```
 374  
 375  If the source is omitted the whole tool result is sent to the handler.
 376  Example:
 377  
 378  ```python
 379  {
 380      "documents": {"handler": custom_handler}
 381  }
 382  ```
 383  
 384  - **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
 385    If not provided, the tool result is converted to a string using a default handler.
 386  
 387  `outputs_to_string` supports two formats:
 388  
 389  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 390  
 391     ```python
 392     {
 393         "source": "docs", "handler": format_documents, "raw_result": False
 394     }
 395     ```
 396  
 397     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
 398       tool result is sent to the handler.
 399     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 400       final result.
 401     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
 402       if provided. This is intended for tools that return images. In this mode, the Tool function or the
 403       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
 404       Generators.
 405  
 406  1. Multiple output format - map keys to individual configurations:
 407  
 408     ```python
 409     {
 410         "formatted_docs": {"source": "docs", "handler": format_documents},
 411         "summary": {"source": "summary_text", "handler": str.upper}
 412     }
 413     ```
 414  
 415     Each key maps to a dictionary that can contain "source" and/or "handler".
 416     Note that `raw_result` is not supported in the multiple output format.
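The multiple output format can be pictured as one `source`/`handler` pair per output key. In the sketch below, `render_multiple_outputs` is a hypothetical helper that returns a dict of rendered strings. How Haystack combines the per-key results into the final tool message is not covered here.

```python
from typing import Any


def format_documents(docs: list[str]) -> str:
    """Hypothetical handler: join document titles into one line."""
    return "; ".join(docs)


def render_multiple_outputs(result: dict[str, Any], config: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Illustrative stand-in for the multiple output format; not Haystack's implementation."""
    rendered = {}
    for output_key, entry in config.items():
        # Each entry selects its own `source` and applies its own `handler`.
        value = result[entry["source"]] if "source" in entry else result
        handler = entry.get("handler", str)
        rendered[output_key] = handler(value)
    return rendered


tool_result = {"docs": ["Tesla bio", "AC power"], "summary_text": "tesla overview"}
config = {
    "formatted_docs": {"source": "docs", "handler": format_documents},
    "summary": {"source": "summary_text", "handler": str.upper},
}
rendered_outputs = render_multiple_outputs(tool_result, config)
print(rendered_outputs)
# {'formatted_docs': 'Tesla bio; AC power', 'summary': 'TESLA OVERVIEW'}
```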
 417  
 418  **Returns:**
 419  
- <code>Tool | Callable\[\[Callable\], Tool\]</code> – Either a Tool instance or a decorator function that will create one.
 421  
 422  ## pipeline_tool
 423  
 424  ### PipelineTool
 425  
 426  Bases: <code>ComponentTool</code>
 427  
 428  A Tool that wraps Haystack Pipelines, allowing them to be used as tools by LLMs.
 429  
 430  PipelineTool automatically generates LLM-compatible tool schemas from pipeline input sockets,
 431  which are derived from the underlying components in the pipeline.
 432  
 433  Key features:
 434  
 435  - Automatic LLM tool calling schema generation from pipeline inputs
 436  - Description extraction of pipeline inputs based on the underlying component docstrings
 437  
To use PipelineTool, you first need a Haystack pipeline.
The example below wraps a retrieval pipeline in a PipelineTool and passes it to an Agent.
 440  
### Usage example
 442  
 443  ```python
 444  from haystack import Document, Pipeline
 445  from haystack.dataclasses import ChatMessage
 446  from haystack.document_stores.in_memory import InMemoryDocumentStore
 447  from haystack.components.embedders.sentence_transformers_text_embedder import SentenceTransformersTextEmbedder
 448  from haystack.components.embedders.sentence_transformers_document_embedder import (
 449      SentenceTransformersDocumentEmbedder
 450  )
 451  from haystack.components.generators.chat import OpenAIChatGenerator
 452  from haystack.components.retrievers import InMemoryEmbeddingRetriever
 453  from haystack.components.agents import Agent
 454  from haystack.tools import PipelineTool
 455  
 456  # Initialize a document store and add some documents
 457  document_store = InMemoryDocumentStore()
 458  document_embedder = SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
 459  documents = [
 460      Document(content="Nikola Tesla was a Serbian-American inventor and electrical engineer."),
 461      Document(
 462          content="He is best known for his contributions to the design of the modern alternating current (AC) "
 463                  "electricity supply system."
 464      ),
 465  ]
 466  docs_with_embeddings = document_embedder.run(documents=documents)["documents"]
 467  document_store.write_documents(docs_with_embeddings)
 468  
 469  # Build a simple retrieval pipeline
 470  retrieval_pipeline = Pipeline()
 471  retrieval_pipeline.add_component(
 472      "embedder", SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
 473  )
 474  retrieval_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
 475  
 476  retrieval_pipeline.connect("embedder.embedding", "retriever.query_embedding")
 477  
 478  # Wrap the pipeline as a tool
 479  retriever_tool = PipelineTool(
 480      pipeline=retrieval_pipeline,
 481      input_mapping={"query": ["embedder.text"]},
 482      output_mapping={"retriever.documents": "documents"},
 483      name="document_retriever",
 484      description="For any questions about Nikola Tesla, always use this tool",
 485  )
 486  
 487  # Create an Agent with the tool
 488  agent = Agent(
 489      chat_generator=OpenAIChatGenerator(model="gpt-4.1-mini"),
 490      tools=[retriever_tool]
 491  )
 492  
 493  # Let the Agent handle a query
 494  result = agent.run([ChatMessage.from_user("Who was Nikola Tesla?")])
 495  
 496  # Print result of the tool call
 497  print("Tool Call Result:")
 498  print(result["messages"][2].tool_call_result.result)
 499  print("")
 500  
 501  # Print answer
 502  print("Answer:")
 503  print(result["messages"][-1].text)
 504  ```
 505  
 506  #### __init__
 507  
 508  ```python
 509  __init__(
 510      pipeline: Pipeline | AsyncPipeline,
 511      *,
 512      name: str,
 513      description: str,
 514      input_mapping: dict[str, list[str]] | None = None,
 515      output_mapping: dict[str, str] | None = None,
 516      parameters: dict[str, Any] | None = None,
 517      outputs_to_string: dict[str, str | Callable[[Any], str]] | None = None,
 518      inputs_from_state: dict[str, str] | None = None,
 519      outputs_to_state: dict[str, dict[str, str | Callable]] | None = None
 520  ) -> None
 521  ```
 522  
 523  Create a Tool instance from a Haystack pipeline.
 524  
 525  **Parameters:**
 526  
 527  - **pipeline** (<code>Pipeline | AsyncPipeline</code>) – The Haystack pipeline to wrap as a tool.
 528  - **name** (<code>str</code>) – Name of the tool.
 529  - **description** (<code>str</code>) – Description of the tool.
 530  - **input_mapping** (<code>dict\[str, list\[str\]\] | None</code>) – A dictionary mapping component input names to pipeline input socket paths.
 531    If not provided, a default input mapping will be created based on all pipeline inputs.
 532    Example:
 533  
 534  ```python
 535  input_mapping={
 536      "query": ["retriever.query", "prompt_builder.query"],
 537  }
 538  ```
 539  
 540  - **output_mapping** (<code>dict\[str, str\] | None</code>) – A dictionary mapping pipeline output socket paths to component output names.
 541    If not provided, a default output mapping will be created based on all pipeline outputs.
 542    Example:
 543  
 544  ```python
 545  output_mapping={
 546      "retriever.documents": "documents",
 547      "generator.replies": "replies",
 548  }
 549  ```
 550  
 551  - **parameters** (<code>dict\[str, Any\] | None</code>) – A JSON schema defining the parameters expected by the Tool.
 552    Will fall back to the parameters defined in the component's run method signature if not provided.
- **outputs_to_string** (<code>dict\[str, str | Callable\[\[Any\], str\]\] | None</code>) – Optional dictionary defining how tool outputs should be converted into string(s) or results.
  If not provided, the tool result is converted to a string using a default handler.
 555  
 556  `outputs_to_string` supports two formats:
 557  
 558  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 559  
 560     ```python
 561     {
 562         "source": "docs", "handler": format_documents, "raw_result": False
 563     }
 564     ```
 565  
 566     - `source`: If provided, only the specified output key is sent to the handler.
 567     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 568       final result.
 569     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the
 570       `handler` if provided. This is intended for tools that return images. In this mode, the Tool
 571       function or the `handler` function must return a list of `TextContent`/`ImageContent` objects to
 572       ensure compatibility with Chat Generators.
 573  
 574  1. Multiple output format - map keys to individual configurations:
 575  
 576     ```python
 577     {
 578         "formatted_docs": {"source": "docs", "handler": format_documents},
 579         "summary": {"source": "summary_text", "handler": str.upper}
 580     }
 581     ```
 582  
 583     Each key maps to a dictionary that can contain "source" and/or "handler".
 584     Note that `raw_result` is not supported in the multiple output format.
 585  
 586  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 587    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
 588  - **outputs_to_state** (<code>dict\[str, dict\[str, str | Callable\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state as well as optional handlers.
 589    If the source is provided only the specified output key is sent to the handler.
 590    Example:
 591  
 592  ```python
 593  {
 594      "documents": {"source": "docs", "handler": custom_handler}
 595  }
 596  ```
 597  
 598  If the source is omitted the whole tool result is sent to the handler.
 599  Example:
 600  
 601  ```python
 602  {
 603      "documents": {"handler": custom_handler}
 604  }
 605  ```
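The `input_mapping` and `output_mapping` parameters described above can be pictured as two small dictionary transformations around a pipeline run. The sketch below uses hypothetical helpers (`build_pipeline_inputs`, `rename_pipeline_outputs`) and hand-written output data. It assumes only the nested `{component: {socket: value}}` shape that `Pipeline.run` takes and returns.

```python
from typing import Any


def build_pipeline_inputs(tool_arguments: dict[str, Any], input_mapping: dict[str, list[str]]) -> dict[str, dict[str, Any]]:
    """Fan each tool argument out to every mapped "component.socket" path."""
    pipeline_inputs: dict[str, dict[str, Any]] = {}
    for parameter_name, socket_paths in input_mapping.items():
        if parameter_name not in tool_arguments:
            continue
        for path in socket_paths:
            component_name, socket_name = path.split(".", 1)
            pipeline_inputs.setdefault(component_name, {})[socket_name] = tool_arguments[parameter_name]
    return pipeline_inputs


def rename_pipeline_outputs(pipeline_outputs: dict[str, dict[str, Any]], output_mapping: dict[str, str]) -> dict[str, Any]:
    """Flatten nested pipeline outputs into the names given by the output mapping."""
    renamed: dict[str, Any] = {}
    for path, output_name in output_mapping.items():
        component_name, socket_name = path.split(".", 1)
        if socket_name in pipeline_outputs.get(component_name, {}):
            renamed[output_name] = pipeline_outputs[component_name][socket_name]
    return renamed


inputs = build_pipeline_inputs(
    {"query": "Who was Nikola Tesla?"},
    {"query": ["retriever.query", "prompt_builder.query"]},
)
print(inputs)
# {'retriever': {'query': 'Who was Nikola Tesla?'}, 'prompt_builder': {'query': 'Who was Nikola Tesla?'}}

outputs = rename_pipeline_outputs(
    {"retriever": {"documents": ["doc-1"]}, "generator": {"replies": ["An inventor."]}},
    {"retriever.documents": "documents", "generator.replies": "replies"},
)
print(outputs)
# {'documents': ['doc-1'], 'replies': ['An inventor.']}
```

A single tool parameter such as `query` can therefore feed several components at once, while the LLM sees only one flat parameter.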
 606  
 607  **Raises:**
 608  
 609  - <code>ValueError</code> – If the provided pipeline is not a valid Haystack Pipeline instance.
 610  
 611  #### to_dict
 612  
 613  ```python
 614  to_dict() -> dict[str, Any]
 615  ```
 616  
 617  Serializes the PipelineTool to a dictionary.
 618  
 619  **Returns:**
 620  
 621  - <code>dict\[str, Any\]</code> – The serialized dictionary representation of PipelineTool.
 622  
 623  #### from_dict
 624  
 625  ```python
 626  from_dict(data: dict[str, Any]) -> PipelineTool
 627  ```
 628  
 629  Deserializes the PipelineTool from a dictionary.
 630  
 631  **Parameters:**
 632  
 633  - **data** (<code>dict\[str, Any\]</code>) – The dictionary representation of PipelineTool.
 634  
 635  **Returns:**
 636  
 637  - <code>PipelineTool</code> – The deserialized PipelineTool instance.
 638  
 639  ## searchable_toolset
 640  
 641  ### SearchableToolset
 642  
 643  Bases: <code>Toolset</code>
 644  
 645  Dynamic tool discovery from large catalogs using BM25 search.
 646  
 647  This Toolset enables LLMs to discover and use tools from large catalogs through
 648  BM25-based search. Instead of exposing all tools at once (which can overwhelm the
 649  LLM context), it provides a `search_tools` bootstrap tool that allows the LLM to
 650  find and load specific tools as needed.
 651  
For small catalogs (fewer tools than `search_threshold`), the toolset acts as a simple passthrough,
exposing all tools directly without any discovery mechanism.
 654  
 655  ### Usage Example
 656  
 657  ```python
 658  from haystack.components.agents import Agent
 659  from haystack.components.generators.chat import OpenAIChatGenerator
 660  from haystack.dataclasses import ChatMessage
 661  from haystack.tools import Tool, SearchableToolset
 662  
 663  # Create a catalog of tools
 664  catalog = [
 665      Tool(name="get_weather", description="Get weather for a city",
 666           parameters={}, function=lambda: None),
 667      Tool(name="search_web", description="Search the web",
 668           parameters={}, function=lambda: None),
 669      # ... 100s more tools
 670  ]
 671  toolset = SearchableToolset(catalog=catalog)
 672  
 673  agent = Agent(chat_generator=OpenAIChatGenerator(), tools=toolset)
 674  
 675  # The agent is initially provided only with the search_tools tool and will use it to find relevant tools.
 676  result = agent.run(messages=[ChatMessage.from_user("What's the weather in Milan?")])
 677  ```
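To give a feel for BM25-based tool discovery, here is a small self-contained ranking sketch. It is not the toolset's implementation: `tokenize` and `bm25_search` are hypothetical helpers, the scoring is a common Okapi BM25 variant with assumed `k1` and `b` values, and indexing each tool by its name plus description is an assumption.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_search(query: str, documents: dict[str, str], top_k: int = 3, k1: float = 1.5, b: float = 0.75) -> list[str]:
    """Return up to `top_k` document names ranked by BM25 score (matching documents only)."""
    if not documents:
        return []
    tokenized = {name: tokenize(text) for name, text in documents.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs
    # Number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized.values() for term in set(tokens))

    scores: dict[str, float] = {}
    for name, tokens in tokenized.items():
        term_counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            tf = term_counts[term]
            if tf == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(tokens) / avg_len))
        if score > 0:
            scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Each tool is indexed by its name followed by its description (assumed layout).
catalog_texts = {
    "get_weather": "get_weather Get weather for a city",
    "search_web": "search_web Search the web",
    "send_email": "send_email Send an email to a recipient",
}
print(bm25_search("weather city", catalog_texts))  # ['get_weather']
```

Only tools that share at least one term with the query are returned, which keeps unrelated tools out of the LLM context.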
 678  
 679  #### __init__
 680  
 681  ```python
 682  __init__(
 683      catalog: ToolsType,
 684      *,
 685      top_k: int = 3,
 686      search_threshold: int = 8,
 687      search_tool_name: str = "search_tools",
 688      search_tool_description: str | None = None,
 689      search_tool_parameters_description: dict[str, str] | None = None
 690  ) -> None
 691  ```
 692  
 693  Initialize the SearchableToolset.
 694  
 695  **Parameters:**
 696  
 697  - **catalog** (<code>ToolsType</code>) – Source of tools - a list of Tools, list of Toolsets, or a single Toolset.
 698  - **top_k** (<code>int</code>) – Default number of results for search_tools.
- **search_threshold** (<code>int</code>) – Minimum catalog size to activate search.
  If the catalog has fewer tools, the toolset acts as a passthrough (all tools visible).
  Default is 8.
 702  - **search_tool_name** (<code>str</code>) – Custom name for the bootstrap search tool. Default is "search_tools".
 703  - **search_tool_description** (<code>str | None</code>) – Custom description for the bootstrap search tool.
 704    If not provided, uses a default description.
 705  - **search_tool_parameters_description** (<code>dict\[str, str\] | None</code>) – Custom descriptions for the bootstrap search tool's parameters.
 706    Keys must be a subset of `{"tool_keywords", "k"}`.
 707    Example: `{"tool_keywords": "Keywords to find tools, e.g. 'email send'"}`
 708  
 709  #### add
 710  
 711  ```python
 712  add(tool: Tool | Toolset) -> None
 713  ```
 714  
 715  Adding new tools after initialization is not supported for SearchableToolset.
 716  
 717  #### warm_up
 718  
 719  ```python
 720  warm_up() -> None
 721  ```
 722  
 723  Prepare the toolset for use.
 724  
 725  Warms up child toolsets first (so lazy toolsets like MCPToolset can connect),
 726  then flattens the catalog, indexes it, and creates the search_tools bootstrap tool.
 727  In passthrough mode, it warms up all catalog tools directly.
 728  Must be called before using the toolset with an Agent.
 729  
 730  #### clear
 731  
 732  ```python
 733  clear() -> None
 734  ```
 735  
 736  Clear all discovered tools.
 737  
 738  This method allows resetting the toolset's discovered tools between agent runs
 739  when the same toolset instance is reused. This can be useful for long-running
 740  applications to control memory usage or to start fresh searches.
 741  
 742  #### to_dict
 743  
 744  ```python
 745  to_dict() -> dict[str, Any]
 746  ```
 747  
 748  Serialize the toolset to a dictionary.
 749  
 750  **Returns:**
 751  
 752  - <code>dict\[str, Any\]</code> – Dictionary representation of the toolset.
 753  
 754  #### from_dict
 755  
 756  ```python
 757  from_dict(data: dict[str, Any]) -> SearchableToolset
 758  ```
 759  
 760  Deserialize a toolset from a dictionary.
 761  
 762  **Parameters:**
 763  
 764  - **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the toolset.
 765  
 766  **Returns:**
 767  
 768  - <code>SearchableToolset</code> – New SearchableToolset instance.
 769  
 770  ## tool
 771  
 772  ### Tool
 773  
 774  Data class representing a Tool that Language Models can prepare a call for.
 775  
 776  Accurate definitions of the textual attributes such as `name` and `description`
 777  are important for the Language Model to correctly prepare the call.
 778  
 779  For resource-intensive operations like establishing connections to remote services or
 780  loading models, override the `warm_up()` method. This method is called before the Tool
 781  is used and should be idempotent, as it may be called multiple times during
 782  pipeline/agent setup.
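One common way to keep `warm_up()` idempotent is to guard the expensive step with a check on already-initialized state. The sketch below uses a plain hypothetical class, `LazyResourceTool`, rather than a real `Tool` subclass, and a dict stands in for the remote connection or loaded model.

```python
from typing import Any, Optional


class LazyResourceTool:
    """Hypothetical class showing an idempotent warm_up(); not a Haystack Tool subclass."""

    def __init__(self) -> None:
        self._resource: Optional[dict[str, Any]] = None
        self.load_count = 0  # Counts how often the expensive step actually ran.

    def warm_up(self) -> None:
        # Return early if the resource is already available, so repeated calls are harmless.
        if self._resource is not None:
            return
        self.load_count += 1
        self._resource = {"status": "connected"}  # Placeholder for a connection or model.


lazy_tool = LazyResourceTool()
lazy_tool.warm_up()
lazy_tool.warm_up()
print(lazy_tool.load_count)  # 1
```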
 783  
 784  **Parameters:**
 785  
 786  - **name** (<code>str</code>) – Name of the Tool.
 787  - **description** (<code>str</code>) – Description of the Tool.
 788  - **parameters** (<code>dict\[str, Any\]</code>) – A JSON schema defining the parameters expected by the Tool.
 789  - **function** (<code>Callable</code>) – The function that will be invoked when the Tool is called.
 790    Must be a synchronous function; async functions are not supported.
- **outputs_to_string** (<code>dict\[str, Any\] | None</code>) – Optional dictionary defining how tool outputs are converted into strings or, with `raw_result`, returned unconverted.
  If not provided, the tool result is converted to a string using a default handler.
 793  
 794  `outputs_to_string` supports two formats:
 795  
 796  1. Single output format - use "source", "handler", and/or "raw_result" at the root level:
 797  
 798     ```python
 799     {
 800         "source": "docs", "handler": format_documents, "raw_result": False
 801     }
 802     ```
 803  
 804     - `source`: If provided, only the specified output key is sent to the handler. If not provided, the whole
 805       tool result is sent to the handler.
 806     - `handler`: A function that takes the tool output (or the extracted source value) and returns the
 807       final result.
 808     - `raw_result`: If `True`, the result is returned raw without string conversion, but applying the `handler`
 809       if provided. This is intended for tools that return images. In this mode, the Tool function or the
 810       `handler` must return a list of `TextContent`/`ImageContent` objects to ensure compatibility with Chat
 811       Generators.
 812  
2. Multiple output format - map keys to individual configurations:
 814  
 815     ```python
 816     {
 817         "formatted_docs": {"source": "docs", "handler": format_documents},
 818         "summary": {"source": "summary_text", "handler": str.upper}
 819     }
 820     ```
 821  
 822     Each key maps to a dictionary that can contain "source" and/or "handler".
 823     Note that `raw_result` is not supported in the multiple output format.
 824  
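As a rough illustration of the single output format, the sketch below applies `source`, `handler`, and `raw_result` in the order described above. It is plain Python rather than Haystack's implementation: `render_single_output` and `format_documents` are hypothetical helpers, and using `str` as the default conversion is an assumption.

```python
from typing import Any, Optional


def render_single_output(result: Any, config: Optional[dict[str, Any]] = None) -> Any:
    """Apply a single-output `outputs_to_string` style config (illustrative only)."""
    config = config or {}
    value = result
    # `source`: pick one key out of the tool result, otherwise keep the whole result.
    if "source" in config:
        value = result[config["source"]]
    # `handler`: optional post-processing of the selected value.
    handler = config.get("handler")
    if handler is not None:
        value = handler(value)
    # `raw_result`: return the value without string conversion.
    if config.get("raw_result", False):
        return value
    # Assumption: plain `str` stands in for the default string conversion.
    return value if isinstance(value, str) else str(value)


def format_documents(docs: list[str]) -> str:
    """Hypothetical handler that renders documents as a bullet list."""
    return "\n".join(f"- {doc}" for doc in docs)


result = {"docs": ["first", "second"], "summary_text": "two docs"}
formatted = render_single_output(result, {"source": "docs", "handler": format_documents})
print(formatted)
```

With `"raw_result": True` the same call would return the list itself, which is why that mode expects the tool or the handler to produce content objects directly.
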
 825  - **inputs_from_state** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping state keys to tool parameter names.
 826    Example: `{"repository": "repo"}` maps state's "repository" to tool's "repo" parameter.
- **outputs_to_state** (<code>dict\[str, dict\[str, Any\]\] | None</code>) – Optional dictionary defining how tool outputs map to keys within state, along with optional handlers.
  If the source is provided, only the specified output key is sent to the handler.
  Example:
 830  
 831  ```python
 832  {
 833      "documents": {"source": "docs", "handler": custom_handler}
 834  }
 835  ```
 836  
If the source is omitted, the whole tool result is sent to the handler.
 838  Example:
 839  
 840  ```python
 841  {
 842      "documents": {"handler": custom_handler}
 843  }
 844  ```
 845  
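The two state mappings can be pictured with a small plain-Python sketch. It does not use Haystack's state handling: `inject_state_inputs` and `store_outputs` are hypothetical helpers, the state is a plain dictionary, and overwriting a state key when no handler is given is an assumption made to keep the example short.

```python
from typing import Any


def inject_state_inputs(
    llm_args: dict[str, Any], state: dict[str, Any], inputs_from_state: dict[str, str]
) -> dict[str, Any]:
    """Fill tool parameters from state using {state_key: parameter_name}."""
    merged = dict(llm_args)
    for state_key, param_name in inputs_from_state.items():
        if state_key in state:
            merged[param_name] = state[state_key]
    return merged


def store_outputs(
    result: dict[str, Any], state: dict[str, Any], outputs_to_state: dict[str, dict[str, Any]]
) -> None:
    """Write tool outputs into state using {state_key: {"source": ..., "handler": ...}}."""
    for state_key, config in outputs_to_state.items():
        # With a source, only that output key is used; otherwise the whole result.
        value = result[config["source"]] if "source" in config else result
        handler = config.get("handler")
        # Assumption: without a handler the value simply overwrites the state key.
        state[state_key] = handler(value) if handler is not None else value


state: dict[str, Any] = {"repository": "deepset-ai/haystack"}

# State's "repository" is passed to the tool's "repo" parameter.
args = inject_state_inputs({"query": "open issues"}, state, {"repository": "repo"})

# The tool's "docs" output is post-processed and stored under "documents".
store_outputs(
    {"docs": ["a", "b"], "count": 2},
    state,
    {"documents": {"source": "docs", "handler": lambda docs: [d.upper() for d in docs]}},
)
```
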
 846  **Raises:**
 847  
 848  - <code>ValueError</code> – If `function` is async, if `parameters` is not a valid JSON schema, or if the
 849    `outputs_to_state`, `outputs_to_string`, or `inputs_from_state` configurations are invalid.
 850  - <code>TypeError</code> – If any configuration value in `outputs_to_state`, `outputs_to_string`, or
 851    `inputs_from_state` has the wrong type.
 852  
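Because async functions are rejected, it can help to check a callable in your own code before building a Tool around it. The sketch below uses the standard library's `inspect.iscoroutinefunction`; `ensure_sync` is a hypothetical helper, and whether Haystack performs its check the same way is not something this page states.

```python
import inspect
from typing import Any, Callable


def ensure_sync(function: Callable[..., Any]) -> Callable[..., Any]:
    """Raise early if `function` is a coroutine function (illustrative pre-check)."""
    if inspect.iscoroutinefunction(function):
        raise ValueError(f"{function.__name__} is async; wrap it in a synchronous function first")
    return function


def add_numbers(a: int, b: int) -> int:
    return a + b


async def fetch_numbers(a: int, b: int) -> int:
    return a + b


checked = ensure_sync(add_numbers)  # returned unchanged
```
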
 853  #### tool_spec
 854  
 855  ```python
 856  tool_spec: dict[str, Any]
 857  ```
 858  
 859  Return the Tool specification to be used by the Language Model.
 860  
 861  #### warm_up
 862  
 863  ```python
 864  warm_up() -> None
 865  ```
 866  
 867  Prepare the Tool for use.
 868  
 869  Override this method to establish connections to remote services, load models,
 870  or perform other resource-intensive initialization. This method should be idempotent,
 871  as it may be called multiple times.
 872  
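One common way to keep `warm_up` idempotent is to guard the expensive step behind a check, so repeated calls do nothing. The sketch below shows the pattern on a plain class; `LazyClient` and `_connect` are hypothetical stand-ins, and in a real Tool subclass the same guard would live in the overridden `warm_up()`.

```python
from typing import Optional


class LazyClient:
    """Hypothetical stand-in for a tool that owns an expensive resource."""

    def __init__(self) -> None:
        self.connection: Optional[dict] = None
        self.connect_calls = 0

    def _connect(self) -> dict:
        # Placeholder for opening a connection or loading a model.
        self.connect_calls += 1
        return {"status": "open"}

    def warm_up(self) -> None:
        # Guard: skip the expensive step if it has already run.
        if self.connection is not None:
            return
        self.connection = self._connect()


client = LazyClient()
client.warm_up()
client.warm_up()  # second call is a no-op
```
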
 873  #### invoke
 874  
 875  ```python
 876  invoke(**kwargs: Any) -> Any
 877  ```
 878  
 879  Invoke the Tool with the provided keyword arguments.
 880  
 881  #### to_dict
 882  
 883  ```python
 884  to_dict() -> dict[str, Any]
 885  ```
 886  
 887  Serializes the Tool to a dictionary.
 888  
 889  **Returns:**
 890  
 891  - <code>dict\[str, Any\]</code> – Dictionary with serialized data.
 892  
 893  #### from_dict
 894  
 895  ```python
 896  from_dict(data: dict[str, Any]) -> Tool
 897  ```
 898  
 899  Deserializes the Tool from a dictionary.
 900  
 901  **Parameters:**
 902  
 903  - **data** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.
 904  
 905  **Returns:**
 906  
 907  - <code>Tool</code> – Deserialized Tool.
 908  
 909  ## toolset
 910  
 911  ### Toolset
 912  
 913  A collection of related Tools that can be used and managed as a cohesive unit.
 914  
 915  Toolset serves two main purposes:
 916  
 917  1. Group related tools together:
 918     Toolset allows you to organize related tools into a single collection, making it easier
 919     to manage and use them as a unit in Haystack pipelines.
 920  
 921     Example:
 922  
 923  ```python
 924  from haystack.tools import Tool, Toolset
 925  from haystack.components.tools import ToolInvoker
 926  
 927  # Define math functions
 928  def add_numbers(a: int, b: int) -> int:
 929      return a + b
 930  
 931  def subtract_numbers(a: int, b: int) -> int:
 932      return a - b
 933  
 934  # Create tools with proper schemas
 935  add_tool = Tool(
 936      name="add",
 937      description="Add two numbers",
 938      parameters={
 939          "type": "object",
 940          "properties": {
 941              "a": {"type": "integer"},
 942              "b": {"type": "integer"}
 943          },
 944          "required": ["a", "b"]
 945      },
 946      function=add_numbers
 947  )
 948  
 949  subtract_tool = Tool(
 950      name="subtract",
 951      description="Subtract b from a",
 952      parameters={
 953          "type": "object",
 954          "properties": {
 955              "a": {"type": "integer"},
 956              "b": {"type": "integer"}
 957          },
 958          "required": ["a", "b"]
 959      },
 960      function=subtract_numbers
 961  )
 962  
 963  # Create a toolset with the math tools
 964  math_toolset = Toolset([add_tool, subtract_tool])
 965  
 966  # Use the toolset with a ToolInvoker or ChatGenerator component
 967  invoker = ToolInvoker(tools=math_toolset)
 968  ```
 969  
 970  2. Base class for dynamic tool loading:
 971     By subclassing Toolset, you can create implementations that dynamically load tools
 972     from external sources like OpenAPI URLs, MCP servers, or other resources.
 973  
 974     Example:
 975  
 976  ```python
 977  from haystack.core.serialization import generate_qualified_class_name
 978  from haystack.tools import Tool, Toolset
 979  from haystack.components.tools import ToolInvoker
 980  
 981  class CalculatorToolset(Toolset):
 982      '''A toolset for calculator operations.'''
 983  
 984      def __init__(self) -> None:
 985          tools = self._create_tools()
 986          super().__init__(tools)
 987  
 988      def _create_tools(self):
 989          # These Tool instances are obviously defined statically and for illustration purposes only.
 990          # In a real-world scenario, you would dynamically load tools from an external source here.
 991          tools = []
 992          add_tool = Tool(
 993              name="add",
 994              description="Add two numbers",
 995              parameters={
 996                  "type": "object",
 997                  "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
 998                  "required": ["a", "b"],
 999              },
1000              function=lambda a, b: a + b,
1001          )
1002  
1003          multiply_tool = Tool(
1004              name="multiply",
1005              description="Multiply two numbers",
1006              parameters={
1007                  "type": "object",
1008                  "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
1009                  "required": ["a", "b"],
1010              },
1011              function=lambda a, b: a * b,
1012          )
1013  
1014          tools.append(add_tool)
1015          tools.append(multiply_tool)
1016  
1017          return tools
1018  
1019      def to_dict(self):
1020          return {
1021              "type": generate_qualified_class_name(type(self)),
1022              "data": {},  # no data to serialize as we define the tools dynamically
1023          }
1024  
1025      @classmethod
1026      def from_dict(cls, data):
1027          return cls()  # Recreate the tools dynamically during deserialization
1028  
1029  # Create the dynamic toolset and use it with ToolInvoker
1030  calculator_toolset = CalculatorToolset()
1031  invoker = ToolInvoker(tools=calculator_toolset)
1032  ```
1033  
Toolset implements the collection interface (`__iter__`, `__contains__`, `__len__`, `__getitem__`),
making it behave like a list of Tools. This makes it compatible with components that expect
iterable tools, such as ToolInvoker or Haystack chat generators.
1037  
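To make the collection interface concrete, here is a minimal plain-Python sketch of a container that implements the same four methods. `MiniToolset` and `NamedTool` are hypothetical stand-ins rather than Haystack classes, and checking membership by name as well as by instance is an assumption.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class NamedTool:
    """Hypothetical stand-in for a Tool; only the name matters here."""

    name: str


class MiniToolset:
    """Minimal container exposing the list-like methods named above."""

    def __init__(self, tools: list[NamedTool]) -> None:
        self.tools = list(tools)

    def __iter__(self) -> Iterator[NamedTool]:
        return iter(self.tools)

    def __len__(self) -> int:
        return len(self.tools)

    def __getitem__(self, index: int) -> NamedTool:
        return self.tools[index]

    def __contains__(self, item: object) -> bool:
        # Assumption: membership works for a tool name or a tool instance.
        if isinstance(item, str):
            return any(tool.name == item for tool in self.tools)
        return item in self.tools


toolset = MiniToolset([NamedTool("add"), NamedTool("subtract")])
names = [tool.name for tool in toolset]  # iteration, as a consuming component would do
```
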
1038  When implementing a custom Toolset subclass for dynamic tool loading:
1039  
- Perform the dynamic loading in the `__init__` method.
- Override the `to_dict()` and `from_dict()` methods if your tools are defined dynamically.
- Serialize endpoint descriptors rather than tool instances if your tools
  are loaded from external sources.
1044  
1045  #### warm_up
1046  
1047  ```python
1048  warm_up() -> None
1049  ```
1050  
1051  Prepare the Toolset for use.
1052  
1053  By default, this method iterates through and warms up all tools in the Toolset.
1054  Subclasses can override this method to customize initialization behavior, such as:
1055  
1056  - Setting up shared resources (database connections, HTTP sessions) instead of
1057    warming individual tools
1058  - Implementing custom initialization logic for dynamically loaded tools
1059  - Controlling when and how tools are initialized
1060  
1061  For example, a Toolset that manages tools from an external service (like MCPToolset)
1062  might override this to initialize a shared connection rather than warming up
1063  individual tools:
1064  
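```python
class MCPToolset(Toolset):
    def warm_up(self) -> None:
        # Only warm up the shared MCP connection, not individual tools.
        # `establish_connection` is a placeholder for your own connection logic.
        # The guard keeps repeated calls from reconnecting.
        if getattr(self, "mcp_connection", None) is None:
            self.mcp_connection = establish_connection(self.server_url)
```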
1071  
1072  This method should be idempotent, as it may be called multiple times.
1073  
1074  #### add
1075  
1076  ```python
1077  add(tool: Union[Tool, Toolset]) -> None
1078  ```
1079  
1080  Add a new Tool or merge another Toolset.
1081  
1082  **Parameters:**
1083  
1084  - **tool** (<code>Union\[Tool, Toolset\]</code>) – A Tool instance or another Toolset to add
1085  
1086  **Raises:**
1087  
1088  - <code>ValueError</code> – If adding the tool would result in duplicate tool names
1089  - <code>TypeError</code> – If the provided object is not a Tool or Toolset
1090  
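If you would rather detect a name clash before calling `add`, a check along these lines can run first. It is a plain-Python sketch: `check_no_duplicates` is a hypothetical helper that works on tool names only, and the error message is made up for illustration.

```python
def check_no_duplicates(existing_names: list[str], new_names: list[str]) -> None:
    """Raise ValueError if any new tool name is already taken (illustrative only)."""
    duplicates = sorted(set(existing_names) & set(new_names))
    if duplicates:
        raise ValueError(f"Duplicate tool names: {duplicates}")


check_no_duplicates(["add", "subtract"], ["multiply"])  # passes silently

try:
    check_no_duplicates(["add", "subtract"], ["add"])
    message = ""
except ValueError as error:
    message = str(error)
```
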
1091  #### to_dict
1092  
1093  ```python
1094  to_dict() -> dict[str, Any]
1095  ```
1096  
1097  Serialize the Toolset to a dictionary.
1098  
1099  **Returns:**
1100  
1101  - <code>dict\[str, Any\]</code> – A dictionary representation of the Toolset
1102  
1103  Note for subclass implementers:
1104  The default implementation is ideal for scenarios where Tool resolution is static. However, if your subclass
1105  of Toolset dynamically resolves Tool instances from external sources—such as an MCP server, OpenAPI URL, or
1106  a local OpenAPI specification—you should consider serializing the endpoint descriptor instead of the Tool
1107  instances themselves. This strategy preserves the dynamic nature of your Toolset and minimizes the overhead
1108  associated with serializing potentially large collections of Tool objects. Moreover, by serializing the
1109  descriptor, you ensure that the deserialization process can accurately reconstruct the Tool instances, even
1110  if they have been modified or removed since the last serialization. Failing to serialize the descriptor may
1111  lead to issues where outdated or incorrect Tool configurations are loaded, potentially causing errors or
1112  unexpected behavior.
1113  
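The descriptor approach described in the note can be sketched as follows. This is a plain-Python stand-in, not a real Toolset subclass: `RemoteToolsetSketch`, `server_url`, and `_load_tool_names` are hypothetical, and the loader returns fixed names where a real implementation would query the external source.

```python
from typing import Any


class RemoteToolsetSketch:
    """Stand-in for a Toolset subclass that loads its tools from a server."""

    def __init__(self, server_url: str) -> None:
        self.server_url = server_url
        # Hypothetical loader: a real subclass would contact the server here.
        self.tool_names = self._load_tool_names()

    def _load_tool_names(self) -> list[str]:
        return ["search", "fetch"]

    def to_dict(self) -> dict[str, Any]:
        # Store only what is needed to reload the tools, not the tools themselves.
        return {
            "type": f"{type(self).__module__}.{type(self).__qualname__}",
            "data": {"server_url": self.server_url},
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "RemoteToolsetSketch":
        # Rebuilding from the descriptor reloads the current set of tools.
        return cls(**data["data"])


original = RemoteToolsetSketch("https://example.com/mcp")
serialized = original.to_dict()
restored = RemoteToolsetSketch.from_dict(serialized)
```

Because only the descriptor is stored, the serialized form stays small and the restored instance reflects whatever the source provides at load time.
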
1114  #### from_dict
1115  
1116  ```python
1117  from_dict(data: dict[str, Any]) -> Toolset
1118  ```
1119  
1120  Deserialize a Toolset from a dictionary.
1121  
1122  **Parameters:**
1123  
1124  - **data** (<code>dict\[str, Any\]</code>) – Dictionary representation of the Toolset
1125  
1126  **Returns:**
1127  
1128  - <code>Toolset</code> – A new Toolset instance