routers_api.md
   1  ---
   2  title: "Routers"
   3  id: routers-api
   4  description: "Routers are a group of components that route queries or Documents to other components that can handle them best."
   5  slug: "/routers-api"
   6  ---
   7  
   8  <a id="conditional_router"></a>
   9  
  10  ## Module conditional\_router
  11  
  12  <a id="conditional_router.NoRouteSelectedException"></a>
  13  
  14  ### NoRouteSelectedException
  15  
  16  Exception raised when no route is selected in ConditionalRouter.
  17  
  18  <a id="conditional_router.RouteConditionException"></a>
  19  
  20  ### RouteConditionException
  21  
  22  Exception raised when there is an error parsing or evaluating the condition expression in ConditionalRouter.
  23  
  24  <a id="conditional_router.ConditionalRouter"></a>
  25  
  26  ### ConditionalRouter
  27  
  28  Routes data based on specific conditions.
  29  
  30  You define these conditions in a list of dictionaries called `routes`.
  31  Each dictionary in this list represents a single route. Each route has these four elements:
  32  - `condition`: A Jinja2 string expression that determines if the route is selected.
  33  - `output`: A Jinja2 expression defining the route's output value.
  34  - `output_type`: The type of the output data (for example, `str`, `list[int]`).
  35  - `output_name`: The name you want to use to publish `output`. This name is used to connect
  36  the router to other components in the pipeline.
  37  
  38  ### Usage example
  39  
  40  ```python
  41  from haystack.components.routers import ConditionalRouter
  42  
  43  routes = [
  44      {
  45          "condition": "{{streams|length > 2}}",
  46          "output": "{{streams}}",
  47          "output_name": "enough_streams",
  48          "output_type": list[int],
  49      },
  50      {
  51          "condition": "{{streams|length <= 2}}",
  52          "output": "{{streams}}",
  53          "output_name": "insufficient_streams",
  54          "output_type": list[int],
  55      },
  56  ]
  57  router = ConditionalRouter(routes)
  58  # When 'streams' has more than 2 items, 'enough_streams' output will activate, emitting the list [1, 2, 3]
  59  kwargs = {"streams": [1, 2, 3], "query": "Haystack"}
  60  result = router.run(**kwargs)
  61  assert result == {"enough_streams": [1, 2, 3]}
  62  ```
  63  
  64  In this example, we configure two routes. The first route sends the 'streams' value to 'enough_streams' if the
  65  stream count exceeds two. The second route directs 'streams' to 'insufficient_streams' if there
  66  are two or fewer streams.
  67  
  68  In the pipeline setup, the Router connects to other components using the output names. For example,
  69  'enough_streams' might connect to a component that processes streams, while
  70  'insufficient_streams' might connect to a component that fetches more streams.
  71  
  72  
  73  Here is a pipeline that uses `ConditionalRouter` to select one of two outputs
  74  depending on the value of the `count` input:
  75  
  76  ```python
  77  from haystack import Pipeline
  78  from haystack.dataclasses import ByteStream
  79  from haystack.components.routers import ConditionalRouter
  80  
  81  routes = [
  82      {"condition": "{{count > 5}}",
  83          "output": "Processing many items",
  84          "output_name": "many_items",
  85          "output_type": str,
  86      },
  87      {"condition": "{{count <= 5}}",
  88          "output": "Processing few items",
  89          "output_name": "few_items",
  90          "output_type": str,
  91      },
  92  ]
  93  
  94  pipe = Pipeline()
  95  pipe.add_component("router", ConditionalRouter(routes))
  96  
  97  # Run with count > 5
  98  result = pipe.run({"router": {"count": 10}})
  99  print(result)
 100  # >> {'router': {'many_items': 'Processing many items'}}
 101  
 102  # Run with count <= 5
 103  result = pipe.run({"router": {"count": 3}})
 104  print(result)
 105  # >> {'router': {'few_items': 'Processing few items'}}
 106  ```
 107  
 108  <a id="conditional_router.ConditionalRouter.__init__"></a>
 109  
 110  #### ConditionalRouter.\_\_init\_\_
 111  
 112  ```python
 113  def __init__(routes: list[Route],
 114               custom_filters: dict[str, Callable] | None = None,
 115               unsafe: bool = False,
 116               validate_output_type: bool = False,
 117               optional_variables: list[str] | None = None)
 118  ```
 119  
 120  Initializes the `ConditionalRouter` with a list of routes detailing the conditions for routing.
 121  
 122  **Arguments**:
 123  
 124  - `routes`: A list of dictionaries, each defining a route.
 125  Each route has these four elements:
 126    - `condition`: A Jinja2 string expression that determines if the route is selected.
 127    - `output`: A Jinja2 expression defining the route's output value.
 128    - `output_type`: The type of the output data (for example, `str`, `list[int]`).
 129    - `output_name`: The name you want to use to publish `output`. This name is used to connect
 130      the router to other components in the pipeline.
 131  - `custom_filters`: A dictionary of custom Jinja2 filters used in the condition expressions.
 132  For example, passing `{"my_filter": my_filter_fcn}` where:
 133    - `my_filter` is the name of the custom filter.
 134    - `my_filter_fcn` is a callable that takes `my_var:str` and returns `my_var[:3]`.
 135      `{{ my_var|my_filter }}` can then be used inside a route condition expression:
 136      `"condition": "{{ my_var|my_filter == 'foo' }}"`.
 137  - `unsafe`: Enable execution of arbitrary code in the Jinja template.
 138  Only use this if you trust the source of the template, as it can lead to remote code execution.
 139  - `validate_output_type`: Enable validation of the routes' output.
 140  If a route's output doesn't match the declared type, a `ValueError` is raised at runtime.
 141  - `optional_variables`: A list of variable names that are optional in your route conditions and outputs.
 142  If these variables are not provided at runtime, they will be set to `None`.
 143  This allows you to write routes that can handle missing inputs gracefully without raising errors.
 144  
 145  Example usage with a default fallback route in a Pipeline:
 146  ```python
 147  from haystack import Pipeline
 148  from haystack.components.routers import ConditionalRouter
 149  
 150  routes = [
 151      {
 152          "condition": '{{ path == "rag" }}',
 153          "output": "{{ question }}",
 154          "output_name": "rag_route",
 155          "output_type": str
 156      },
 157      {
 158          "condition": "{{ True }}",  # fallback route
 159          "output": "{{ question }}",
 160          "output_name": "default_route",
 161          "output_type": str
 162      }
 163  ]
 164  
 165  router = ConditionalRouter(routes, optional_variables=["path"])
 166  pipe = Pipeline()
 167  pipe.add_component("router", router)
 168  
 169  # When 'path' is provided in the pipeline:
 170  result = pipe.run(data={"router": {"question": "What?", "path": "rag"}})
 171  assert result["router"] == {"rag_route": "What?"}
 172  
 173  # When 'path' is not provided, fallback route is taken:
 174  result = pipe.run(data={"router": {"question": "What?"}})
 175  assert result["router"] == {"default_route": "What?"}
 176  ```
 177  
 178  This pattern is particularly useful when:
 179  - You want to provide default/fallback behavior when certain inputs are missing
 180  - Some variables are only needed for specific routing conditions
 181  - You're building flexible pipelines where not all inputs are guaranteed to be present
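
The `custom_filters` argument described above takes plain Python callables keyed by filter name. Below is a minimal sketch of the `my_filter_fcn` example from the argument description. The output names (`prefix_is_foo`, `prefix_is_other`) are hypothetical, and the router construction is shown only as a comment so that the snippet runs without Haystack installed.

```python
def my_filter_fcn(my_var: str) -> str:
    """Custom Jinja2 filter: keep only the first three characters."""
    return my_var[:3]


# Filter name (as used inside templates) -> callable that implements it.
custom_filters = {"my_filter": my_filter_fcn}

# Routes that use the filter in their condition expressions.
# The output names are made up for this illustration.
routes = [
    {
        "condition": "{{ my_var|my_filter == 'foo' }}",
        "output": "{{ my_var }}",
        "output_name": "prefix_is_foo",
        "output_type": str,
    },
    {
        "condition": "{{ my_var|my_filter != 'foo' }}",
        "output": "{{ my_var }}",
        "output_name": "prefix_is_other",
        "output_type": str,
    },
]

# With Haystack installed, both objects would be passed to the constructor:
# router = ConditionalRouter(routes, custom_filters=custom_filters)

print(my_filter_fcn("foobar"))  # foo
```

Keeping the filter as a small named function (rather than a lambda) makes it easy to test on its own before wiring it into a route condition.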
 182  
 183  <a id="conditional_router.ConditionalRouter.to_dict"></a>
 184  
 185  #### ConditionalRouter.to\_dict
 186  
 187  ```python
 188  def to_dict() -> dict[str, Any]
 189  ```
 190  
 191  Serializes the component to a dictionary.
 192  
 193  **Returns**:
 194  
 195  Dictionary with serialized data.
 196  
 197  <a id="conditional_router.ConditionalRouter.from_dict"></a>
 198  
 199  #### ConditionalRouter.from\_dict
 200  
 201  ```python
 202  @classmethod
 203  def from_dict(cls, data: dict[str, Any]) -> "ConditionalRouter"
 204  ```
 205  
 206  Deserializes the component from a dictionary.
 207  
 208  **Arguments**:
 209  
 210  - `data`: The dictionary to deserialize from.
 211  
 212  **Returns**:
 213  
 214  The deserialized component.
 215  
 216  <a id="conditional_router.ConditionalRouter.run"></a>
 217  
 218  #### ConditionalRouter.run
 219  
 220  ```python
 221  def run(**kwargs)
 222  ```
 223  
 224  Executes the routing logic.
 225  
 226  Executes the routing logic by evaluating the specified boolean condition expressions for each route in the
 227  order they are listed. The method directs the flow of data to the output specified in the first route whose
 228  `condition` is True.
 229  
 230  **Arguments**:
 231  
 232  - `kwargs`: All variables used in the `condition` expressions of the routes. When the component is used in a
 233  pipeline, these variables are passed from the previous component's output.
 234  
 235  **Raises**:
 236  
 237  - `NoRouteSelectedException`: If no `condition` in the routes is `True`.
 238  - `RouteConditionException`: If there is an error parsing or evaluating the `condition` expression in the routes.
 239  - `ValueError`: If type validation is enabled and a route's output value doesn't match its declared `output_type`.
 240  
 241  **Returns**:
 242  
 243  A dictionary where the key is the `output_name` of the selected route and the value is the `output`
 244  of the selected route.
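
The first-match behavior described above can be sketched in plain Python. This is an illustration of the selection order only, not Haystack's implementation: conditions and outputs are ordinary callables instead of Jinja2 templates, and `select_route` and `NoRouteSelected` are hypothetical names.

```python
from typing import Any


class NoRouteSelected(Exception):
    """Stand-in for NoRouteSelectedException in this sketch."""


def select_route(routes: list[dict[str, Any]], **kwargs: Any) -> dict[str, Any]:
    """Return {output_name: output} for the first route whose condition holds."""
    for route in routes:
        if route["condition"](**kwargs):
            return {route["output_name"]: route["output"](**kwargs)}
    raise NoRouteSelected(f"No route matched inputs: {kwargs}")


routes = [
    {
        "condition": lambda streams, **_: len(streams) > 2,
        "output": lambda streams, **_: streams,
        "output_name": "enough_streams",
    },
    {
        # Catch-all placed last: it is only reached if no earlier route matched.
        "condition": lambda **_: True,
        "output": lambda streams, **_: streams,
        "output_name": "fallback",
    },
]

print(select_route(routes, streams=[1, 2, 3]))  # {'enough_streams': [1, 2, 3]}
print(select_route(routes, streams=[1]))        # {'fallback': [1]}
```

Because evaluation stops at the first match, the order of the list matters: a catch-all route placed first would shadow every route after it.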
 245  
 246  <a id="document_length_router"></a>
 247  
 248  ## Module document\_length\_router
 249  
 250  <a id="document_length_router.DocumentLengthRouter"></a>
 251  
 252  ### DocumentLengthRouter
 253  
 254  Categorizes documents based on the length of the `content` field and routes them to the appropriate output.
 255  
 256  A common use case for DocumentLengthRouter is handling documents obtained from PDFs that contain non-text
 257  content, such as scanned pages or images. This component can detect empty or low-content documents and route them to
 258  components that perform OCR, generate captions, or compute image embeddings.
 259  
 260  ### Usage example
 261  
 262  ```python
 263  from haystack.components.routers import DocumentLengthRouter
 264  from haystack.dataclasses import Document
 265  
 266  docs = [
 267      Document(content="Short"),
 268      Document(content="Long document "*20),
 269  ]
 270  
 271  router = DocumentLengthRouter(threshold=10)
 272  
 273  result = router.run(documents=docs)
 274  print(result)
 275  
 276  # {
 277  #     "short_documents": [Document(content="Short", ...)],
 278  #     "long_documents": [Document(content="Long document ...", ...)],
 279  # }
 280  ```
 281  
 282  <a id="document_length_router.DocumentLengthRouter.__init__"></a>
 283  
 284  #### DocumentLengthRouter.\_\_init\_\_
 285  
 286  ```python
 287  def __init__(*, threshold: int = 10) -> None
 288  ```
 289  
 290  Initialize the DocumentLengthRouter component.
 291  
 292  **Arguments**:
 293  
 294  - `threshold`: The threshold for the number of characters in the document `content` field. Documents where `content` is
 295  None or whose character count is less than or equal to the threshold will be routed to the `short_documents`
 296  output. Otherwise, they will be routed to the `long_documents` output.
 297  To route only documents with None content to `short_documents`, set the threshold to a negative number.
 298  
 299  <a id="document_length_router.DocumentLengthRouter.run"></a>
 300  
 301  #### DocumentLengthRouter.run
 302  
 303  ```python
 304  @component.output_types(short_documents=list[Document],
 305                          long_documents=list[Document])
 306  def run(documents: list[Document]) -> dict[str, list[Document]]
 307  ```
 308  
 309  Categorize input documents into groups based on the length of the `content` field.
 310  
 311  **Arguments**:
 312  
 313  - `documents`: A list of documents to be categorized.
 314  
 315  **Returns**:
 316  
 317  A dictionary with the following keys:
 318  - `short_documents`: A list of documents where `content` is None or the length of `content` is less than or
 319     equal to the threshold.
 320  - `long_documents`: A list of documents where the length of `content` is greater than the threshold.
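
The threshold rule above can be sketched on raw content strings. This is an illustration under the documented rule (`None` or length less than or equal to the threshold counts as short), not the component's source; `route_by_length` is a hypothetical helper.

```python
from typing import Optional


def route_by_length(
    contents: list[Optional[str]], threshold: int = 10
) -> dict[str, list[Optional[str]]]:
    """Split raw content strings into short and long groups."""
    result: dict[str, list[Optional[str]]] = {"short_documents": [], "long_documents": []}
    for content in contents:
        # None content, or a length at or below the threshold, counts as short.
        if content is None or len(content) <= threshold:
            result["short_documents"].append(content)
        else:
            result["long_documents"].append(content)
    return result


contents = [None, "", "Short", "Long document " * 20]

print(route_by_length(contents, threshold=10)["short_documents"])  # [None, '', 'Short']

# A negative threshold leaves only None content in the short group,
# because len("") == 0 is already greater than -1.
print(route_by_length(contents, threshold=-1)["short_documents"])  # [None]
```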
 321  
 322  <a id="document_type_router"></a>
 323  
 324  ## Module document\_type\_router
 325  
 326  <a id="document_type_router.DocumentTypeRouter"></a>
 327  
 328  ### DocumentTypeRouter
 329  
 330  Routes documents by their MIME types.
 331  
 332  DocumentTypeRouter is used to dynamically route documents within a pipeline based on their MIME types.
 333  It supports exact MIME type matches and regex patterns.
 334  
 335  MIME types can be extracted directly from document metadata or inferred from file paths using standard or
 336  user-supplied MIME type mappings.
 337  
 338  ### Usage example
 339  
 340  ```python
 341  from haystack.components.routers import DocumentTypeRouter
 342  from haystack.dataclasses import Document
 343  
 344  docs = [
 345      Document(content="Example text", meta={"file_path": "example.txt"}),
 346      Document(content="Another document", meta={"mime_type": "application/pdf"}),
 347      Document(content="Unknown type")
 348  ]
 349  
 350  router = DocumentTypeRouter(
 351      mime_type_meta_field="mime_type",
 352      file_path_meta_field="file_path",
 353      mime_types=["text/plain", "application/pdf"]
 354  )
 355  
 356  result = router.run(documents=docs)
 357  print(result)
 358  ```
 359  
 360  Expected output:
 361  ```python
 362  {
 363      "text/plain": [Document(...)],
 364      "application/pdf": [Document(...)],
 365      "unclassified": [Document(...)]
 366  }
 367  ```
 368  
 369  <a id="document_type_router.DocumentTypeRouter.__init__"></a>
 370  
 371  #### DocumentTypeRouter.\_\_init\_\_
 372  
 373  ```python
 374  def __init__(*,
 375               mime_types: list[str],
 376               mime_type_meta_field: str | None = None,
 377               file_path_meta_field: str | None = None,
 378               additional_mimetypes: dict[str, str] | None = None) -> None
 379  ```
 380  
 381  Initialize the DocumentTypeRouter component.
 382  
 383  **Arguments**:
 384  
 385  - `mime_types`: A list of MIME types or regex patterns to classify the input documents.
 386  (for example: `["text/plain", "audio/x-wav", "image/jpeg"]`).
 387  - `mime_type_meta_field`: Optional name of the metadata field that holds the MIME type.
 388  - `file_path_meta_field`: Optional name of the metadata field that holds the file path. Used to infer the MIME type if
 389  `mime_type_meta_field` is not provided or missing in a document.
 390  - `additional_mimetypes`: Optional dictionary mapping MIME types to file extensions to enhance or override the standard
 391  `mimetypes` module. Useful when working with uncommon or custom file types.
 392  For example: `{"application/vnd.custom-type": ".custom"}`.
 393  
 394  **Raises**:
 395  
 396  - `ValueError`: If `mime_types` is empty or if neither `mime_type_meta_field` nor `file_path_meta_field` is
 397  provided.
 398  
 399  <a id="document_type_router.DocumentTypeRouter.run"></a>
 400  
 401  #### DocumentTypeRouter.run
 402  
 403  ```python
 404  def run(documents: list[Document]) -> dict[str, list[Document]]
 405  ```
 406  
 407  Categorize input documents into groups based on their MIME type.
 408  
 409  MIME types can either be directly available in document metadata or derived from file paths using the
 410  standard Python `mimetypes` module and custom mappings.
 411  
 412  **Arguments**:
 413  
 414  - `documents`: A list of documents to be categorized.
 415  
 416  **Returns**:
 417  
 418  A dictionary where the keys are MIME types (or `"unclassified"`) and the values are lists of documents.
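
The lookup order described above (explicit MIME type in the metadata first, file path inference second) can be sketched with the standard `mimetypes` module. The helper name `resolve_mime_type` and the sample metadata are assumptions for illustration; the sketch only resolves a MIME type and does not reproduce the router's matching against `mime_types`.

```python
import mimetypes
from typing import Optional


def resolve_mime_type(
    meta: dict,
    mime_type_meta_field: Optional[str],
    file_path_meta_field: Optional[str],
    guesser: mimetypes.MimeTypes,
) -> Optional[str]:
    """Prefer an explicit MIME type in the metadata, else infer it from the file path."""
    if mime_type_meta_field and meta.get(mime_type_meta_field):
        return meta[mime_type_meta_field]
    if file_path_meta_field and meta.get(file_path_meta_field):
        mime_type, _encoding = guesser.guess_type(meta[file_path_meta_field])
        return mime_type
    return None


# A private MimeTypes instance keeps custom mappings out of the global registry.
guesser = mimetypes.MimeTypes()
for mime_type, extension in {"application/vnd.custom-type": ".custom"}.items():
    guesser.add_type(mime_type, extension)

metas = [
    {"file_path": "example.txt"},
    {"mime_type": "application/pdf"},
    {"file_path": "data.custom"},
    {},
]
for meta in metas:
    # None means no MIME type could be resolved for this metadata.
    print(resolve_mime_type(meta, "mime_type", "file_path", guesser))
```

The custom mapping uses the same MIME-type-to-extension direction as the `additional_mimetypes` argument.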
 419  
 420  <a id="file_type_router"></a>
 421  
 422  ## Module file\_type\_router
 423  
 424  <a id="file_type_router.FileTypeRouter"></a>
 425  
 426  ### FileTypeRouter
 427  
 428  Categorizes files or byte streams by their MIME types, helping in context-based routing.
 429  
 430  FileTypeRouter supports both exact MIME type matching and regex patterns.
 431  
 432  For file paths, the MIME type is inferred from the file extension; for byte streams, it is read from their metadata.
 433  You can use regex patterns in the `mime_types` parameter to set broad categories
 434  (such as `audio/.*` or `text/.*`) or specific types.
 435  MIME types without regex patterns are treated as exact matches.
 436  
 437  ### Usage example
 438  
 439  ```python
 440  from haystack.components.routers import FileTypeRouter
 441  from pathlib import Path
 442  
 443  # For exact MIME type matching
 444  router = FileTypeRouter(mime_types=["text/plain", "application/pdf"])
 445  
 446  # For flexible matching using regex, to handle all audio types
 447  router_with_regex = FileTypeRouter(mime_types=[r"audio/.*", r"text/plain"])
 448  
 449  sources = [Path("file.txt"), Path("document.pdf"), Path("song.mp3")]
 450  print(router.run(sources=sources))
 451  print(router_with_regex.run(sources=sources))
 452  
 453  # Expected output:
 454  # {'text/plain': [PosixPath('file.txt')],
 455  #  'application/pdf': [PosixPath('document.pdf')],
 456  #  'unclassified': [PosixPath('song.mp3')]}
 457  # {'audio/.*': [PosixPath('song.mp3')],
 458  #  'text/plain': [PosixPath('file.txt')],
 459  #  'unclassified': [PosixPath('document.pdf')]}
 460  ```
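
To illustrate how a regex pattern differs from an exact MIME type, here is a small sketch using the standard `re` module. Matching with `re.fullmatch` is an assumption made for this illustration, and `match_mime_type` is a hypothetical helper, not part of the component.

```python
import re
from typing import Optional


def match_mime_type(mime_type: str, patterns: list[str]) -> Optional[str]:
    """Return the first pattern that matches the whole MIME type, if any."""
    for pattern in patterns:
        if re.fullmatch(pattern, mime_type):
            return pattern
    return None


patterns = [r"audio/.*", r"text/plain"]

print(match_mime_type("audio/mpeg", patterns))       # audio/.*
print(match_mime_type("text/plain", patterns))       # text/plain
print(match_mime_type("application/pdf", patterns))  # None

# A shell-style glob is not a regex: "audio/*" means "audio" followed by
# zero or more "/" characters, so it does not match "audio/mpeg".
print(match_mime_type("audio/mpeg", ["audio/*"]))    # None
```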
 461  
 462  <a id="file_type_router.FileTypeRouter.__init__"></a>
 463  
 464  #### FileTypeRouter.\_\_init\_\_
 465  
 466  ```python
 467  def __init__(mime_types: list[str],
 468               additional_mimetypes: dict[str, str] | None = None,
 469               raise_on_failure: bool = False)
 470  ```
 471  
 472  Initialize the FileTypeRouter component.
 473  
 474  **Arguments**:
 475  
 476  - `mime_types`: A list of MIME types or regex patterns to classify the input files or byte streams.
 477  (for example: `["text/plain", "audio/x-wav", "image/jpeg"]`).
 478  - `additional_mimetypes`: A dictionary mapping MIME types to file extensions. These entries are added to the `mimetypes`
 479  module so that file types it doesn't support natively are not left unclassified
 480  (for example: `{"application/vnd.openxmlformats-officedocument.wordprocessingml.document": ".docx"}`).
 481  - `raise_on_failure`: If True, raises FileNotFoundError when a file path doesn't exist.
 482  If False (default), only emits a warning when a file path doesn't exist.
 483  
 484  <a id="file_type_router.FileTypeRouter.to_dict"></a>
 485  
 486  #### FileTypeRouter.to\_dict
 487  
 488  ```python
 489  def to_dict() -> dict[str, Any]
 490  ```
 491  
 492  Serializes the component to a dictionary.
 493  
 494  **Returns**:
 495  
 496  Dictionary with serialized data.
 497  
 498  <a id="file_type_router.FileTypeRouter.from_dict"></a>
 499  
 500  #### FileTypeRouter.from\_dict
 501  
 502  ```python
 503  @classmethod
 504  def from_dict(cls, data: dict[str, Any]) -> "FileTypeRouter"
 505  ```
 506  
 507  Deserializes the component from a dictionary.
 508  
 509  **Arguments**:
 510  
 511  - `data`: The dictionary to deserialize from.
 512  
 513  **Returns**:
 514  
 515  The deserialized component.
 516  
 517  <a id="file_type_router.FileTypeRouter.run"></a>
 518  
 519  #### FileTypeRouter.run
 520  
 521  ```python
 522  def run(
 523      sources: list[str | Path | ByteStream],
 524      meta: dict[str, Any] | list[dict[str, Any]] | None = None
 525  ) -> dict[str, list[ByteStream | Path]]
 526  ```
 527  
 528  Categorize files or byte streams according to their MIME types.
 529  
 530  **Arguments**:
 531  
 532  - `sources`: A list of file paths or byte streams to categorize.
 533  - `meta`: Optional metadata to attach to the sources.
 534  When provided, the sources are internally converted to ByteStream objects and the metadata is added.
 535  This value can be a list of dictionaries or a single dictionary.
 536  If it's a single dictionary, its content is added to the metadata of all ByteStream objects.
 537  If it's a list, its length must match the number of sources, as they are zipped together.
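
The two accepted shapes of `meta` can be sketched as a normalization step that yields one dictionary per source. `normalize_meta` is a hypothetical helper, and raising `ValueError` on a length mismatch is an assumption; the text above only states that the lengths must match.

```python
from typing import Any, Optional, Union

Meta = dict[str, Any]


def normalize_meta(meta: Optional[Union[Meta, list[Meta]]], num_sources: int) -> list[Meta]:
    """Expand `meta` into one metadata dictionary per source."""
    if meta is None:
        return [{} for _ in range(num_sources)]
    if isinstance(meta, dict):
        # A single dictionary is copied onto every source.
        return [dict(meta) for _ in range(num_sources)]
    if len(meta) != num_sources:
        # Assumed error type for this sketch.
        raise ValueError(f"Got {len(meta)} metadata entries for {num_sources} sources.")
    return [dict(m) for m in meta]


sources = ["file.txt", "document.pdf"]

shared = normalize_meta({"batch": "2024-01"}, len(sources))
print(list(zip(sources, shared)))
# [('file.txt', {'batch': '2024-01'}), ('document.pdf', {'batch': '2024-01'})]

per_source = normalize_meta([{"lang": "en"}, {"lang": "de"}], len(sources))
print(list(zip(sources, per_source)))
# [('file.txt', {'lang': 'en'}), ('document.pdf', {'lang': 'de'})]
```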
 538  
 539  **Returns**:
 540  
 541  A dictionary where the keys are MIME types and the values are lists of data sources.
 542  Two extra keys may be returned: `"unclassified"` when a source's MIME type doesn't match any pattern
 543  and `"failed"` when a source cannot be processed (for example, a file path that doesn't exist).
 544  
 545  <a id="llm_messages_router"></a>
 546  
 547  ## Module llm\_messages\_router
 548  
 549  <a id="llm_messages_router.LLMMessagesRouter"></a>
 550  
 551  ### LLMMessagesRouter
 552  
 553  Routes Chat Messages to different connections using a generative Language Model to perform classification.
 554  
 555  This component can be used with general-purpose LLMs and with specialized LLMs for moderation like Llama Guard.
 556  
 557  ### Usage example
 558  ```python
 559  from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
 560  from haystack.components.routers.llm_messages_router import LLMMessagesRouter
 561  from haystack.dataclasses import ChatMessage
 562  
 563  # initialize a Chat Generator with a generative model for moderation
 564  chat_generator = HuggingFaceAPIChatGenerator(
 565      api_type="serverless_inference_api",
 566      api_params={"model": "meta-llama/Llama-Guard-4-12B", "provider": "groq"},
 567  )
 568  
 569  router = LLMMessagesRouter(chat_generator=chat_generator,
 570                             output_names=["unsafe", "safe"],
 571                             output_patterns=["unsafe", "safe"])
 572  
 573  
 574  print(router.run([ChatMessage.from_user("How to rob a bank?")]))
 575  
 576  # {
 577  #     'chat_generator_text': 'unsafe\nS2',
 579  #     'unsafe': [
 580  #         ChatMessage(
 581  #             _role=<ChatRole.USER: 'user'>,
 582  #             _content=[TextContent(text='How to rob a bank?')],
 583  #             _name=None,
 584  #             _meta={}
 585  #         )
 586  #     ]
 587  # }
 588  ```
 589  
 590  <a id="llm_messages_router.LLMMessagesRouter.__init__"></a>
 591  
 592  #### LLMMessagesRouter.\_\_init\_\_
 593  
 594  ```python
 595  def __init__(chat_generator: ChatGenerator,
 596               output_names: list[str],
 597               output_patterns: list[str],
 598               system_prompt: str | None = None)
 599  ```
 600  
 601  Initialize the LLMMessagesRouter component.
 602  
 603  **Arguments**:
 604  
 605  - `chat_generator`: A ChatGenerator instance which represents the LLM.
 606  - `output_names`: A list of output connection names. These can be used to connect the router to other
 607  components.
 608  - `output_patterns`: A list of regular expressions to be matched against the output of the LLM. Each pattern
 609  corresponds to an output name. Patterns are evaluated in order.
 610  When using moderation models, refer to the model card to understand the expected outputs.
 611  - `system_prompt`: An optional system prompt to customize the behavior of the LLM.
 612  For moderation models, refer to the model card for supported customization options.
 613  
 614  **Raises**:
 615  
 616  - `ValueError`: If `output_names` and `output_patterns` are not non-empty lists of the same length.
 617  
 618  <a id="llm_messages_router.LLMMessagesRouter.warm_up"></a>
 619  
 620  #### LLMMessagesRouter.warm\_up
 621  
 622  ```python
 623  def warm_up()
 624  ```
 625  
 626  Warm up the underlying LLM.
 627  
 628  <a id="llm_messages_router.LLMMessagesRouter.run"></a>
 629  
 630  #### LLMMessagesRouter.run
 631  
 632  ```python
 633  def run(messages: list[ChatMessage]) -> dict[str, str | list[ChatMessage]]
 634  ```
 635  
 636  Classify the messages based on LLM output and route them to the appropriate output connection.
 637  
 638  **Arguments**:
 639  
 640  - `messages`: A list of ChatMessages to be routed. Only user and assistant messages are supported.
 641  
 642  **Raises**:
 643  
 644  - `ValueError`: If messages is an empty list or contains messages with unsupported roles.
 645  
 646  **Returns**:
 647  
 648  A dictionary with the following keys:
 649  - `"chat_generator_text"`: The text output of the LLM, useful for debugging.
 650  - One key for each entry in `output_names`: The list of messages that matched the corresponding pattern.
 651  - `"unmatched"`: The messages that did not match any of the output patterns.
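
Because patterns are evaluated in order, the order of `output_patterns` can change the result. The sketch below illustrates this with the standard `re` module; using an unanchored `re.search` is an assumption made for this illustration, and `pick_output` is a hypothetical helper rather than the component's code.

```python
import re


def pick_output(llm_text: str, output_names: list[str], output_patterns: list[str]) -> str:
    """Return the output name of the first pattern found in the LLM text."""
    for name, pattern in zip(output_names, output_patterns):
        if re.search(pattern, llm_text):
            return name
    return "unmatched"


llm_text = "unsafe\nS2"  # shaped like the Llama Guard reply in the usage example

# "unsafe" is listed first, so it wins.
print(pick_output(llm_text, ["unsafe", "safe"], ["unsafe", "safe"]))  # unsafe

# With the order reversed, the pattern "safe" also occurs inside "unsafe",
# so an unanchored search picks the wrong output.
print(pick_output(llm_text, ["safe", "unsafe"], ["safe", "unsafe"]))  # safe

print(pick_output("I cannot classify this.", ["unsafe", "safe"], ["unsafe", "safe"]))  # unmatched
```

Listing the more specific pattern first, or anchoring patterns (for example `^safe`), avoids this kind of overlap.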
 652  
 653  <a id="llm_messages_router.LLMMessagesRouter.to_dict"></a>
 654  
 655  #### LLMMessagesRouter.to\_dict
 656  
 657  ```python
 658  def to_dict() -> dict[str, Any]
 659  ```
 660  
 661  Serialize this component to a dictionary.
 662  
 663  **Returns**:
 664  
 665  The serialized component as a dictionary.
 666  
 667  <a id="llm_messages_router.LLMMessagesRouter.from_dict"></a>
 668  
 669  #### LLMMessagesRouter.from\_dict
 670  
 671  ```python
 672  @classmethod
 673  def from_dict(cls, data: dict[str, Any]) -> "LLMMessagesRouter"
 674  ```
 675  
 676  Deserialize this component from a dictionary.
 677  
 678  **Arguments**:
 679  
 680  - `data`: The dictionary representation of this component.
 681  
 682  **Returns**:
 683  
 684  The deserialized component instance.
 685  
 686  <a id="metadata_router"></a>
 687  
 688  ## Module metadata\_router
 689  
 690  <a id="metadata_router.MetadataRouter"></a>
 691  
 692  ### MetadataRouter
 693  
 694  Routes documents or byte streams to different connections based on their metadata fields.
 695  
 696  Specify the routing rules in the `__init__` method.
 697  If a document or byte stream does not match any of the rules, it's routed to a connection named "unmatched".
 698  
 699  
 700  ### Usage examples
 701  
 702  **Routing Documents by metadata:**
 703  ```python
 704  from haystack import Document
 705  from haystack.components.routers import MetadataRouter
 706  
 707  docs = [Document(content="Paris is the capital of France.", meta={"language": "en"}),
 708          Document(content="Berlin ist die Hauptstadt von Deutschland.", meta={"language": "de"})]
 709  
 710  router = MetadataRouter(rules={"en": {"field": "meta.language", "operator": "==", "value": "en"}})
 711  
 712  print(router.run(documents=docs))
 713  # {'en': [Document(id=..., content: 'Paris is the capital of France.', meta: {'language': 'en'})],
 714  # 'unmatched': [Document(id=..., content: 'Berlin ist die Hauptstadt von Deutschland.', meta: {'language': 'de'})]}
 715  ```
 716  
 717  **Routing ByteStreams by metadata:**
 718  ```python
 719  from haystack.dataclasses import ByteStream
 720  from haystack.components.routers import MetadataRouter
 721  
 722  streams = [
 723      ByteStream.from_string("Hello world", meta={"language": "en"}),
 724      ByteStream.from_string("Bonjour le monde", meta={"language": "fr"})
 725  ]
 726  
 727  router = MetadataRouter(
 728      rules={"english": {"field": "meta.language", "operator": "==", "value": "en"}},
 729      output_type=list[ByteStream]
 730  )
 731  
 732  result = router.run(documents=streams)
 733  # {'english': [ByteStream(...)], 'unmatched': [ByteStream(...)]}
 734  ```
 735  
 736  <a id="metadata_router.MetadataRouter.__init__"></a>
 737  
 738  #### MetadataRouter.\_\_init\_\_
 739  
 740  ```python
 741  def __init__(rules: dict[str, dict],
 742               output_type: type = list[Document]) -> None
 743  ```
 744  
 745  Initializes the MetadataRouter component.
 746  
 747  **Arguments**:
 748  
 749  - `rules`: A dictionary defining how to route documents or byte streams to output connections based on their
 750  metadata. Keys are output connection names, and values are dictionaries of
 751  [filtering expressions](https://docs.haystack.deepset.ai/docs/metadata-filtering) in Haystack.
 752  For example:
 753  ```python
 754  {
 755      "edge_1": {
 756          "operator": "AND",
 757          "conditions": [
 758              {"field": "meta.created_at", "operator": ">=", "value": "2023-01-01"},
 759              {"field": "meta.created_at", "operator": "<", "value": "2023-04-01"},
 760          ],
 761      },
 762      "edge_2": {
 763          "operator": "AND",
 764          "conditions": [
 765              {"field": "meta.created_at", "operator": ">=", "value": "2023-04-01"},
 766              {"field": "meta.created_at", "operator": "<", "value": "2023-07-01"},
 767          ],
 768      },
 769      "edge_3": {
 770          "operator": "AND",
 771          "conditions": [
 772              {"field": "meta.created_at", "operator": ">=", "value": "2023-07-01"},
 773              {"field": "meta.created_at", "operator": "<", "value": "2023-10-01"},
 774          ],
 775      },
 776      "edge_4": {
 777          "operator": "AND",
 778          "conditions": [
 779              {"field": "meta.created_at", "operator": ">=", "value": "2023-10-01"},
 780              {"field": "meta.created_at", "operator": "<", "value": "2024-01-01"},
 781          ],
 782      },
 783  }
 784  ```
 785  - `output_type`: The type of the output produced. Lists of Documents or ByteStreams can be specified.
 786  
 787  <a id="metadata_router.MetadataRouter.run"></a>
 788  
 789  #### MetadataRouter.run
 790  
 791  ```python
 792  def run(documents: list[Document] | list[ByteStream])
 793  ```
 794  
 795  Routes documents or byte streams to different connections based on their metadata fields.
 796  
 797  If a document or byte stream does not match any of the rules, it's routed to a connection named "unmatched".
 798  
 799  **Arguments**:
 800  
 801  - `documents`: A list of `Document` or `ByteStream` objects to be routed based on their metadata.
 802  
 803  **Returns**:
 804  
 805  A dictionary where the keys are the names of the output connections (including `"unmatched"`)
 806  and the values are lists of `Document` or `ByteStream` objects that matched the corresponding rules.
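
To make the rule format concrete, here is a small pure-Python sketch that evaluates rules of this shape against metadata dictionaries. It is an illustration, not Haystack's filter engine: it covers only a few comparison operators plus `AND`/`OR`, the helpers `matches` and `route` are hypothetical, and it assumes an item is added to every rule it satisfies.

```python
import operator
from typing import Any

# Comparison operators covered by this sketch.
COMPARATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(rule: dict[str, Any], meta: dict[str, Any]) -> bool:
    """Evaluate a single comparison or an AND/OR group against a metadata dict."""
    if "conditions" in rule:
        results = [matches(cond, meta) for cond in rule["conditions"]]
        if rule["operator"] == "AND":
            return all(results)
        if rule["operator"] == "OR":
            return any(results)
        raise ValueError(f"Unsupported logical operator: {rule['operator']}")
    field = rule["field"].removeprefix("meta.")
    if field not in meta:
        return False
    return COMPARATORS[rule["operator"]](meta[field], rule["value"])


def route(rules: dict[str, dict], metas: list[dict[str, Any]]) -> dict[str, list]:
    """Group metadata dicts by the rules they satisfy; the rest go to 'unmatched'."""
    output: dict[str, list] = {name: [] for name in rules}
    output["unmatched"] = []
    for meta in metas:
        matched = [name for name, rule in rules.items() if matches(rule, meta)]
        for name in matched:
            output[name].append(meta)
        if not matched:
            output["unmatched"].append(meta)
    return output


rules = {
    "edge_1": {
        "operator": "AND",
        "conditions": [
            # ISO date strings compare correctly as plain strings.
            {"field": "meta.created_at", "operator": ">=", "value": "2023-01-01"},
            {"field": "meta.created_at", "operator": "<", "value": "2023-04-01"},
        ],
    },
    "en": {"field": "meta.language", "operator": "==", "value": "en"},
}
metas = [
    {"created_at": "2023-02-15", "language": "de"},
    {"created_at": "2023-05-01", "language": "en"},
    {"created_at": "2024-01-01"},
]
result = route(rules, metas)
for name, group in result.items():
    print(name, group)
```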
 807  
 808  <a id="metadata_router.MetadataRouter.to_dict"></a>
 809  
 810  #### MetadataRouter.to\_dict
 811  
 812  ```python
 813  def to_dict() -> dict[str, Any]
 814  ```
 815  
 816  Serialize this component to a dictionary.
 817  
 818  **Returns**:
 819  
 820  The serialized component as a dictionary.
 821  
 822  <a id="metadata_router.MetadataRouter.from_dict"></a>
 823  
 824  #### MetadataRouter.from\_dict
 825  
 826  ```python
 827  @classmethod
 828  def from_dict(cls, data: dict[str, Any]) -> "MetadataRouter"
 829  ```
 830  
 831  Deserialize this component from a dictionary.
 832  
 833  **Arguments**:
 834  
 835  - `data`: The dictionary representation of this component.
 836  
 837  **Returns**:
 838  
 839  The deserialized component instance.
 840  
 841  <a id="text_language_router"></a>
 842  
 843  ## Module text\_language\_router
 844  
 845  <a id="text_language_router.TextLanguageRouter"></a>
 846  
 847  ### TextLanguageRouter
 848  
 849  Routes text strings to different output connections based on their language.
 850  
 851  Provide a list of languages during initialization. If the text doesn't match any of the
 852  specified languages, it is routed to the "unmatched" output.
 853  For routing documents based on their language, use the DocumentLanguageClassifier component,
 854  followed by the MetadataRouter.
 855  
 856  ### Usage example
 857  
 858  ```python
 859  from haystack import Pipeline, Document
 860  from haystack.components.routers import TextLanguageRouter
 861  from haystack.document_stores.in_memory import InMemoryDocumentStore
 862  from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
 863  
 864  document_store = InMemoryDocumentStore()
 865  document_store.write_documents([Document(content="Elvis Presley was an American singer and actor.")])
 866  
 867  p = Pipeline()
 868  p.add_component(instance=TextLanguageRouter(languages=["en"]), name="text_language_router")
 869  p.add_component(instance=InMemoryBM25Retriever(document_store=document_store), name="retriever")
 870  p.connect("text_language_router.en", "retriever.query")
 871  
 872  result = p.run({"text_language_router": {"text": "Who was Elvis Presley?"}})
 873  assert result["retriever"]["documents"][0].content == "Elvis Presley was an American singer and actor."
 874  
 875  result = p.run({"text_language_router": {"text": "ένα ελληνικό κείμενο"}})
 876  assert result["text_language_router"]["unmatched"] == "ένα ελληνικό κείμενο"
 877  ```
 878  
 879  <a id="text_language_router.TextLanguageRouter.__init__"></a>
 880  
 881  #### TextLanguageRouter.\_\_init\_\_
 882  
 883  ```python
 884  def __init__(languages: list[str] | None = None)
 885  ```
 886  
 887  Initialize the TextLanguageRouter component.
 888  
 889  **Arguments**:
 890  
 891  - `languages`: A list of ISO language codes.
 892  See the supported languages in the [`langdetect` documentation](https://github.com/Mimino666/langdetect#languages).
 893  If not specified, defaults to `["en"]`.
 894  
 895  <a id="text_language_router.TextLanguageRouter.run"></a>
 896  
 897  #### TextLanguageRouter.run
 898  
 899  ```python
 900  def run(text: str) -> dict[str, str]
 901  ```
 902  
 903  Routes the text strings to different output connections based on their language.
 904  
 905  If the text doesn't match any of the specified languages, it is routed to the `"unmatched"` output.
 906  
 907  **Arguments**:
 908  
 909  - `text`: A text string to route.
 910  
 911  **Raises**:
 912  
 913  - `TypeError`: If the input is not a string.
 914  
 915  **Returns**:
 916  
 917  A dictionary in which the key is the language (or `"unmatched"`),
 918  and the value is the text.
 919  
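The routing rule described above can be sketched with the standard library alone. Both `route_by_language` and the `fake_detect` stand-in detector are hypothetical names introduced for this example; the real component relies on `langdetect`, which is not used here.

```python
def route_by_language(text, languages, detect):
    """Simplified emulation of the routing rule; `detect` is a hypothetical language detector."""
    if not isinstance(text, str):
        raise TypeError("route_by_language expects a str as input.")
    detected = detect(text)
    # Only languages listed at initialization get their own output; the rest is "unmatched".
    key = detected if detected in languages else "unmatched"
    return {key: text}


def fake_detect(text):
    # Stand-in detector for illustration: pure-ASCII text counts as English, anything else as Greek.
    return "en" if text.isascii() else "el"


english = route_by_language("Who was Elvis Presley?", ["en"], fake_detect)
greek = route_by_language("ένα ελληνικό κείμενο", ["en"], fake_detect)
```

In this sketch `english` is keyed by `"en"` and `greek` by `"unmatched"`, mirroring the single-key dictionary that `run` returns.
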
 920  <a id="transformers_text_router"></a>
 921  
 922  ## Module transformers\_text\_router
 923  
 924  <a id="transformers_text_router.TransformersTextRouter"></a>
 925  
 926  ### TransformersTextRouter
 927  
 928  Routes the text strings to different connections based on a category label.
 929  
 930  The labels are specific to each model and can be found in its description on Hugging Face.
 931  
 932  ### Usage example
 933  
 934  ```python
 935  from haystack.core.pipeline import Pipeline
 936  from haystack.components.routers import TransformersTextRouter
 937  from haystack.components.builders import PromptBuilder
 938  from haystack.components.generators import HuggingFaceLocalGenerator
 939  
 940  p = Pipeline()
 941  p.add_component(
 942      instance=TransformersTextRouter(model="papluca/xlm-roberta-base-language-detection"),
 943      name="text_router"
 944  )
 945  p.add_component(
 946      instance=PromptBuilder(template="Answer the question: {{query}}\nAnswer:"),
 947      name="english_prompt_builder"
 948  )
 949  p.add_component(
 950      instance=PromptBuilder(template="Beantworte die Frage: {{query}}\nAntwort:"),
 951      name="german_prompt_builder"
 952  )
 953  
 954  p.add_component(
 955      instance=HuggingFaceLocalGenerator(model="DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1"),
 956      name="german_llm"
 957  )
 958  p.add_component(
 959      instance=HuggingFaceLocalGenerator(model="microsoft/Phi-3-mini-4k-instruct"),
 960      name="english_llm"
 961  )
 962  
 963  p.connect("text_router.en", "english_prompt_builder.query")
 964  p.connect("text_router.de", "german_prompt_builder.query")
 965  p.connect("english_prompt_builder.prompt", "english_llm.prompt")
 966  p.connect("german_prompt_builder.prompt", "german_llm.prompt")
 967  
 968  # English Example
 969  print(p.run({"text_router": {"text": "What is the capital of Germany?"}}))
 970  
 971  # German Example
 972  print(p.run({"text_router": {"text": "Was ist die Hauptstadt von Deutschland?"}}))
 973  ```
 974  
 975  <a id="transformers_text_router.TransformersTextRouter.__init__"></a>
 976  
 977  #### TransformersTextRouter.\_\_init\_\_
 978  
 979  ```python
 980  def __init__(model: str,
 981               labels: list[str] | None = None,
 982               device: ComponentDevice | None = None,
 983               token: Secret | None = Secret.from_env_var(
 984                   ["HF_API_TOKEN", "HF_TOKEN"], strict=False),
 985               huggingface_pipeline_kwargs: dict[str, Any] | None = None)
 986  ```
 987  
 988  Initializes the TransformersTextRouter component.
 989  
 990  **Arguments**:
 991  
 992  - `model`: The name or path of a Hugging Face model for text classification.
 993  - `labels`: The list of labels. If not provided, the component fetches the labels
 994  from the model configuration file hosted on the Hugging Face Hub using
 995  `transformers.AutoConfig.from_pretrained`.
 996  - `device`: The device for loading the model. If `None`, automatically selects the default device.
 997  If a device or device map is specified in `huggingface_pipeline_kwargs`, it overrides this parameter.
 998  - `token`: The API token used to download private models from Hugging Face.
 999  By default, the token is read from the `HF_API_TOKEN` or `HF_TOKEN` environment variable.
1000  To generate these tokens, run `transformers-cli login`.
1001  - `huggingface_pipeline_kwargs`: A dictionary of keyword arguments for initializing the Hugging Face
1002  text classification pipeline.
1003  
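The default `token` value names two environment variables and sets `strict=False`. A minimal sketch of that lookup, assuming the variables are tried in the listed order and that a non-strict lookup yields `None` when neither is set, could look like this. `resolve_token` is a hypothetical helper, not part of the library.

```python
def resolve_token(environ, env_vars=("HF_API_TOKEN", "HF_TOKEN"), strict=False):
    """Return the first value found in `environ` for `env_vars`, or None when non-strict."""
    for name in env_vars:
        value = environ.get(name)
        if value is not None:
            return value
    if strict:
        raise ValueError(f"None of the environment variables {env_vars} are set.")
    return None


# Placeholder values for illustration only.
both = resolve_token({"HF_API_TOKEN": "first", "HF_TOKEN": "second"})
fallback = resolve_token({"HF_TOKEN": "second"})
missing = resolve_token({})
```

Outside of this sketch, the mapping passed in would be `os.environ`.
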
1004  <a id="transformers_text_router.TransformersTextRouter.warm_up"></a>
1005  
1006  #### TransformersTextRouter.warm\_up
1007  
1008  ```python
1009  def warm_up()
1010  ```
1011  
1012  Initializes the component by loading the Hugging Face text classification pipeline.
1013  
1014  <a id="transformers_text_router.TransformersTextRouter.to_dict"></a>
1015  
1016  #### TransformersTextRouter.to\_dict
1017  
1018  ```python
1019  def to_dict() -> dict[str, Any]
1020  ```
1021  
1022  Serializes the component to a dictionary.
1023  
1024  **Returns**:
1025  
1026  Dictionary with serialized data.
1027  
1028  <a id="transformers_text_router.TransformersTextRouter.from_dict"></a>
1029  
1030  #### TransformersTextRouter.from\_dict
1031  
1032  ```python
1033  @classmethod
1034  def from_dict(cls, data: dict[str, Any]) -> "TransformersTextRouter"
1035  ```
1036  
1037  Deserializes the component from a dictionary.
1038  
1039  **Arguments**:
1040  
1041  - `data`: Dictionary to deserialize from.
1042  
1043  **Returns**:
1044  
1045  Deserialized component.
1046  
1047  <a id="transformers_text_router.TransformersTextRouter.run"></a>
1048  
1049  #### TransformersTextRouter.run
1050  
1051  ```python
1052  def run(text: str) -> dict[str, str]
1053  ```
1054  
1055  Routes the text strings to different connections based on a category label.
1056  
1057  **Arguments**:
1058  
1059  - `text`: A string of text to route.
1060  
1061  **Raises**:
1062  
1063  - `TypeError`: If the input is not a str.
1064  
1065  **Returns**:
1066  
1067  A dictionary with the label as key and the text as value.
1068  
1069  <a id="zero_shot_text_router"></a>
1070  
1071  ## Module zero\_shot\_text\_router
1072  
1073  <a id="zero_shot_text_router.TransformersZeroShotTextRouter"></a>
1074  
1075  ### TransformersZeroShotTextRouter
1076  
1077  Routes the text strings to different connections based on a category label.
1078  
1079  Specify the set of labels for categorization when initializing the component.
1080  
1081  ### Usage example
1082  
1083  ```python
1084  from haystack import Document
1085  from haystack.document_stores.in_memory import InMemoryDocumentStore
1086  from haystack.core.pipeline import Pipeline
1087  from haystack.components.routers import TransformersZeroShotTextRouter
1088  from haystack.components.embedders import SentenceTransformersTextEmbedder, SentenceTransformersDocumentEmbedder
1089  from haystack.components.retrievers import InMemoryEmbeddingRetriever
1090  
1091  document_store = InMemoryDocumentStore()
1092  doc_embedder = SentenceTransformersDocumentEmbedder(model="intfloat/e5-base-v2")
1093  doc_embedder.warm_up()
1094  docs = [
1095      Document(
1096          content="Germany, officially the Federal Republic of Germany, is a country in the western region of "
1097          "Central Europe. The nation's capital and most populous city is Berlin and its main financial centre "
1098          "is Frankfurt; the largest urban area is the Ruhr."
1099      ),
1100      Document(
1101          content="France, officially the French Republic, is a country located primarily in Western Europe. "
1102          "France is a unitary semi-presidential republic with its capital in Paris, the country's largest city "
1103          "and main cultural and commercial centre; other major urban areas include Marseille, Lyon, Toulouse, "
1104          "Lille, Bordeaux, Strasbourg, Nantes and Nice."
1105      )
1106  ]
1107  docs_with_embeddings = doc_embedder.run(docs)
1108  document_store.write_documents(docs_with_embeddings["documents"])
1109  
1110  p = Pipeline()
1111  p.add_component(instance=TransformersZeroShotTextRouter(labels=["passage", "query"]), name="text_router")
1112  p.add_component(
1113      instance=SentenceTransformersTextEmbedder(model="intfloat/e5-base-v2", prefix="passage: "),
1114      name="passage_embedder"
1115  )
1116  p.add_component(
1117      instance=SentenceTransformersTextEmbedder(model="intfloat/e5-base-v2", prefix="query: "),
1118      name="query_embedder"
1119  )
1120  p.add_component(
1121      instance=InMemoryEmbeddingRetriever(document_store=document_store),
1122      name="query_retriever"
1123  )
1124  p.add_component(
1125      instance=InMemoryEmbeddingRetriever(document_store=document_store),
1126      name="passage_retriever"
1127  )
1128  
1129  p.connect("text_router.passage", "passage_embedder.text")
1130  p.connect("passage_embedder.embedding", "passage_retriever.query_embedding")
1131  p.connect("text_router.query", "query_embedder.text")
1132  p.connect("query_embedder.embedding", "query_retriever.query_embedding")
1133  
1134  # Query Example
1135  p.run({"text_router": {"text": "What is the capital of Germany?"}})
1136  
1137  # Passage Example
1138  p.run({
1139      "text_router": {
1140          "text": "The United Kingdom of Great Britain and Northern Ireland, commonly known as the United Kingdom (UK) or Britain, is a country in Northwestern Europe, off the north-western coast of the continental mainland."
1141      }
1142  })
1143  ```
1144  
1145  <a id="zero_shot_text_router.TransformersZeroShotTextRouter.__init__"></a>
1146  
1147  #### TransformersZeroShotTextRouter.\_\_init\_\_
1148  
1149  ```python
1150  def __init__(labels: list[str],
1151               multi_label: bool = False,
1152               model: str = "MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33",
1153               device: ComponentDevice | None = None,
1154               token: Secret | None = Secret.from_env_var(
1155                   ["HF_API_TOKEN", "HF_TOKEN"], strict=False),
1156               huggingface_pipeline_kwargs: dict[str, Any] | None = None)
1157  ```
1158  
1159  Initializes the TransformersZeroShotTextRouter component.
1160  
1161  **Arguments**:
1162  
1163  - `labels`: The set of labels to use for classification. Can be a single label,
1164  a string of comma-separated labels, or a list of labels.
1165  - `multi_label`: Indicates if multiple labels can be true.
1166  If `False`, label scores are normalized so their sum equals 1 for each sequence.
1167  If `True`, the labels are considered independent and probabilities are normalized for each candidate by
1168  doing a softmax of the entailment score vs. the contradiction score.
1169  - `model`: The name or path of a Hugging Face model for zero-shot text classification.
1170  - `device`: The device for loading the model. If `None`, automatically selects the default device.
1171  If a device or device map is specified in `huggingface_pipeline_kwargs`, it overrides this parameter.
1172  - `token`: The API token used to download private models from Hugging Face.
1173  By default, the token is read from the `HF_API_TOKEN` or `HF_TOKEN` environment variable.
1174  To generate these tokens, run `transformers-cli login`.
1175  - `huggingface_pipeline_kwargs`: A dictionary of keyword arguments for initializing the Hugging Face
1176  zero-shot text classification pipeline.
1177  
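The two normalization modes of `multi_label` can be compared with a small worked example. The logits below are made-up numbers, and the `(contradiction, entailment)` pair layout is an assumption chosen to match the description above rather than the model's actual output format.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical (contradiction, entailment) logits per candidate label.
logits = {"passage": (-1.0, 2.0), "query": (0.5, 1.0)}

# multi_label=False: one softmax over the entailment logits of all labels, so scores sum to 1.
single = dict(zip(logits, softmax([entail for _, entail in logits.values()])))

# multi_label=True: each label gets its own softmax of entailment vs. contradiction.
multi = {label: softmax([contra, entail])[1] for label, (contra, entail) in logits.items()}
```

In the first mode the labels compete for a shared probability mass. In the second mode each score is independent, so the scores do not have to sum to 1.
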
1178  <a id="zero_shot_text_router.TransformersZeroShotTextRouter.warm_up"></a>
1179  
1180  #### TransformersZeroShotTextRouter.warm\_up
1181  
1182  ```python
1183  def warm_up()
1184  ```
1185  
1186  Initializes the component by loading the Hugging Face zero-shot classification pipeline.
1187  
1188  <a id="zero_shot_text_router.TransformersZeroShotTextRouter.to_dict"></a>
1189  
1190  #### TransformersZeroShotTextRouter.to\_dict
1191  
1192  ```python
1193  def to_dict() -> dict[str, Any]
1194  ```
1195  
1196  Serializes the component to a dictionary.
1197  
1198  **Returns**:
1199  
1200  Dictionary with serialized data.
1201  
1202  <a id="zero_shot_text_router.TransformersZeroShotTextRouter.from_dict"></a>
1203  
1204  #### TransformersZeroShotTextRouter.from\_dict
1205  
1206  ```python
1207  @classmethod
1208  def from_dict(cls, data: dict[str, Any]) -> "TransformersZeroShotTextRouter"
1209  ```
1210  
1211  Deserializes the component from a dictionary.
1212  
1213  **Arguments**:
1214  
1215  - `data`: Dictionary to deserialize from.
1216  
1217  **Returns**:
1218  
1219  Deserialized component.
1220  
1221  <a id="zero_shot_text_router.TransformersZeroShotTextRouter.run"></a>
1222  
1223  #### TransformersZeroShotTextRouter.run
1224  
1225  ```python
1226  def run(text: str) -> dict[str, str]
1227  ```
1228  
1229  Routes the text strings to different connections based on a category label.
1230  
1231  **Arguments**:
1232  
1233  - `text`: A string of text to route.
1234  
1235  **Raises**:
1236  
1237  - `TypeError`: If the input is not a str.
1238  
1239  **Returns**:
1240  
1241  A dictionary with the label as key and the text as value.
1242
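
How a classification result becomes a single-key output can be sketched as follows. The `pick_route` helper is hypothetical, and the `labels`/`scores` layout of `prediction` is an assumption modeled on typical zero-shot classification output; real scores come from the model.

```python
def pick_route(text, prediction):
    """Simplified emulation: turn a zero-shot prediction into a single-route output."""
    if not isinstance(text, str):
        raise TypeError("pick_route expects a str as input.")
    scores = prediction["scores"]
    # Choose the label with the highest score; its name becomes the output connection.
    best = max(range(len(scores)), key=scores.__getitem__)
    return {prediction["labels"][best]: text}


# Hypothetical prediction for illustration only.
prediction = {"labels": ["passage", "query"], "scores": [0.17, 0.83]}
routed = pick_route("What is the capital of Germany?", prediction)
```

Because the output has exactly one key, only the component connected to that label receives the text.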