   1  ---
   2  title: "Routers"
   3  id: routers-api
   4  description: "Routers is a group of components that route queries or Documents to other components that can handle them best."
   5  slug: "/routers-api"
   6  ---
   7  
   8  
   9  ## conditional_router
  10  
  11  ### NoRouteSelectedException
  12  
  13  Bases: <code>Exception</code>
  14  
  15  Exception raised when no route is selected in ConditionalRouter.
  16  
  17  ### RouteConditionException
  18  
  19  Bases: <code>Exception</code>
  20  
  21  Exception raised when there is an error parsing or evaluating the condition expression in ConditionalRouter.
  22  
  23  ### ConditionalRouter
  24  
  25  Routes data based on specific conditions.
  26  
  27  You define these conditions in a list of dictionaries called `routes`.
  28  Each dictionary in this list represents a single route. Each route has these four elements:
  29  
  30  - `condition`: A Jinja2 string expression that determines if the route is selected.
  31  - `output`: A Jinja2 expression defining the route's output value.
  32  - `output_type`: The type of the output data (for example, `str`, `list[int]`).
  33  - `output_name`: The name you want to use to publish `output`. This name is used to connect
  34    the router to other components in the pipeline.
  35  
  36  ### Usage example
  37  
  38  ```python
  39  from haystack.components.routers import ConditionalRouter
  40  
  41  routes = [
  42      {
  43          "condition": "{{streams|length > 2}}",
  44          "output": "{{streams}}",
  45          "output_name": "enough_streams",
  46          "output_type": list[int],
  47      },
  48      {
  49          "condition": "{{streams|length <= 2}}",
  50          "output": "{{streams}}",
  51          "output_name": "insufficient_streams",
  52          "output_type": list[int],
  53      },
  54  ]
  55  router = ConditionalRouter(routes)
  56  # When 'streams' has more than 2 items, 'enough_streams' output will activate, emitting the list [1, 2, 3]
  57  kwargs = {"streams": [1, 2, 3], "query": "Haystack"}
  58  result = router.run(**kwargs)
  59  assert result == {"enough_streams": [1, 2, 3]}
  60  ```
  61  
  62  In this example, we configure two routes. The first route sends the 'streams' value to 'enough_streams' if the
  63  stream count exceeds two. The second route directs 'streams' to 'insufficient_streams' if there
  64  are two or fewer streams.
  65  
  66  In the pipeline setup, the Router connects to other components using the output names. For example,
  67  'enough_streams' might connect to a component that processes streams, while
  68  'insufficient_streams' might connect to a component that fetches more streams.
  69  
Here is a pipeline that uses `ConditionalRouter` to select a different output depending on the
value of `count`:

```python
from haystack import Pipeline
from haystack.components.routers import ConditionalRouter

routes = [
    {
        "condition": "{{count > 5}}",
        "output": "Processing many items",
        "output_name": "many_items",
        "output_type": str,
    },
    {
        "condition": "{{count <= 5}}",
        "output": "Processing few items",
        "output_name": "few_items",
        "output_type": str,
    },
]

pipe = Pipeline()
pipe.add_component("router", ConditionalRouter(routes))

# Run with count > 5
result = pipe.run({"router": {"count": 10}})
print(result)
# >> {'router': {'many_items': 'Processing many items'}}

# Run with count <= 5
result = pipe.run({"router": {"count": 3}})
print(result)
# >> {'router': {'few_items': 'Processing few items'}}
```
 104  
 105  #### __init__
 106  
 107  ```python
 108  __init__(
 109      routes: list[Route],
 110      custom_filters: dict[str, Callable] | None = None,
 111      unsafe: bool = False,
 112      validate_output_type: bool = False,
 113      optional_variables: list[str] | None = None,
 114  ) -> None
 115  ```
 116  
 117  Initializes the `ConditionalRouter` with a list of routes detailing the conditions for routing.
 118  
 119  **Parameters:**
 120  
- **routes** (<code>list\[Route\]</code>) – A list of dictionaries, each defining a route.
  Each route has these four elements:
  - `condition`: A Jinja2 string expression that determines if the route is selected.
  - `output`: A Jinja2 expression defining the route's output value.
  - `output_type`: The type of the output data (for example, `str`, `list[int]`).
  - `output_name`: The name you want to use to publish `output`. This name is used to connect
    the router to other components in the pipeline.
- **custom_filters** (<code>dict\[str, Callable\] | None</code>) – A dictionary of custom Jinja2 filters used in the condition expressions.
  For example, passing `{"my_filter": my_filter_fcn}` where:
  - `my_filter` is the name of the custom filter.
  - `my_filter_fcn` is a callable that takes `my_var:str` and returns `my_var[:3]`.

  `{{ my_var|my_filter }}` can then be used inside a route condition expression:
  `"condition": "{{ my_var|my_filter == 'foo' }}"`.
- **unsafe** (<code>bool</code>) – Enable execution of arbitrary code in the Jinja template.
  Use this only if you trust the source of the template, as it can lead to remote code execution.
- **validate_output_type** (<code>bool</code>) – Enable validation of routes' output.
  If a route output doesn't match the declared type, a `ValueError` is raised at runtime.
 138  - **optional_variables** (<code>list\[str\] | None</code>) – A list of variable names that are optional in your route conditions and outputs.
 139    If these variables are not provided at runtime, they will be set to `None`.
 140    This allows you to write routes that can handle missing inputs gracefully without raising errors.
 141  
 142  Example usage with a default fallback route in a Pipeline:
 143  
 144  ```python
 145  from haystack import Pipeline
 146  from haystack.components.routers import ConditionalRouter
 147  
 148  routes = [
 149      {
 150          "condition": '{{ path == "rag" }}',
 151          "output": "{{ question }}",
 152          "output_name": "rag_route",
 153          "output_type": str
 154      },
 155      {
 156          "condition": "{{ True }}",  # fallback route
 157          "output": "{{ question }}",
 158          "output_name": "default_route",
 159          "output_type": str
 160      }
 161  ]
 162  
 163  router = ConditionalRouter(routes, optional_variables=["path"])
 164  pipe = Pipeline()
 165  pipe.add_component("router", router)
 166  
 167  # When 'path' is provided in the pipeline:
 168  result = pipe.run(data={"router": {"question": "What?", "path": "rag"}})
 169  assert result["router"] == {"rag_route": "What?"}
 170  
 171  # When 'path' is not provided, fallback route is taken:
 172  result = pipe.run(data={"router": {"question": "What?"}})
 173  assert result["router"] == {"default_route": "What?"}
 174  ```
 175  
 176  This pattern is particularly useful when:
 177  
 178  - You want to provide default/fallback behavior when certain inputs are missing
 179  - Some variables are only needed for specific routing conditions
 180  - You're building flexible pipelines where not all inputs are guaranteed to be present
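
The `custom_filters` parameter described above can be sketched as follows. This is a minimal illustration that only defines the filter callable and the route dictionaries; the output names `starts_with_foo` and `other` are hypothetical, and the final `ConditionalRouter` call is left as a comment so the snippet runs without Haystack installed.

```python
def my_filter_fcn(my_var: str) -> str:
    """Return the first three characters of the input string."""
    return my_var[:3]


# Maps the filter name used in templates to the callable that implements it.
custom_filters = {"my_filter": my_filter_fcn}

routes = [
    {
        # Selected when the first three characters of my_var are 'foo'.
        "condition": "{{ my_var|my_filter == 'foo' }}",
        "output": "{{ my_var }}",
        "output_name": "starts_with_foo",  # hypothetical output name
        "output_type": str,
    },
    {
        "condition": "{{ True }}",  # fallback route
        "output": "{{ my_var }}",
        "output_name": "other",  # hypothetical output name
        "output_type": str,
    },
]

# With Haystack installed, the filter would be registered like this:
# router = ConditionalRouter(routes, custom_filters=custom_filters)
```

Because routes are evaluated in order, the fallback route is listed last so that it is only selected when the filter-based condition does not hold.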
 181  
 182  #### to_dict
 183  
 184  ```python
 185  to_dict() -> dict[str, Any]
 186  ```
 187  
 188  Serializes the component to a dictionary.
 189  
 190  **Returns:**
 191  
 192  - <code>dict\[str, Any\]</code> – Dictionary with serialized data.
 193  
 194  #### from_dict
 195  
 196  ```python
 197  from_dict(data: dict[str, Any]) -> ConditionalRouter
 198  ```
 199  
 200  Deserializes the component from a dictionary.
 201  
 202  **Parameters:**
 203  
 204  - **data** (<code>dict\[str, Any\]</code>) – The dictionary to deserialize from.
 205  
 206  **Returns:**
 207  
 208  - <code>ConditionalRouter</code> – The deserialized component.
 209  
 210  #### run
 211  
 212  ```python
 213  run(**kwargs: Any) -> dict[str, Any]
 214  ```
 215  
Executes the routing logic.

Evaluates the boolean `condition` expression of each route in the order the routes are listed and directs
the data to the output of the first route whose `condition` is `True`.
 221  
 222  **Parameters:**
 223  
 224  - **kwargs** (<code>Any</code>) – All variables used in the `condition` expressed in the routes. When the component is used in a
 225    pipeline, these variables are passed from the previous component's output.
 226  
 227  **Returns:**
 228  
 229  - <code>dict\[str, Any\]</code> – A dictionary where the key is the `output_name` of the selected route and the value is the `output`
 230    of the selected route.
 231  
 232  **Raises:**
 233  
- <code>NoRouteSelectedException</code> – If no `condition` in the routes is `True`.
 235  - <code>RouteConditionException</code> – If there is an error parsing or evaluating the `condition` expression in the routes.
 236  - <code>ValueError</code> – If type validation is enabled and route type doesn't match actual value type.
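
The first-match behavior described for `run` can be sketched in plain Python. This is an illustration only, not Haystack's implementation: it assumes conditions and outputs are ordinary callables instead of Jinja2 templates, and `select_route` and `NoRouteSelected` are hypothetical names standing in for the router and its exception.

```python
from typing import Any, Callable


class NoRouteSelected(Exception):
    """Stand-in for NoRouteSelectedException in this sketch."""


def select_route(routes: list[dict[str, Any]], **kwargs: Any) -> dict[str, Any]:
    """Return {output_name: output} for the first route whose condition holds."""
    for route in routes:
        condition: Callable[..., bool] = route["condition"]
        if condition(**kwargs):
            # Stop at the first match; later routes are never evaluated.
            return {route["output_name"]: route["output"](**kwargs)}
    raise NoRouteSelected(f"No route matched inputs: {sorted(kwargs)}")


routes = [
    {
        "condition": lambda streams, **_: len(streams) > 2,
        "output": lambda streams, **_: streams,
        "output_name": "enough_streams",
    },
    {
        "condition": lambda streams, **_: len(streams) <= 2,
        "output": lambda streams, **_: streams,
        "output_name": "insufficient_streams",
    },
]

print(select_route(routes, streams=[1, 2, 3], query="Haystack"))
# {'enough_streams': [1, 2, 3]}
print(select_route(routes, streams=[1]))
# {'insufficient_streams': [1]}
```

Only one key is ever returned, which is why the order of the routes matters when several conditions could be true for the same input.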
 237  
 238  ## document_length_router
 239  
 240  ### DocumentLengthRouter
 241  
 242  Categorizes documents based on the length of the `content` field and routes them to the appropriate output.
 243  
 244  A common use case for DocumentLengthRouter is handling documents obtained from PDFs that contain non-text
 245  content, such as scanned pages or images. This component can detect empty or low-content documents and route them to
 246  components that perform OCR, generate captions, or compute image embeddings.
 247  
 248  ### Usage example
 249  
 250  ```python
 251  from haystack.components.routers import DocumentLengthRouter
 252  from haystack.dataclasses import Document
 253  
 254  docs = [
 255      Document(content="Short"),
 256      Document(content="Long document "*20),
 257  ]
 258  
 259  router = DocumentLengthRouter(threshold=10)
 260  
 261  result = router.run(documents=docs)
 262  print(result)
 263  
 264  # {
 265  #     "short_documents": [Document(content="Short", ...)],
 266  #     "long_documents": [Document(content="Long document ...", ...)],
 267  # }
 268  ```
 269  
 270  #### __init__
 271  
 272  ```python
 273  __init__(*, threshold: int = 10) -> None
 274  ```
 275  
 276  Initialize the DocumentLengthRouter component.
 277  
 278  **Parameters:**
 279  
 280  - **threshold** (<code>int</code>) – The threshold for the number of characters in the document `content` field. Documents where `content` is
 281    None or whose character count is less than or equal to the threshold will be routed to the `short_documents`
 282    output. Otherwise, they will be routed to the `long_documents` output.
 283    To route only documents with None content to `short_documents`, set the threshold to a negative number.
 284  
 285  #### run
 286  
 287  ```python
 288  run(documents: list[Document]) -> dict[str, list[Document]]
 289  ```
 290  
 291  Categorize input documents into groups based on the length of the `content` field.
 292  
 293  **Parameters:**
 294  
 295  - **documents** (<code>list\[Document\]</code>) – A list of documents to be categorized.
 296  
 297  **Returns:**
 298  
- <code>dict\[str, list\[Document\]\]</code> – A dictionary with the following keys:
  - `short_documents`: A list of documents where `content` is None or the length of `content` is less than or
    equal to the threshold.
  - `long_documents`: A list of documents where the length of `content` is greater than the threshold.
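
The threshold rule can be sketched in plain Python as shown below. This is a sketch under stated assumptions, not the component's code: it works on raw `content` values instead of `Document` objects, and `split_by_length` is a hypothetical helper name.

```python
from typing import Optional


def split_by_length(
    contents: list[Optional[str]], threshold: int = 10
) -> dict[str, list[Optional[str]]]:
    """Group content values the way the threshold rule above describes."""
    result: dict[str, list[Optional[str]]] = {"short_documents": [], "long_documents": []}
    for content in contents:
        # None content always counts as short; otherwise compare the length.
        if content is None or len(content) <= threshold:
            result["short_documents"].append(content)
        else:
            result["long_documents"].append(content)
    return result


print(split_by_length(["Short", None, "Long document " * 20], threshold=10)["short_documents"])
# ['Short', None]

# With a negative threshold, only None content is treated as short.
print(split_by_length(["", None, "x"], threshold=-1))
# {'short_documents': [None], 'long_documents': ['', 'x']}
```

The second call shows why a negative threshold isolates documents without content: even an empty string has length 0, which is greater than -1.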
 303  
 304  ## document_type_router
 305  
 306  ### DocumentTypeRouter
 307  
 308  Routes documents by their MIME types.
 309  
 310  DocumentTypeRouter is used to dynamically route documents within a pipeline based on their MIME types.
 311  It supports exact MIME type matches and regex patterns.
 312  
 313  MIME types can be extracted directly from document metadata or inferred from file paths using standard or
 314  user-supplied MIME type mappings.
 315  
 316  ### Usage example
 317  
 318  ```python
 319  from haystack.components.routers import DocumentTypeRouter
 320  from haystack.dataclasses import Document
 321  
 322  docs = [
 323      Document(content="Example text", meta={"file_path": "example.txt"}),
 324      Document(content="Another document", meta={"mime_type": "application/pdf"}),
 325      Document(content="Unknown type")
 326  ]
 327  
 328  router = DocumentTypeRouter(
 329      mime_type_meta_field="mime_type",
 330      file_path_meta_field="file_path",
 331      mime_types=["text/plain", "application/pdf"]
 332  )
 333  
 334  result = router.run(documents=docs)
 335  print(result)
 336  ```
 337  
 338  Expected output:
 339  
 340  ```python
 341  {
 342      "text/plain": [Document(...)],
 343      "application/pdf": [Document(...)],
 344      "unclassified": [Document(...)]
 345  }
 346  ```
 347  
 348  #### __init__
 349  
 350  ```python
 351  __init__(
 352      *,
 353      mime_types: list[str],
 354      mime_type_meta_field: str | None = None,
 355      file_path_meta_field: str | None = None,
 356      additional_mimetypes: dict[str, str] | None = None
 357  ) -> None
 358  ```
 359  
 360  Initialize the DocumentTypeRouter component.
 361  
 362  **Parameters:**
 363  
 364  - **mime_types** (<code>list\[str\]</code>) – A list of MIME types or regex patterns to classify the input documents.
 365    (for example: `["text/plain", "audio/x-wav", "image/jpeg"]`).
 366  - **mime_type_meta_field** (<code>str | None</code>) – Optional name of the metadata field that holds the MIME type.
 367  - **file_path_meta_field** (<code>str | None</code>) – Optional name of the metadata field that holds the file path. Used to infer the MIME type if
 368    `mime_type_meta_field` is not provided or missing in a document.
 369  - **additional_mimetypes** (<code>dict\[str, str\] | None</code>) – Optional dictionary mapping MIME types to file extensions to enhance or override the standard
 370    `mimetypes` module. Useful when working with uncommon or custom file types.
 371    For example: `{"application/vnd.custom-type": ".custom"}`.
 372  
 373  **Raises:**
 374  
- <code>ValueError</code> – If `mime_types` is empty or if neither `mime_type_meta_field` nor `file_path_meta_field` is
  provided.
 377  
 378  #### run
 379  
 380  ```python
 381  run(documents: list[Document]) -> dict[str, list[Document]]
 382  ```
 383  
 384  Categorize input documents into groups based on their MIME type.
 385  
 386  MIME types can either be directly available in document metadata or derived from file paths using the
 387  standard Python `mimetypes` module and custom mappings.
 388  
 389  **Parameters:**
 390  
 391  - **documents** (<code>list\[Document\]</code>) – A list of documents to be categorized.
 392  
 393  **Returns:**
 394  
 395  - <code>dict\[str, list\[Document\]\]</code> – A dictionary where the keys are MIME types (or `"unclassified"`) and the values are lists of documents.
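
How a MIME type might be resolved from document metadata can be sketched with the standard `mimetypes` module. This is a minimal sketch, not Haystack's implementation: `resolve_mime_type` is a hypothetical helper, and the lookup order (explicit MIME type field first, then file path) is an assumption based on the parameter descriptions above.

```python
import mimetypes
from typing import Any, Optional


def resolve_mime_type(
    meta: dict[str, Any],
    mime_type_meta_field: Optional[str] = "mime_type",
    file_path_meta_field: Optional[str] = "file_path",
    additional_mimetypes: Optional[dict[str, str]] = None,
) -> Optional[str]:
    """Return the MIME type for one metadata dict, or None if it cannot be determined."""
    # 1. Prefer an explicit MIME type stored in the metadata.
    if mime_type_meta_field and meta.get(mime_type_meta_field):
        return meta[mime_type_meta_field]
    # 2. Otherwise infer it from the file extension.
    if file_path_meta_field and meta.get(file_path_meta_field):
        registry = mimetypes.MimeTypes()  # private registry, leaves global state untouched
        for mime_type, extension in (additional_mimetypes or {}).items():
            registry.add_type(mime_type, extension)
        guessed, _encoding = registry.guess_type(str(meta[file_path_meta_field]))
        return guessed
    return None


print(resolve_mime_type({"file_path": "example.txt"}))       # text/plain
print(resolve_mime_type({"mime_type": "application/pdf"}))   # application/pdf
print(resolve_mime_type({}))                                  # None
print(resolve_mime_type(
    {"file_path": "data.custom"},
    additional_mimetypes={"application/vnd.custom-type": ".custom"},
))                                                            # application/vnd.custom-type
```

A result of `None` corresponds to the case where a document would end up under `"unclassified"`.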
 396  
 397  ## file_type_router
 398  
 399  ### FileTypeRouter
 400  
 401  Categorizes files or byte streams by their MIME types, helping in context-based routing.
 402  
 403  FileTypeRouter supports both exact MIME type matching and regex patterns.
 404  
 405  For file paths, MIME types come from extensions, while byte streams use metadata.
You can use regex patterns in the `mime_types` parameter to set broad categories
(such as `audio/.*` or `text/.*`) or specific types.
 408  MIME types without regex patterns are treated as exact matches.
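
The difference between exact and regex matching can be sketched with the standard `re` module. This is an illustration under assumptions, not the component's code: it assumes each pattern must match the whole MIME type (`fullmatch`) and that the first matching pattern wins; `group_by_pattern` is a hypothetical helper that works on MIME type strings directly.

```python
import re
from collections import defaultdict


def group_by_pattern(mime_types: list[str], patterns: list[str]) -> dict[str, list[str]]:
    """Group MIME type strings under the first pattern that fully matches them."""
    compiled = [(text, re.compile(text)) for text in patterns]
    groups: dict[str, list[str]] = defaultdict(list)
    for mime_type in mime_types:
        for pattern_text, pattern in compiled:
            if pattern.fullmatch(mime_type):
                groups[pattern_text].append(mime_type)
                break
        else:
            # No pattern matched this MIME type.
            groups["unclassified"].append(mime_type)
    return dict(groups)


detected = ["text/plain", "application/pdf", "audio/mpeg"]

print(group_by_pattern(detected, ["text/plain", "application/pdf"]))
# {'text/plain': ['text/plain'], 'application/pdf': ['application/pdf'], 'unclassified': ['audio/mpeg']}

print(group_by_pattern(detected, [r"audio/.*", r"text/plain"]))
# {'text/plain': ['text/plain'], 'unclassified': ['application/pdf'], 'audio/.*': ['audio/mpeg']}
```

Note that these are regular expressions, not shell globs: under full matching, a pattern written as `audio/*` would not match `audio/mpeg`, whereas `audio/.*` does.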
 409  
 410  ### Usage example
 411  
 412  ```python
 413  from haystack.components.routers import FileTypeRouter
 414  from pathlib import Path
 415  
 416  # For exact MIME type matching
 417  router = FileTypeRouter(mime_types=["text/plain", "application/pdf"])
 418  
 419  # For flexible matching using regex, to handle all audio types
 420  router_with_regex = FileTypeRouter(mime_types=[r"audio/.*", r"text/plain"])
 421  
 422  sources = [Path("file.txt"), Path("document.pdf"), Path("song.mp3")]
 423  print(router.run(sources=sources))
 424  print(router_with_regex.run(sources=sources))
 425  
# Expected output:
# {'text/plain': [PosixPath('file.txt')],
#  'application/pdf': [PosixPath('document.pdf')],
#  'unclassified': [PosixPath('song.mp3')]}
# {'audio/.*': [PosixPath('song.mp3')],
#  'text/plain': [PosixPath('file.txt')],
#  'unclassified': [PosixPath('document.pdf')]}
 433  ```
 434  
 435  #### __init__
 436  
 437  ```python
 438  __init__(
 439      mime_types: list[str],
 440      additional_mimetypes: dict[str, str] | None = None,
 441      raise_on_failure: bool = False,
 442  ) -> None
 443  ```
 444  
 445  Initialize the FileTypeRouter component.
 446  
 447  **Parameters:**
 448  
 449  - **mime_types** (<code>list\[str\]</code>) – A list of MIME types or regex patterns to classify the input files or byte streams.
 450    (for example: `["text/plain", "audio/x-wav", "image/jpeg"]`).
- **additional_mimetypes** (<code>dict\[str, str\] | None</code>) – A dictionary mapping MIME types to file extensions. The entries are added to the
  `mimetypes` package so that file types it does not support natively are not left unclassified.
  (for example: `{"application/vnd.openxmlformats-officedocument.wordprocessingml.document": ".docx"}`).
 454  - **raise_on_failure** (<code>bool</code>) – If True, raises FileNotFoundError when a file path doesn't exist.
 455    If False (default), only emits a warning when a file path doesn't exist.
 456  
 457  #### to_dict
 458  
 459  ```python
 460  to_dict() -> dict[str, Any]
 461  ```
 462  
 463  Serializes the component to a dictionary.
 464  
 465  **Returns:**
 466  
 467  - <code>dict\[str, Any\]</code> – Dictionary with serialized data.
 468  
 469  #### from_dict
 470  
 471  ```python
 472  from_dict(data: dict[str, Any]) -> FileTypeRouter
 473  ```
 474  
 475  Deserializes the component from a dictionary.
 476  
 477  **Parameters:**
 478  
 479  - **data** (<code>dict\[str, Any\]</code>) – The dictionary to deserialize from.
 480  
 481  **Returns:**
 482  
 483  - <code>FileTypeRouter</code> – The deserialized component.
 484  
 485  #### run
 486  
 487  ```python
 488  run(
 489      sources: list[str | Path | ByteStream],
 490      meta: dict[str, Any] | list[dict[str, Any]] | None = None,
 491  ) -> dict[str, list[ByteStream | Path]]
 492  ```
 493  
 494  Categorize files or byte streams according to their MIME types.
 495  
 496  **Parameters:**
 497  
 498  - **sources** (<code>list\[str | Path | ByteStream\]</code>) – A list of file paths or byte streams to categorize.
 499  - **meta** (<code>dict\[str, Any\] | list\[dict\[str, Any\]\] | None</code>) – Optional metadata to attach to the sources.
 500    When provided, the sources are internally converted to ByteStream objects and the metadata is added.
 501    This value can be a list of dictionaries or a single dictionary.
 502    If it's a single dictionary, its content is added to the metadata of all ByteStream objects.
 503    If it's a list, its length must match the number of sources, as they are zipped together.
 504  
 505  **Returns:**
 506  
 507  - <code>dict\[str, list\[ByteStream | Path\]\]</code> – A dictionary where the keys are MIME types and the values are lists of data sources.
 508    Two extra keys may be returned: `"unclassified"` when a source's MIME type doesn't match any pattern
 509    and `"failed"` when a source cannot be processed (for example, a file path that doesn't exist).
 510  
 511  **Raises:**
 512  
 513  - <code>TypeError</code> – If a source is not a Path, str, or ByteStream.
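
The two accepted shapes of `meta` can be sketched as a small normalization step. This is a sketch only: `normalize_meta` is a hypothetical helper, and raising `ValueError` on a length mismatch is an assumption, since the reference above does not state which exception is used.

```python
from typing import Any, Optional, Union


def normalize_meta(
    meta: Optional[Union[dict[str, Any], list[dict[str, Any]]]],
    sources_count: int,
) -> list[dict[str, Any]]:
    """Return one metadata dict per source, ready to be zipped with the sources."""
    if meta is None:
        return [{} for _ in range(sources_count)]
    if isinstance(meta, dict):
        # A single dict is copied for every source.
        return [dict(meta) for _ in range(sources_count)]
    if len(meta) != sources_count:
        # Assumed error type for this sketch.
        raise ValueError("The length of the metadata list must match the number of sources.")
    return [dict(item) for item in meta]


sources = ["file.txt", "document.pdf"]
for source, source_meta in zip(sources, normalize_meta({"origin": "upload"}, len(sources))):
    print(source, source_meta)
# file.txt {'origin': 'upload'}
# document.pdf {'origin': 'upload'}
```

Copying the single dictionary for each source keeps later per-source changes from leaking into the other sources' metadata.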
 514  
 515  ## llm_messages_router
 516  
 517  ### LLMMessagesRouter
 518  
 519  Routes Chat Messages to different connections using a generative Language Model to perform classification.
 520  
 521  This component can be used with general-purpose LLMs and with specialized LLMs for moderation like Llama Guard.
 522  
 523  ### Usage example
 524  
 525  ```python
 526  from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
 527  from haystack.components.routers.llm_messages_router import LLMMessagesRouter
 528  from haystack.dataclasses import ChatMessage
 529  
 530  # initialize a Chat Generator with a generative model for moderation
 531  chat_generator = HuggingFaceAPIChatGenerator(
 532      api_type="serverless_inference_api",
 533      api_params={"model": "openai/gpt-oss-safeguard-20b", "provider": "groq"},
 534  )
 535  
 536  router = LLMMessagesRouter(chat_generator=chat_generator,
 537                              output_names=["unsafe", "safe"],
 538                              output_patterns=["unsafe", "safe"])
 539  
 540  
 541  print(router.run([ChatMessage.from_user("How to rob a bank?")]))
 542  
 543  # {
 544  #     'chat_generator_text': 'unsafe\nS2',
 545  #     'unsafe': [
 546  #         ChatMessage(
 547  #             _role=<ChatRole.USER: 'user'>,
 548  #             _content=[TextContent(text='How to rob a bank?')],
 549  #             _name=None,
 550  #             _meta={}
 551  #         )
 552  #     ]
 553  # }
 554  ```
 555  
 556  #### __init__
 557  
 558  ```python
 559  __init__(
 560      chat_generator: ChatGenerator,
 561      output_names: list[str],
 562      output_patterns: list[str],
 563      system_prompt: str | None = None,
 564  ) -> None
 565  ```
 566  
 567  Initialize the LLMMessagesRouter component.
 568  
 569  **Parameters:**
 570  
 571  - **chat_generator** (<code>ChatGenerator</code>) – A ChatGenerator instance which represents the LLM.
 572  - **output_names** (<code>list\[str\]</code>) – A list of output connection names. These can be used to connect the router to other
 573    components.
 574  - **output_patterns** (<code>list\[str\]</code>) – A list of regular expressions to be matched against the output of the LLM. Each pattern
 575    corresponds to an output name. Patterns are evaluated in order.
 576    When using moderation models, refer to the model card to understand the expected outputs.
 577  - **system_prompt** (<code>str | None</code>) – An optional system prompt to customize the behavior of the LLM.
 578    For moderation models, refer to the model card for supported customization options.
 579  
 580  **Raises:**
 581  
 582  - <code>ValueError</code> – If output_names and output_patterns are not non-empty lists of the same length.
 583  
 584  #### warm_up
 585  
 586  ```python
 587  warm_up() -> None
 588  ```
 589  
 590  Warm up the underlying LLM.
 591  
 592  #### run
 593  
 594  ```python
 595  run(messages: list[ChatMessage]) -> dict[str, str | list[ChatMessage]]
 596  ```
 597  
 598  Classify the messages based on LLM output and route them to the appropriate output connection.
 599  
 600  **Parameters:**
 601  
 602  - **messages** (<code>list\[ChatMessage\]</code>) – A list of ChatMessages to be routed. Only user and assistant messages are supported.
 603  
 604  **Returns:**
 605  
- <code>dict\[str, str | list\[ChatMessage\]\]</code> – A dictionary with the following keys:
  - `"chat_generator_text"`: The text output of the LLM, useful for debugging.
  - One key for each name in `output_names`: The list of messages that matched the corresponding pattern.
  - `"unmatched"`: The messages that did not match any of the output patterns.
 610  
 611  **Raises:**
 612  
 613  - <code>ValueError</code> – If messages is an empty list or contains messages with unsupported roles.
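
The pattern-matching step can be sketched with the standard `re` module. This is an illustration, not the component's code: `route_by_llm_output` is a hypothetical helper, messages are represented as plain strings, and the use of `re.search` (a match anywhere in the LLM text) is an assumption.

```python
import re
from typing import Any


def route_by_llm_output(
    llm_text: str,
    messages: list[str],
    output_names: list[str],
    output_patterns: list[str],
) -> dict[str, Any]:
    """Send all messages to the output whose pattern is the first to match the LLM text."""
    result: dict[str, Any] = {"chat_generator_text": llm_text}
    for name, pattern in zip(output_names, output_patterns):
        if re.search(pattern, llm_text):
            result[name] = messages
            return result
    result["unmatched"] = messages
    return result


messages = ["How to rob a bank?"]

print(route_by_llm_output("unsafe\nS2", messages, ["unsafe", "safe"], ["unsafe", "safe"]))
# {'chat_generator_text': 'unsafe\nS2', 'unsafe': ['How to rob a bank?']}

# Order matters: the pattern "safe" also occurs inside the word "unsafe".
print(route_by_llm_output("unsafe\nS2", messages, ["safe", "unsafe"], ["safe", "unsafe"]))
# {'chat_generator_text': 'unsafe\nS2', 'safe': ['How to rob a bank?']}
```

Under these assumptions, listing the more specific pattern first, or anchoring patterns (for example `^safe`), avoids the misrouting shown in the second call.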
 614  
 615  #### to_dict
 616  
 617  ```python
 618  to_dict() -> dict[str, Any]
 619  ```
 620  
 621  Serialize this component to a dictionary.
 622  
 623  **Returns:**
 624  
 625  - <code>dict\[str, Any\]</code> – The serialized component as a dictionary.
 626  
 627  #### from_dict
 628  
 629  ```python
 630  from_dict(data: dict[str, Any]) -> LLMMessagesRouter
 631  ```
 632  
 633  Deserialize this component from a dictionary.
 634  
 635  **Parameters:**
 636  
 637  - **data** (<code>dict\[str, Any\]</code>) – The dictionary representation of this component.
 638  
 639  **Returns:**
 640  
 641  - <code>LLMMessagesRouter</code> – The deserialized component instance.
 642  
 643  ## metadata_router
 644  
 645  ### MetadataRouter
 646  
 647  Routes documents or byte streams to different connections based on their metadata fields.
 648  
Specify the routing rules in the `__init__` method.
 650  If a document or byte stream does not match any of the rules, it's routed to a connection named "unmatched".
 651  
 652  ### Usage examples
 653  
 654  **Routing Documents by metadata:**
 655  
 656  ```python
 657  from haystack import Document
 658  from haystack.components.routers import MetadataRouter
 659  
docs = [Document(content="Paris is the capital of France.", meta={"language": "en"}),
        Document(content="Berlin ist die Hauptstadt von Deutschland.", meta={"language": "de"})]

router = MetadataRouter(rules={"en": {"field": "meta.language", "operator": "==", "value": "en"}})

print(router.run(documents=docs))
# {'en': [Document(id=..., content: 'Paris is the capital of France.', meta: {'language': 'en'})],
# 'unmatched': [Document(id=..., content: 'Berlin ist die Hauptstadt von Deutschland.', meta: {'language': 'de'})]}
 668  ```
 669  
 670  **Routing ByteStreams by metadata:**
 671  
 672  ```python
 673  from haystack.dataclasses import ByteStream
 674  from haystack.components.routers import MetadataRouter
 675  
 676  streams = [
 677      ByteStream.from_string("Hello world", meta={"language": "en"}),
 678      ByteStream.from_string("Bonjour le monde", meta={"language": "fr"})
 679  ]
 680  
 681  router = MetadataRouter(
 682      rules={"english": {"field": "meta.language", "operator": "==", "value": "en"}},
 683      output_type=list[ByteStream]
 684  )
 685  
 686  result = router.run(documents=streams)
 687  # {'english': [ByteStream(...)], 'unmatched': [ByteStream(...)]}
 688  ```
 689  
 690  #### __init__
 691  
 692  ```python
 693  __init__(rules: dict[str, dict], output_type: type = list[Document]) -> None
 694  ```
 695  
 696  Initializes the MetadataRouter component.
 697  
 698  **Parameters:**
 699  
 700  - **rules** (<code>dict\[str, dict\]</code>) – A dictionary defining how to route documents or byte streams to output connections based on their
 701    metadata. Keys are output connection names, and values are dictionaries of
 702    [filtering expressions](https://docs.haystack.deepset.ai/docs/metadata-filtering) in Haystack.
 703    For example:
 704  
 705  ```python
{
    "edge_1": {
        "operator": "AND",
        "conditions": [
            {"field": "meta.created_at", "operator": ">=", "value": "2023-01-01"},
            {"field": "meta.created_at", "operator": "<", "value": "2023-04-01"},
        ],
    },
    "edge_2": {
        "operator": "AND",
        "conditions": [
            {"field": "meta.created_at", "operator": ">=", "value": "2023-04-01"},
            {"field": "meta.created_at", "operator": "<", "value": "2023-07-01"},
        ],
    },
    "edge_3": {
        "operator": "AND",
        "conditions": [
            {"field": "meta.created_at", "operator": ">=", "value": "2023-07-01"},
            {"field": "meta.created_at", "operator": "<", "value": "2023-10-01"},
        ],
    },
    "edge_4": {
        "operator": "AND",
        "conditions": [
            {"field": "meta.created_at", "operator": ">=", "value": "2023-10-01"},
            {"field": "meta.created_at", "operator": "<", "value": "2024-01-01"},
        ],
    },
}
 736  ```
 737  
- **output_type** (<code>type</code>) – The type of the output produced. Lists of Documents or ByteStreams can be specified.
 739  
 740  #### run
 741  
 742  ```python
 743  run(
 744      documents: list[Document] | list[ByteStream],
 745  ) -> dict[str, list[Document] | list[ByteStream]]
 746  ```
 747  
 748  Routes documents or byte streams to different connections based on their metadata fields.
 749  
 750  If a document or byte stream does not match any of the rules, it's routed to a connection named "unmatched".
 751  
 752  **Parameters:**
 753  
 754  - **documents** (<code>list\[Document\] | list\[ByteStream\]</code>) – A list of `Document` or `ByteStream` objects to be routed based on their metadata.
 755  
 756  **Returns:**
 757  
 758  - <code>dict\[str, list\[Document\] | list\[ByteStream\]\]</code> – A dictionary where the keys are the names of the output connections (including `"unmatched"`)
 759    and the values are lists of `Document` or `ByteStream` objects that matched the corresponding rules.
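
How rules of this shape can be evaluated against metadata is sketched below in plain Python. This is a simplified illustration, not Haystack's filter engine: it handles only the comparison operators and `AND`/`OR` groups, compares ISO date strings lexicographically, represents each item by its metadata dictionary, and assumes an item is added to every output whose rule it matches. `matches` and `route` are hypothetical helper names.

```python
import operator
from typing import Any

COMPARATORS = {
    "==": operator.eq, "!=": operator.ne,
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
}


def matches(rule: dict[str, Any], meta: dict[str, Any]) -> bool:
    """Evaluate a comparison or an AND/OR group against one metadata dict."""
    if "conditions" in rule:
        results = [matches(condition, meta) for condition in rule["conditions"]]
        if rule["operator"] == "AND":
            return all(results)
        if rule["operator"] == "OR":
            return any(results)
        raise ValueError(f"Unsupported logical operator: {rule['operator']}")
    field = rule["field"].removeprefix("meta.")
    if field not in meta:
        return False
    return COMPARATORS[rule["operator"]](meta[field], rule["value"])


def route(items: list[dict[str, Any]], rules: dict[str, dict]) -> dict[str, list]:
    """Group metadata dicts by matching rule; items without a match go to 'unmatched'."""
    outputs: dict[str, list] = {name: [] for name in rules}
    outputs["unmatched"] = []
    for meta in items:
        matched_names = [name for name, rule in rules.items() if matches(rule, meta)]
        for name in matched_names:
            outputs[name].append(meta)
        if not matched_names:
            outputs["unmatched"].append(meta)
    return outputs


rules = {
    "edge_1": {
        "operator": "AND",
        "conditions": [
            {"field": "meta.created_at", "operator": ">=", "value": "2023-01-01"},
            {"field": "meta.created_at", "operator": "<", "value": "2023-04-01"},
        ],
    },
}
items = [{"created_at": "2023-02-15"}, {"created_at": "2023-05-01"}]
print(route(items, rules))
# {'edge_1': [{'created_at': '2023-02-15'}], 'unmatched': [{'created_at': '2023-05-01'}]}
```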
 760  
 761  #### to_dict
 762  
 763  ```python
 764  to_dict() -> dict[str, Any]
 765  ```
 766  
 767  Serialize this component to a dictionary.
 768  
 769  **Returns:**
 770  
 771  - <code>dict\[str, Any\]</code> – The serialized component as a dictionary.
 772  
 773  #### from_dict
 774  
 775  ```python
 776  from_dict(data: dict[str, Any]) -> MetadataRouter
 777  ```
 778  
 779  Deserialize this component from a dictionary.
 780  
 781  **Parameters:**
 782  
 783  - **data** (<code>dict\[str, Any\]</code>) – The dictionary representation of this component.
 784  
 785  **Returns:**
 786  
 787  - <code>MetadataRouter</code> – The deserialized component instance.
 788  
 789  ## text_language_router
 790  
 791  ### TextLanguageRouter
 792  
 793  Routes text strings to different output connections based on their language.
 794  
Provide a list of languages during initialization. If the text doesn't match any of the
specified languages, it is routed to the "unmatched" output.
For routing documents based on their language, use the DocumentLanguageClassifier component,
followed by the MetadataRouter.
 799  
 800  ### Usage example
 801  
 802  ```python
 803  from haystack import Pipeline, Document
 804  from haystack.components.routers import TextLanguageRouter
 805  from haystack.document_stores.in_memory import InMemoryDocumentStore
 806  from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
 807  
 808  document_store = InMemoryDocumentStore()
 809  document_store.write_documents([Document(content="Elvis Presley was an American singer and actor.")])
 810  
 811  p = Pipeline()
 812  p.add_component(instance=TextLanguageRouter(languages=["en"]), name="text_language_router")
 813  p.add_component(instance=InMemoryBM25Retriever(document_store=document_store), name="retriever")
 814  p.connect("text_language_router.en", "retriever.query")
 815  
 816  result = p.run({"text_language_router": {"text": "Who was Elvis Presley?"}})
 817  assert result["retriever"]["documents"][0].content == "Elvis Presley was an American singer and actor."
 818  
 819  result = p.run({"text_language_router": {"text": "ένα ελληνικό κείμενο"}})
 820  assert result["text_language_router"]["unmatched"] == "ένα ελληνικό κείμενο"
 821  ```
 822  
#### __init__

```python
__init__(languages: list[str] | None = None) -> None
```

Initializes the TextLanguageRouter component.

**Parameters:**

- **languages** (<code>list\[str\] | None</code>) – A list of ISO language codes.
  See the supported languages in the [`langdetect` documentation](https://github.com/Mimino666/langdetect#languages).
  If not specified, defaults to `["en"]`.

#### run

```python
run(text: str) -> dict[str, str]
```

Routes the text string to one of the output connections based on its language.

If the text doesn't match any of the specified languages, it is routed to the `unmatched` output.

**Parameters:**

- **text** (<code>str</code>) – A text string to route.

**Returns:**

- <code>dict\[str, str\]</code> – A dictionary in which the key is the language (or `"unmatched"`),
  and the value is the text.

**Raises:**

- <code>TypeError</code> – If the input is not a string.

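The routing contract described above can be sketched in plain Python. This is a minimal illustration and not the component's implementation: `route_by_language` and the `detect` callable are hypothetical stand-ins (the real component relies on `langdetect`), and the toy detector only tells Greek script apart from everything else.

```python
from typing import Callable


def route_by_language(text: str, languages: list[str], detect: Callable[[str], str]) -> dict[str, str]:
    """Return {language: text} for a configured language, else {"unmatched": text}."""
    if not isinstance(text, str):
        raise TypeError("route_by_language expects a str as input.")
    detected = detect(text)
    # Only configured languages get their own output; everything else is "unmatched".
    key = detected if detected in languages else "unmatched"
    return {key: text}


def toy_detect(text: str) -> str:
    """Hypothetical detector: 'el' if any Greek letter is present, otherwise 'en'."""
    return "el" if any("\u0370" <= ch <= "\u03ff" for ch in text) else "en"


english = route_by_language("Who was Elvis Presley?", ["en"], toy_detect)
greek = route_by_language("ένα ελληνικό κείμενο", ["en"], toy_detect)
print(english)
print(greek)
```

Only one key is ever present in the returned dictionary, which is why a pipeline connection such as `text_language_router.en` receives input only for matching text.
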
## transformers_text_router

### TransformersTextRouter

Routes the text strings to different connections based on a category label.

The labels are specific to each model and can be found in its description on Hugging Face.

### Usage example

<!-- test-ignore -->

```python
from haystack.core.pipeline import Pipeline
from haystack.components.routers import TransformersTextRouter
from haystack.components.builders import PromptBuilder
from haystack.components.generators import HuggingFaceLocalGenerator

p = Pipeline()
p.add_component(
    instance=TransformersTextRouter(model="papluca/xlm-roberta-base-language-detection"),
    name="text_router"
)
p.add_component(
    instance=PromptBuilder(template="Answer the question: {{query}}\nAnswer:"),
    name="english_prompt_builder"
)
p.add_component(
    instance=PromptBuilder(template="Beantworte die Frage: {{query}}\nAntwort:"),
    name="german_prompt_builder"
)

p.add_component(
    instance=HuggingFaceLocalGenerator(model="DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1"),
    name="german_llm"
)
p.add_component(
    instance=HuggingFaceLocalGenerator(model="microsoft/Phi-3-mini-4k-instruct"),
    name="english_llm"
)

p.connect("text_router.en", "english_prompt_builder.query")
p.connect("text_router.de", "german_prompt_builder.query")
p.connect("english_prompt_builder.prompt", "english_llm.prompt")
p.connect("german_prompt_builder.prompt", "german_llm.prompt")

# English Example
print(p.run({"text_router": {"text": "What is the capital of Germany?"}}))

# German Example
print(p.run({"text_router": {"text": "Was ist die Hauptstadt von Deutschland?"}}))
```

#### __init__

```python
__init__(
    model: str,
    labels: list[str] | None = None,
    device: ComponentDevice | None = None,
    token: Secret | None = Secret.from_env_var(
        ["HF_API_TOKEN", "HF_TOKEN"], strict=False
    ),
    huggingface_pipeline_kwargs: dict[str, Any] | None = None,
) -> None
```

Initializes the TransformersTextRouter component.

**Parameters:**

- **model** (<code>str</code>) – The name or path of a Hugging Face model for text classification.
- **labels** (<code>list\[str\] | None</code>) – The list of labels. If not provided, the component fetches the labels
  from the model configuration file hosted on the Hugging Face Hub using
  `transformers.AutoConfig.from_pretrained`.
- **device** (<code>ComponentDevice | None</code>) – The device for loading the model. If `None`, automatically selects the default device.
  If a device or device map is specified in `huggingface_pipeline_kwargs`, it overrides this parameter.
- **token** (<code>Secret | None</code>) – The API token used to download private models from Hugging Face.
  By default, it is read from the `HF_API_TOKEN` or `HF_TOKEN` environment variable.
  To generate these tokens, run `transformers-cli login`.
- **huggingface_pipeline_kwargs** (<code>dict\[str, Any\] | None</code>) – A dictionary of keyword arguments for initializing the Hugging Face
  text classification pipeline.

#### warm_up

```python
warm_up() -> None
```

Initializes the component.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary with serialized data.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> TransformersTextRouter
```

Deserializes the component from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.

**Returns:**

- <code>TransformersTextRouter</code> – Deserialized component.

#### run

```python
run(text: str) -> dict[str, str]
```

Routes the text strings to different connections based on a category label.

**Parameters:**

- **text** (<code>str</code>) – A string of text to route.

**Returns:**

- <code>dict\[str, str\]</code> – A dictionary with the label as key and the text as value.

**Raises:**

- <code>TypeError</code> – If the input is not a str.

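As a rough illustration of the return value, the sketch below maps a classifier prediction to the `{label: text}` dictionary. It assumes predictions shaped like the output of a Hugging Face text-classification pipeline (a list of dictionaries with `label` and `score` keys). `route_by_label` and the sample scores are hypothetical, and no model is loaded.

```python
def route_by_label(text: str, predictions: list[dict]) -> dict[str, str]:
    """Return {best_label: text}, where best_label has the highest score."""
    if not isinstance(text, str):
        raise TypeError("route_by_label expects a str as input.")
    best = max(predictions, key=lambda p: p["score"])
    return {best["label"]: text}


# Made-up scores standing in for a language-detection model's output.
sample_predictions = [
    {"label": "en", "score": 0.02},
    {"label": "de", "score": 0.97},
    {"label": "fr", "score": 0.01},
]
routed = route_by_label("Was ist die Hauptstadt von Deutschland?", sample_predictions)
print(routed)
```

Because the key is the predicted label, each label can be wired to its own downstream component, as with `text_router.en` and `text_router.de` in the usage example.
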
## zero_shot_text_router

### TransformersZeroShotTextRouter

Routes the text strings to different connections based on a category label.

Specify the set of labels for categorization when initializing the component.

### Usage example

```python
from haystack import Document
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.core.pipeline import Pipeline
from haystack.components.routers import TransformersZeroShotTextRouter
from haystack.components.embedders import SentenceTransformersTextEmbedder, SentenceTransformersDocumentEmbedder
from haystack.components.retrievers import InMemoryEmbeddingRetriever

document_store = InMemoryDocumentStore()
doc_embedder = SentenceTransformersDocumentEmbedder(model="intfloat/e5-base-v2")
docs = [
    Document(
        content="Germany, officially the Federal Republic of Germany, is a country in the western region of "
        "Central Europe. The nation's capital and most populous city is Berlin and its main financial centre "
        "is Frankfurt; the largest urban area is the Ruhr."
    ),
    Document(
        content="France, officially the French Republic, is a country located primarily in Western Europe. "
        "France is a unitary semi-presidential republic with its capital in Paris, the country's largest city "
        "and main cultural and commercial centre; other major urban areas include Marseille, Lyon, Toulouse, "
        "Lille, Bordeaux, Strasbourg, Nantes and Nice."
    )
]
docs_with_embeddings = doc_embedder.run(docs)
document_store.write_documents(docs_with_embeddings["documents"])

p = Pipeline()
p.add_component(instance=TransformersZeroShotTextRouter(labels=["passage", "query"]), name="text_router")
p.add_component(
    instance=SentenceTransformersTextEmbedder(model="intfloat/e5-base-v2", prefix="passage: "),
    name="passage_embedder"
)
p.add_component(
    instance=SentenceTransformersTextEmbedder(model="intfloat/e5-base-v2", prefix="query: "),
    name="query_embedder"
)
p.add_component(
    instance=InMemoryEmbeddingRetriever(document_store=document_store),
    name="query_retriever"
)
p.add_component(
    instance=InMemoryEmbeddingRetriever(document_store=document_store),
    name="passage_retriever"
)

p.connect("text_router.passage", "passage_embedder.text")
p.connect("passage_embedder.embedding", "passage_retriever.query_embedding")
p.connect("text_router.query", "query_embedder.text")
p.connect("query_embedder.embedding", "query_retriever.query_embedding")

# Query Example
p.run({"text_router": {"text": "What is the capital of Germany?"}})

# Passage Example
p.run({
    "text_router": {
        "text": "The United Kingdom of Great Britain and Northern Ireland, commonly known as the "
        "United Kingdom (UK) or Britain, is a country in Northwestern Europe, off the north-western coast of "
        "the continental mainland."
    }
})
```

#### __init__

```python
__init__(
    labels: list[str],
    multi_label: bool = False,
    model: str = "MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33",
    device: ComponentDevice | None = None,
    token: Secret | None = Secret.from_env_var(
        ["HF_API_TOKEN", "HF_TOKEN"], strict=False
    ),
    huggingface_pipeline_kwargs: dict[str, Any] | None = None,
) -> None
```

Initializes the TransformersZeroShotTextRouter component.

**Parameters:**

- **labels** (<code>list\[str\]</code>) – The set of labels to use for classification. Can be a single label,
  a string of comma-separated labels, or a list of labels.
- **multi_label** (<code>bool</code>) – Indicates if multiple labels can be true.
  If `False`, label scores are normalized so their sum equals 1 for each sequence.
  If `True`, the labels are considered independent and probabilities are normalized for each candidate by
  doing a softmax of the entailment score vs. the contradiction score.
- **model** (<code>str</code>) – The name or path of a Hugging Face model for zero-shot text classification.
- **device** (<code>ComponentDevice | None</code>) – The device for loading the model. If `None`, automatically selects the default device.
  If a device or device map is specified in `huggingface_pipeline_kwargs`, it overrides this parameter.
- **token** (<code>Secret | None</code>) – The API token used to download private models from Hugging Face.
  By default, it is read from the `HF_API_TOKEN` or `HF_TOKEN` environment variable.
  To generate these tokens, run `transformers-cli login`.
- **huggingface_pipeline_kwargs** (<code>dict\[str, Any\] | None</code>) – A dictionary of keyword arguments for initializing the Hugging Face
  zero-shot text classification pipeline.

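The two `multi_label` normalization modes can be illustrated with plain arithmetic. The sketch below assumes each candidate label comes with a raw entailment logit and a contradiction logit, as an NLI-style zero-shot model would produce. The logit values and function names are made up for illustration.

```python
import math


def softmax(values: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical (entailment, contradiction) logits for each candidate label.
logits = {"passage": (2.0, -1.0), "query": (0.5, 0.5)}


def single_label_scores(pairs: dict[str, tuple[float, float]]) -> dict[str, float]:
    """multi_label=False: softmax over entailment logits across labels, so scores sum to 1."""
    labels = list(pairs)
    probs = softmax([pairs[label][0] for label in labels])
    return dict(zip(labels, probs))


def multi_label_scores(pairs: dict[str, tuple[float, float]]) -> dict[str, float]:
    """multi_label=True: per label, softmax of entailment vs. contradiction; labels are independent."""
    return {label: softmax([ent, con])[0] for label, (ent, con) in pairs.items()}


single = single_label_scores(logits)
multi = multi_label_scores(logits)
print(single)
print(multi)
```

In the single-label case the scores compete with each other. In the multi-label case each score stands alone, so several labels can score high at once.
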
#### warm_up

```python
warm_up() -> None
```

Initializes the component.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary with serialized data.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> TransformersZeroShotTextRouter
```

Deserializes the component from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.

**Returns:**

- <code>TransformersZeroShotTextRouter</code> – Deserialized component.

#### run

```python
run(text: str) -> dict[str, str]
```

Routes the text strings to different connections based on a category label.

**Parameters:**

- **text** (<code>str</code>) – A string of text to route.

**Returns:**

- <code>dict\[str, str\]</code> – A dictionary with the label as key and the text as value.

**Raises:**

- <code>TypeError</code> – If the input is not a str.
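
A minimal sketch of how a zero-shot prediction could be turned into the returned dictionary. It assumes a prediction shaped like the Hugging Face zero-shot classification pipeline output, with parallel `labels` and `scores` lists. `pick_route` and the sample values are hypothetical, and no model is loaded.

```python
def pick_route(text: str, prediction: dict) -> dict[str, str]:
    """Return {label: text} for the label with the highest score."""
    if not isinstance(text, str):
        raise TypeError("pick_route expects a str as input.")
    scores = prediction["scores"]
    # Index of the highest score; labels and scores are parallel lists.
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return {prediction["labels"][best_index]: text}


# Made-up prediction for the labels used in the usage example.
sample_prediction = {
    "sequence": "What is the capital of Germany?",
    "labels": ["query", "passage"],
    "scores": [0.91, 0.09],
}
route = pick_route("What is the capital of Germany?", sample_prediction)
print(route)
```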