routers_api.md
   1  ---
   2  title: "Routers"
   3  id: routers-api
   4  description: "Routers are a group of components that route queries or Documents to the components that can handle them best."
   5  slug: "/routers-api"
   6  ---
   7  
   8  <a id="conditional_router"></a>
   9  
  10  ## Module conditional\_router
  11  
  12  <a id="conditional_router.NoRouteSelectedException"></a>
  13  
  14  ### NoRouteSelectedException
  15  
  16  Exception raised when no route is selected in ConditionalRouter.
  17  
  18  <a id="conditional_router.RouteConditionException"></a>
  19  
  20  ### RouteConditionException
  21  
  22  Exception raised when there is an error parsing or evaluating the condition expression in ConditionalRouter.
  23  
  24  <a id="conditional_router.ConditionalRouter"></a>
  25  
  26  ### ConditionalRouter
  27  
  28  Routes data based on specific conditions.
  29  
  30  You define these conditions in a list of dictionaries called `routes`.
  31  Each dictionary in this list represents a single route. Each route has these four elements:
  32  - `condition`: A Jinja2 string expression that determines if the route is selected.
  33  - `output`: A Jinja2 expression defining the route's output value.
  34  - `output_type`: The type of the output data (for example, `str`, `list[int]`).
  35  - `output_name`: The name you want to use to publish `output`. This name is used to connect
  36  the router to other components in the pipeline.
  37  
  38  ### Usage example
  39  
  40  ```python
  41  from haystack.components.routers import ConditionalRouter
  42  
  43  routes = [
  44      {
  45          "condition": "{{streams|length > 2}}",
  46          "output": "{{streams}}",
  47          "output_name": "enough_streams",
  48          "output_type": list[int],
  49      },
  50      {
  51          "condition": "{{streams|length <= 2}}",
  52          "output": "{{streams}}",
  53          "output_name": "insufficient_streams",
  54          "output_type": list[int],
  55      },
  56  ]
  57  router = ConditionalRouter(routes)
  58  # When 'streams' has more than 2 items, 'enough_streams' output will activate, emitting the list [1, 2, 3]
  59  kwargs = {"streams": [1, 2, 3], "query": "Haystack"}
  60  result = router.run(**kwargs)
  61  assert result == {"enough_streams": [1, 2, 3]}
  62  ```
  63  
  64  In this example, we configure two routes. The first route sends the 'streams' value to 'enough_streams' if the
  65  stream count exceeds two. The second route directs 'streams' to 'insufficient_streams' if there
  66  are two or fewer streams.
  67  
  68  In the pipeline setup, the Router connects to other components using the output names. For example,
  69  'enough_streams' might connect to a component that processes streams, while
  70  'insufficient_streams' might connect to a component that fetches more streams.
  71  
  72  
  73  Here is a pipeline that uses `ConditionalRouter` and routes the fetched `ByteStreams` to
  74  different components depending on the number of streams fetched:
  75  
  76  ```python
  77  from haystack import Pipeline
  78  from haystack.dataclasses import ByteStream
  79  from haystack.components.routers import ConditionalRouter
  80  
  81  routes = [
  82      {
  83          "condition": "{{streams|length > 2}}",
  84          "output": "{{streams}}",
  85          "output_name": "enough_streams",
  86          "output_type": list[ByteStream],
  87      },
  88      {
  89          "condition": "{{streams|length <= 2}}",
  90          "output": "{{streams}}",
  91          "output_name": "insufficient_streams",
  92          "output_type": list[ByteStream],
  93      },
  94  ]
  95  
  96  pipe = Pipeline()
  97  pipe.add_component("router", ConditionalRouter(routes))
  98  ...
  99  pipe.connect("router.enough_streams", "some_component_a.streams")
 100  pipe.connect("router.insufficient_streams", "some_component_b.streams_or_some_other_input")
 101  ...
 102  ```
 103  
 104  <a id="conditional_router.ConditionalRouter.__init__"></a>
 105  
 106  #### ConditionalRouter.\_\_init\_\_
 107  
 108  ```python
 109  def __init__(routes: list[Route],
 110               custom_filters: Optional[dict[str, Callable]] = None,
 111               unsafe: bool = False,
 112               validate_output_type: bool = False,
 113               optional_variables: Optional[list[str]] = None)
 114  ```
 115  
 116  Initializes the `ConditionalRouter` with a list of routes detailing the conditions for routing.
 117  
 118  **Arguments**:
 119  
 120  - `routes`: A list of dictionaries, each defining a route.
 121  Each route has these four elements:
 122    - `condition`: A Jinja2 string expression that determines if the route is selected.
 123    - `output`: A Jinja2 expression defining the route's output value.
 124    - `output_type`: The type of the output data (for example, `str`, `list[int]`).
 125    - `output_name`: The name you want to use to publish `output`. This name is used to connect
 126      the router to other components in the pipeline.
 127  - `custom_filters`: A dictionary of custom Jinja2 filters used in the condition expressions.
 128  For example, passing `{"my_filter": my_filter_fcn}` where:
 129    - `my_filter` is the name of the custom filter.
 130    - `my_filter_fcn` is a callable that takes `my_var:str` and returns `my_var[:3]`.
 131    `{{ my_var|my_filter }}` can then be used inside a route condition expression:
 132      `"condition": "{{ my_var|my_filter == 'foo' }}"`.
 133  - `unsafe`: Enable execution of arbitrary code in the Jinja template.
 134  This should only be used if you trust the source of the template, as it can lead to remote code execution.
 135  - `validate_output_type`: Enable validation of routes' output.
 136  If a route output doesn't match the declared type, a `ValueError` is raised at runtime.
 137  - `optional_variables`: A list of variable names that are optional in your route conditions and outputs.
 138  If these variables are not provided at runtime, they will be set to `None`.
 139  This allows you to write routes that can handle missing inputs gracefully without raising errors.
 140  
 141  Example usage with a default fallback route in a Pipeline:
 142  ```python
 143  from haystack import Pipeline
 144  from haystack.components.routers import ConditionalRouter
 145  
 146  routes = [
 147      {
 148          "condition": '{{ path == "rag" }}',
 149          "output": "{{ question }}",
 150          "output_name": "rag_route",
 151          "output_type": str
 152      },
 153      {
 154          "condition": "{{ True }}",  # fallback route
 155          "output": "{{ question }}",
 156          "output_name": "default_route",
 157          "output_type": str
 158      }
 159  ]
 160  
 161  router = ConditionalRouter(routes, optional_variables=["path"])
 162  pipe = Pipeline()
 163  pipe.add_component("router", router)
 164  
 165  # When 'path' is provided in the pipeline:
 166  result = pipe.run(data={"router": {"question": "What?", "path": "rag"}})
 167  assert result["router"] == {"rag_route": "What?"}
 168  
 169  # When 'path' is not provided, fallback route is taken:
 170  result = pipe.run(data={"router": {"question": "What?"}})
 171  assert result["router"] == {"default_route": "What?"}
 172  ```
 173  
 174  This pattern is particularly useful when:
 175  - You want to provide default/fallback behavior when certain inputs are missing
 176  - Some variables are only needed for specific routing conditions
 177  - You're building flexible pipelines where not all inputs are guaranteed to be present
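
For the `custom_filters` argument, the callable described in the argument list can be sketched in plain Python. The function below is the `my_filter_fcn` from that description (it keeps the first three characters). The commented lines show how the dictionary would be handed to the router, assuming Haystack is installed; they are not executed here.

```python
def my_filter_fcn(my_var: str) -> str:
    """Keep only the first three characters of the input string."""
    return my_var[:3]


# Maps the filter name used in templates to the callable that implements it.
custom_filters = {"my_filter": my_filter_fcn}

# With Haystack installed, the dictionary is passed at construction time and
# the filter can then be used inside a route condition:
#
# router = ConditionalRouter(routes, custom_filters=custom_filters)
# "condition": "{{ my_var|my_filter == 'foo' }}"

print(custom_filters["my_filter"]("foobar"))  # foo
```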
 178  
 179  <a id="conditional_router.ConditionalRouter.to_dict"></a>
 180  
 181  #### ConditionalRouter.to\_dict
 182  
 183  ```python
 184  def to_dict() -> dict[str, Any]
 185  ```
 186  
 187  Serializes the component to a dictionary.
 188  
 189  **Returns**:
 190  
 191  Dictionary with serialized data.
 192  
 193  <a id="conditional_router.ConditionalRouter.from_dict"></a>
 194  
 195  #### ConditionalRouter.from\_dict
 196  
 197  ```python
 198  @classmethod
 199  def from_dict(cls, data: dict[str, Any]) -> "ConditionalRouter"
 200  ```
 201  
 202  Deserializes the component from a dictionary.
 203  
 204  **Arguments**:
 205  
 206  - `data`: The dictionary to deserialize from.
 207  
 208  **Returns**:
 209  
 210  The deserialized component.
 211  
 212  <a id="conditional_router.ConditionalRouter.run"></a>
 213  
 214  #### ConditionalRouter.run
 215  
 216  ```python
 217  def run(**kwargs)
 218  ```
 219  
 220  Executes the routing logic.
 221  
 222  Evaluates the boolean `condition` expression of each route in the order the routes are listed.
 223  The method directs the flow of data to the output specified in the first route whose
 224  `condition` is `True`. Later routes are not evaluated.
 225  
 226  **Arguments**:
 227  
 228  - `kwargs`: All variables used in the `condition` expressions of the routes. When the component is used in a
 229  pipeline, these variables are passed from the previous component's output.
 230  
 231  **Raises**:
 232  
 233  - `NoRouteSelectedException`: If no `condition` in the routes is `True`.
 234  - `RouteConditionException`: If there is an error parsing or evaluating the `condition` expression in the routes.
 235  - `ValueError`: If type validation is enabled and route type doesn't match actual value type.
 236  
 237  **Returns**:
 238  
 239  A dictionary where the key is the `output_name` of the selected route and the value is the `output`
 240  of the selected route.
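
The first-match behavior of `run` can be illustrated without Haystack or Jinja2. The sketch below is not the library's implementation: plain Python callables stand in for the Jinja2 expressions, and `SketchRoute`, `first_match`, and `NoRouteSelected` are hypothetical names introduced only for this illustration.

```python
from typing import Any, Callable

# (condition, output_name, output): callables replace the Jinja2 expressions.
SketchRoute = tuple[Callable[[dict[str, Any]], bool], str, Callable[[dict[str, Any]], Any]]


class NoRouteSelected(Exception):
    """Stand-in for NoRouteSelectedException in this sketch."""


def first_match(routes: list[SketchRoute], **kwargs: Any) -> dict[str, Any]:
    """Return {output_name: output} for the first route whose condition holds."""
    for condition, output_name, output in routes:
        if condition(kwargs):
            return {output_name: output(kwargs)}
    raise NoRouteSelected(f"No route selected for inputs: {sorted(kwargs)}")


routes = [
    (lambda v: len(v["streams"]) > 2, "enough_streams", lambda v: v["streams"]),
    (lambda v: True, "fallback", lambda v: v["streams"]),  # always true, so it acts as a default
]

print(first_match(routes, streams=[1, 2, 3]))  # {'enough_streams': [1, 2, 3]}
print(first_match(routes, streams=[1]))  # {'fallback': [1]}
```

Because evaluation stops at the first match, a catch-all route only works as a fallback when it is listed last.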
 241  
 242  <a id="document_length_router"></a>
 243  
 244  ## Module document\_length\_router
 245  
 246  <a id="document_length_router.DocumentLengthRouter"></a>
 247  
 248  ### DocumentLengthRouter
 249  
 250  Categorizes documents based on the length of the `content` field and routes them to the appropriate output.
 251  
 252  A common use case for DocumentLengthRouter is handling documents obtained from PDFs that contain non-text
 253  content, such as scanned pages or images. This component can detect empty or low-content documents and route them to
 254  components that perform OCR, generate captions, or compute image embeddings.
 255  
 256  ### Usage example
 257  
 258  ```python
 259  from haystack.components.routers import DocumentLengthRouter
 260  from haystack.dataclasses import Document
 261  
 262  docs = [
 263      Document(content="Short"),
 264      Document(content="Long document "*20),
 265  ]
 266  
 267  router = DocumentLengthRouter(threshold=10)
 268  
 269  result = router.run(documents=docs)
 270  print(result)
 271  
 272  # {
 273  #     "short_documents": [Document(content="Short", ...)],
 274  #     "long_documents": [Document(content="Long document ...", ...)],
 275  # }
 276  ```
 277  
 278  <a id="document_length_router.DocumentLengthRouter.__init__"></a>
 279  
 280  #### DocumentLengthRouter.\_\_init\_\_
 281  
 282  ```python
 283  def __init__(*, threshold: int = 10) -> None
 284  ```
 285  
 286  Initialize the DocumentLengthRouter component.
 287  
 288  **Arguments**:
 289  
 290  - `threshold`: The threshold for the number of characters in the document `content` field. Documents where `content` is
 291  None or whose character count is less than or equal to the threshold will be routed to the `short_documents`
 292  output. Otherwise, they will be routed to the `long_documents` output.
 293  To route only documents with None content to `short_documents`, set the threshold to a negative number.
 294  
 295  <a id="document_length_router.DocumentLengthRouter.run"></a>
 296  
 297  #### DocumentLengthRouter.run
 298  
 299  ```python
 300  @component.output_types(short_documents=list[Document],
 301                          long_documents=list[Document])
 302  def run(documents: list[Document]) -> dict[str, list[Document]]
 303  ```
 304  
 305  Categorize input documents into groups based on the length of the `content` field.
 306  
 307  **Arguments**:
 308  
 309  - `documents`: A list of documents to be categorized.
 310  
 311  **Returns**:
 312  
 313  A dictionary with the following keys:
 314  - `short_documents`: A list of documents where `content` is None or the length of `content` is less than or
 315     equal to the threshold.
 316  - `long_documents`: A list of documents where the length of `content` is greater than the threshold.
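
The threshold rule can be checked with a small stand-alone sketch. It works on raw content values rather than `Document` objects, and `split_by_length` is a hypothetical helper, not part of Haystack. It only mirrors the comparison described above.

```python
from typing import Optional


def split_by_length(contents: list[Optional[str]], threshold: int = 10) -> dict[str, list[Optional[str]]]:
    """Group content values using the documented threshold rule."""
    result: dict[str, list[Optional[str]]] = {"short_documents": [], "long_documents": []}
    for content in contents:
        # None always counts as short; otherwise compare the character count.
        if content is None or len(content) <= threshold:
            result["short_documents"].append(content)
        else:
            result["long_documents"].append(content)
    return result


grouped = split_by_length(["Short", None, "Long document " * 20])
print(len(grouped["short_documents"]), len(grouped["long_documents"]))  # 2 1

# A negative threshold leaves only None in the short group, because
# len() is never negative.
print(split_by_length(["", None, "text"], threshold=-1))
# {'short_documents': [None], 'long_documents': ['', 'text']}
```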
 317  
 318  <a id="document_type_router"></a>
 319  
 320  ## Module document\_type\_router
 321  
 322  <a id="document_type_router.DocumentTypeRouter"></a>
 323  
 324  ### DocumentTypeRouter
 325  
 326  Routes documents by their MIME types.
 327  
 328  DocumentTypeRouter is used to dynamically route documents within a pipeline based on their MIME types.
 329  It supports exact MIME type matches and regex patterns.
 330  
 331  MIME types can be extracted directly from document metadata or inferred from file paths using standard or
 332  user-supplied MIME type mappings.
 333  
 334  ### Usage example
 335  
 336  ```python
 337  from haystack.components.routers import DocumentTypeRouter
 338  from haystack.dataclasses import Document
 339  
 340  docs = [
 341      Document(content="Example text", meta={"file_path": "example.txt"}),
 342      Document(content="Another document", meta={"mime_type": "application/pdf"}),
 343      Document(content="Unknown type")
 344  ]
 345  
 346  router = DocumentTypeRouter(
 347      mime_type_meta_field="mime_type",
 348      file_path_meta_field="file_path",
 349      mime_types=["text/plain", "application/pdf"]
 350  )
 351  
 352  result = router.run(documents=docs)
 353  print(result)
 354  ```
 355  
 356  Expected output:
 357  ```python
 358  {
 359      "text/plain": [Document(...)],
 360      "application/pdf": [Document(...)],
 361      "unclassified": [Document(...)]
 362  }
 363  ```
 364  
 365  <a id="document_type_router.DocumentTypeRouter.__init__"></a>
 366  
 367  #### DocumentTypeRouter.\_\_init\_\_
 368  
 369  ```python
 370  def __init__(*,
 371               mime_types: list[str],
 372               mime_type_meta_field: Optional[str] = None,
 373               file_path_meta_field: Optional[str] = None,
 374               additional_mimetypes: Optional[dict[str, str]] = None) -> None
 375  ```
 376  
 377  Initialize the DocumentTypeRouter component.
 378  
 379  **Arguments**:
 380  
 381  - `mime_types`: A list of MIME types or regex patterns to classify the input documents.
 382  (for example: `["text/plain", "audio/x-wav", "image/jpeg"]`).
 383  - `mime_type_meta_field`: Optional name of the metadata field that holds the MIME type.
 384  - `file_path_meta_field`: Optional name of the metadata field that holds the file path. Used to infer the MIME type if
 385  `mime_type_meta_field` is not provided or missing in a document.
 386  - `additional_mimetypes`: Optional dictionary mapping MIME types to file extensions to enhance or override the standard
 387  `mimetypes` module. Useful when working with uncommon or custom file types.
 388  For example: `{"application/vnd.custom-type": ".custom"}`.
 389  
 390  **Raises**:
 391  
 392  - `ValueError`: If `mime_types` is empty or if neither `mime_type_meta_field` nor `file_path_meta_field` is
 393  provided.
 394  
 395  <a id="document_type_router.DocumentTypeRouter.run"></a>
 396  
 397  #### DocumentTypeRouter.run
 398  
 399  ```python
 400  def run(documents: list[Document]) -> dict[str, list[Document]]
 401  ```
 402  
 403  Categorize input documents into groups based on their MIME type.
 404  
 405  MIME types can either be directly available in document metadata or derived from file paths using the
 406  standard Python `mimetypes` module and custom mappings.
 407  
 408  **Arguments**:
 409  
 410  - `documents`: A list of documents to be categorized.
 411  
 412  **Returns**:
 413  
 414  A dictionary where the keys are MIME types (or `"unclassified"`) and the values are lists of documents.
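
The lookup order described above (metadata first, then the file path) can be sketched with the standard `mimetypes` module. `infer_mime_type` is a hypothetical helper that operates on a plain metadata dictionary. It is an illustration under that assumption, not the component's code.

```python
import mimetypes
from typing import Any, Optional


def infer_mime_type(
    meta: dict[str, Any],
    mime_type_meta_field: Optional[str] = "mime_type",
    file_path_meta_field: Optional[str] = "file_path",
) -> Optional[str]:
    """Prefer the MIME type stored in metadata, else guess it from the file path."""
    if mime_type_meta_field and meta.get(mime_type_meta_field):
        return meta[mime_type_meta_field]
    if file_path_meta_field and meta.get(file_path_meta_field):
        guessed, _encoding = mimetypes.guess_type(str(meta[file_path_meta_field]))
        return guessed
    return None


# Register an uncommon extension, analogous to what `additional_mimetypes` is for.
mimetypes.add_type("application/vnd.custom-type", ".custom")

print(infer_mime_type({"file_path": "example.txt"}))  # text/plain
print(infer_mime_type({"mime_type": "application/pdf"}))  # application/pdf
print(infer_mime_type({"file_path": "report.custom"}))  # application/vnd.custom-type
print(infer_mime_type({}))  # None
```

A document for which this lookup yields `None`, or a type that matches none of the configured `mime_types`, corresponds to the `"unclassified"` output.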
 415  
 416  <a id="file_type_router"></a>
 417  
 418  ## Module file\_type\_router
 419  
 420  <a id="file_type_router.FileTypeRouter"></a>
 421  
 422  ### FileTypeRouter
 423  
 424  Categorizes files or byte streams by their MIME types, helping in context-based routing.
 425  
 426  FileTypeRouter supports both exact MIME type matching and regex patterns.
 427  
 428  For file paths, MIME types come from extensions, while byte streams use metadata.
 429  You can use regex patterns in the `mime_types` parameter to set broad categories
 430  (such as `audio/.*` or `text/.*`) or specific types.
 431  MIME types without regex patterns are treated as exact matches.
 432  
 433  ### Usage example
 434  
 435  ```python
 436  from haystack.components.routers import FileTypeRouter
 437  from pathlib import Path
 438  
 439  # For exact MIME type matching
 440  router = FileTypeRouter(mime_types=["text/plain", "application/pdf"])
 441  
 442  # For flexible matching using regex, to handle all audio types
 443  router_with_regex = FileTypeRouter(mime_types=[r"audio/.*", r"text/plain"])
 444  
 445  sources = [Path("file.txt"), Path("document.pdf"), Path("song.mp3")]
 446  print(router.run(sources=sources))
 447  print(router_with_regex.run(sources=sources))
 448  
 449  # Expected output:
 450  # {'text/plain': [PosixPath('file.txt')],
 451  #  'application/pdf': [PosixPath('document.pdf')],
 452  #  'unclassified': [PosixPath('song.mp3')]}
 453  # {'audio/.*': [PosixPath('song.mp3')],
 454  #  'text/plain': [PosixPath('file.txt')],
 455  #  'unclassified': [PosixPath('document.pdf')]}
 456  ```
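
To make the difference between exact and regex matching concrete, here is a stdlib-only sketch. It assumes that each entry in `mime_types` is compiled as a regular expression and must match the whole MIME type (`re.fullmatch`), which is why a plain string such as `text/plain` behaves as an exact match. `classify` is a hypothetical helper, and the real component may differ in detail.

```python
import mimetypes
import re
from pathlib import Path


def classify(sources: list[Path], mime_types: list[str]) -> dict[str, list[Path]]:
    """Group paths under the first pattern that fully matches their MIME type."""
    patterns = [(raw, re.compile(raw)) for raw in mime_types]
    result: dict[str, list[Path]] = {}
    for source in sources:
        # The MIME type is guessed from the file extension only.
        guessed, _encoding = mimetypes.guess_type(source.name)
        key = "unclassified"
        for raw, pattern in patterns:
            if guessed is not None and pattern.fullmatch(guessed):
                key = raw
                break
        result.setdefault(key, []).append(source)
    return result


sources = [Path("file.txt"), Path("document.pdf"), Path("song.mp3")]
grouped = classify(sources, [r"audio/.*", r"text/plain"])
print(sorted(grouped))  # ['audio/.*', 'text/plain', 'unclassified']
```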
 457  
 458  <a id="file_type_router.FileTypeRouter.__init__"></a>
 459  
 460  #### FileTypeRouter.\_\_init\_\_
 461  
 462  ```python
 463  def __init__(mime_types: list[str],
 464               additional_mimetypes: Optional[dict[str, str]] = None,
 465               raise_on_failure: bool = False)
 466  ```
 467  
 468  Initialize the FileTypeRouter component.
 469  
 470  **Arguments**:
 471  
 472  - `mime_types`: A list of MIME types or regex patterns to classify the input files or byte streams.
 473  (for example: `["text/plain", "audio/x-wav", "image/jpeg"]`).
 474  - `additional_mimetypes`: A dictionary mapping MIME types to file extensions. The entries are added to the `mimetypes` package
 475  so that file types it does not support natively are not left unclassified.
 476  (for example: `{"application/vnd.openxmlformats-officedocument.wordprocessingml.document": ".docx"}`).
 477  - `raise_on_failure`: If True, raises FileNotFoundError when a file path doesn't exist.
 478  If False (default), only emits a warning when a file path doesn't exist.
 479  
 480  <a id="file_type_router.FileTypeRouter.to_dict"></a>
 481  
 482  #### FileTypeRouter.to\_dict
 483  
 484  ```python
 485  def to_dict() -> dict[str, Any]
 486  ```
 487  
 488  Serializes the component to a dictionary.
 489  
 490  **Returns**:
 491  
 492  Dictionary with serialized data.
 493  
 494  <a id="file_type_router.FileTypeRouter.from_dict"></a>
 495  
 496  #### FileTypeRouter.from\_dict
 497  
 498  ```python
 499  @classmethod
 500  def from_dict(cls, data: dict[str, Any]) -> "FileTypeRouter"
 501  ```
 502  
 503  Deserializes the component from a dictionary.
 504  
 505  **Arguments**:
 506  
 507  - `data`: The dictionary to deserialize from.
 508  
 509  **Returns**:
 510  
 511  The deserialized component.
 512  
 513  <a id="file_type_router.FileTypeRouter.run"></a>
 514  
 515  #### FileTypeRouter.run
 516  
 517  ```python
 518  def run(
 519      sources: list[Union[str, Path, ByteStream]],
 520      meta: Optional[Union[dict[str, Any], list[dict[str, Any]]]] = None
 521  ) -> dict[str, list[Union[ByteStream, Path]]]
 522  ```
 523  
 524  Categorize files or byte streams according to their MIME types.
 525  
 526  **Arguments**:
 527  
 528  - `sources`: A list of file paths or byte streams to categorize.
 529  - `meta`: Optional metadata to attach to the sources.
 530  When provided, the sources are internally converted to ByteStream objects and the metadata is added.
 531  This value can be a list of dictionaries or a single dictionary.
 532  If it's a single dictionary, its content is added to the metadata of all ByteStream objects.
 533  If it's a list, its length must match the number of sources, as they are zipped together.
 534  
 535  **Returns**:
 536  
 537  A dictionary where the keys are MIME types and the values are lists of data sources.
 538  Two extra keys may be returned: `"unclassified"` when a source's MIME type doesn't match any pattern
 539  and `"failed"` when a source cannot be processed (for example, a file path that doesn't exist).
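
The two accepted shapes of `meta` can be pictured as a normalization step that produces one dictionary per source. The sketch below is an assumption about how such a step might look. `normalize_meta` is a hypothetical helper, and raising `ValueError` on a length mismatch is this sketch's choice, not a documented behavior.

```python
from typing import Any, Optional, Union

Meta = Optional[Union[dict[str, Any], list[dict[str, Any]]]]


def normalize_meta(meta: Meta, sources_count: int) -> list[dict[str, Any]]:
    """Expand `meta` into a list with one dictionary per source."""
    if meta is None:
        return [{} for _ in range(sources_count)]
    if isinstance(meta, dict):
        # A single dictionary is copied onto every source.
        return [dict(meta) for _ in range(sources_count)]
    if len(meta) != sources_count:
        raise ValueError("The metadata list must have one entry per source.")
    return list(meta)


sources = ["a.txt", "b.pdf"]
# After normalization, sources and metadata can be zipped pairwise.
for source, source_meta in zip(sources, normalize_meta({"origin": "upload"}, len(sources))):
    print(source, source_meta)
```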
 540  
 541  <a id="llm_messages_router"></a>
 542  
 543  ## Module llm\_messages\_router
 544  
 545  <a id="llm_messages_router.LLMMessagesRouter"></a>
 546  
 547  ### LLMMessagesRouter
 548  
 549  Routes Chat Messages to different connections using a generative Language Model to perform classification.
 550  
 551  This component can be used with general-purpose LLMs and with specialized LLMs for moderation like Llama Guard.
 552  
 553  ### Usage example
 554  ```python
 555  from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
 556  from haystack.components.routers.llm_messages_router import LLMMessagesRouter
 557  from haystack.dataclasses import ChatMessage
 558  
 559  # initialize a Chat Generator with a generative model for moderation
 560  chat_generator = HuggingFaceAPIChatGenerator(
 561      api_type="serverless_inference_api",
 562      api_params={"model": "meta-llama/Llama-Guard-4-12B", "provider": "groq"},
 563  )
 564  
 565  router = LLMMessagesRouter(chat_generator=chat_generator,
 566                             output_names=["unsafe", "safe"],
 567                             output_patterns=["unsafe", "safe"])
 568  
 569  
 570  print(router.run([ChatMessage.from_user("How to rob a bank?")]))
 571  
 572  # {
 573  #     'chat_generator_text': 'unsafe\nS2',
 575  #     'unsafe': [
 576  #         ChatMessage(
 577  #             _role=<ChatRole.USER: 'user'>,
 578  #             _content=[TextContent(text='How to rob a bank?')],
 579  #             _name=None,
 580  #             _meta={}
 581  #         )
 582  #     ]
 583  # }
 584  ```
 585  
 586  <a id="llm_messages_router.LLMMessagesRouter.__init__"></a>
 587  
 588  #### LLMMessagesRouter.\_\_init\_\_
 589  
 590  ```python
 591  def __init__(chat_generator: ChatGenerator,
 592               output_names: list[str],
 593               output_patterns: list[str],
 594               system_prompt: Optional[str] = None)
 595  ```
 596  
 597  Initialize the LLMMessagesRouter component.
 598  
 599  **Arguments**:
 600  
 601  - `chat_generator`: A ChatGenerator instance which represents the LLM.
 602  - `output_names`: A list of output connection names. These can be used to connect the router to other
 603  components.
 604  - `output_patterns`: A list of regular expressions to be matched against the output of the LLM. Each pattern
 605  corresponds to an output name. Patterns are evaluated in order.
 606  When using moderation models, refer to the model card to understand the expected outputs.
 607  - `system_prompt`: An optional system prompt to customize the behavior of the LLM.
 608  For moderation models, refer to the model card for supported customization options.
 609  
 610  **Raises**:
 611  
 612  - `ValueError`: If `output_names` or `output_patterns` is empty, or if the two lists differ in length.
 613  
 614  <a id="llm_messages_router.LLMMessagesRouter.warm_up"></a>
 615  
 616  #### LLMMessagesRouter.warm\_up
 617  
 618  ```python
 619  def warm_up()
 620  ```
 621  
 622  Warm up the underlying LLM.
 623  
 624  <a id="llm_messages_router.LLMMessagesRouter.run"></a>
 625  
 626  #### LLMMessagesRouter.run
 627  
 628  ```python
 629  def run(messages: list[ChatMessage]
 630          ) -> dict[str, Union[str, list[ChatMessage]]]
 631  ```
 632  
 633  Classify the messages based on LLM output and route them to the appropriate output connection.
 634  
 635  **Arguments**:
 636  
 637  - `messages`: A list of ChatMessages to be routed. Only user and assistant messages are supported.
 638  
 639  **Raises**:
 640  
 641  - `ValueError`: If messages is an empty list or contains messages with unsupported roles.
 642  
 643  **Returns**:
 644  
 645  A dictionary with the following keys:
 646  - "chat_generator_text": The text output of the LLM, useful for debugging.
 647  - One key for each name in `output_names`: The list of messages that matched the corresponding pattern.
 648  - "unmatched": The messages that did not match any of the output patterns.
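
The effect of evaluating patterns in order can be shown with the standard `re` module. This sketch assumes each pattern is searched for anywhere in the LLM text (`re.search`). `route_by_llm_text` is a hypothetical helper that returns only the selected output name, not the full result dictionary.

```python
import re


def route_by_llm_text(llm_text: str, output_names: list[str], output_patterns: list[str]) -> str:
    """Return the output name paired with the first pattern found in the LLM text."""
    if not output_names or len(output_names) != len(output_patterns):
        raise ValueError("output_names and output_patterns must be non-empty lists of the same length.")
    for name, pattern in zip(output_names, output_patterns):
        if re.search(pattern, llm_text):
            return name
    return "unmatched"


names, patterns = ["unsafe", "safe"], ["unsafe", "safe"]
print(route_by_llm_text("unsafe\nS2", names, patterns))  # unsafe
print(route_by_llm_text("safe", names, patterns))  # safe
print(route_by_llm_text("I cannot tell", names, patterns))  # unmatched
```

Order matters in this sketch: the text `unsafe` also contains `safe`, so listing the `safe` pattern first would send unsafe messages to the wrong output. Anchoring the patterns (for example `^safe`) avoids that dependency.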
 649  
 650  <a id="llm_messages_router.LLMMessagesRouter.to_dict"></a>
 651  
 652  #### LLMMessagesRouter.to\_dict
 653  
 654  ```python
 655  def to_dict() -> dict[str, Any]
 656  ```
 657  
 658  Serialize this component to a dictionary.
 659  
 660  **Returns**:
 661  
 662  The serialized component as a dictionary.
 663  
 664  <a id="llm_messages_router.LLMMessagesRouter.from_dict"></a>
 665  
 666  #### LLMMessagesRouter.from\_dict
 667  
 668  ```python
 669  @classmethod
 670  def from_dict(cls, data: dict[str, Any]) -> "LLMMessagesRouter"
 671  ```
 672  
 673  Deserialize this component from a dictionary.
 674  
 675  **Arguments**:
 676  
 677  - `data`: The dictionary representation of this component.
 678  
 679  **Returns**:
 680  
 681  The deserialized component instance.
 682  
 683  <a id="metadata_router"></a>
 684  
 685  ## Module metadata\_router
 686  
 687  <a id="metadata_router.MetadataRouter"></a>
 688  
 689  ### MetadataRouter
 690  
 691  Routes documents or byte streams to different connections based on their metadata fields.
 692  
 693  Specify the routing rules in the `__init__` method.
 694  If a document or byte stream does not match any of the rules, it's routed to a connection named "unmatched".
 695  
 696  
 697  ### Usage examples
 698  
 699  **Routing Documents by metadata:**
 700  ```python
 701  from haystack import Document
 702  from haystack.components.routers import MetadataRouter
 703  
 704  docs = [Document(content="Paris is the capital of France.", meta={"language": "en"}),
 705          Document(content="Berlin ist die Hauptstadt von Deutschland.", meta={"language": "de"})]
 706  
 707  router = MetadataRouter(rules={"en": {"field": "meta.language", "operator": "==", "value": "en"}})
 708  
 709  print(router.run(documents=docs))
 710  # {'en': [Document(id=..., content: 'Paris is the capital of France.', meta: {'language': 'en'})],
 711  # 'unmatched': [Document(id=..., content: 'Berlin ist die Hauptstadt von Deutschland.', meta: {'language': 'de'})]}
 712  ```
 713  
 714  **Routing ByteStreams by metadata:**
 715  ```python
 716  from haystack.dataclasses import ByteStream
 717  from haystack.components.routers import MetadataRouter
 718  
 719  streams = [
 720      ByteStream.from_string("Hello world", meta={"language": "en"}),
 721      ByteStream.from_string("Bonjour le monde", meta={"language": "fr"})
 722  ]
 723  
 724  router = MetadataRouter(
 725      rules={"english": {"field": "meta.language", "operator": "==", "value": "en"}},
 726      output_type=list[ByteStream]
 727  )
 728  
 729  result = router.run(documents=streams)
 730  # {'english': [ByteStream(...)], 'unmatched': [ByteStream(...)]}
 731  ```
 732  
 733  <a id="metadata_router.MetadataRouter.__init__"></a>
 734  
 735  #### MetadataRouter.\_\_init\_\_
 736  
 737  ```python
 738  def __init__(rules: dict[str, dict],
 739               output_type: type = list[Document]) -> None
 740  ```
 741  
 742  Initializes the MetadataRouter component.
 743  
 744  **Arguments**:
 745  
 746  - `rules`: A dictionary defining how to route documents or byte streams to output connections based on their
 747  metadata. Keys are output connection names, and values are dictionaries of
 748  [filtering expressions](https://docs.haystack.deepset.ai/docs/metadata-filtering) in Haystack.
 749  For example:
 750  ```python
 751  {
 752  "edge_1": {
 753      "operator": "AND",
 754      "conditions": [
 755          {"field": "meta.created_at", "operator": ">=", "value": "2023-01-01"},
 756          {"field": "meta.created_at", "operator": "<", "value": "2023-04-01"},
 757      ],
 758  },
 759  "edge_2": {
 760      "operator": "AND",
 761      "conditions": [
 762          {"field": "meta.created_at", "operator": ">=", "value": "2023-04-01"},
 763          {"field": "meta.created_at", "operator": "<", "value": "2023-07-01"},
 764      ],
 765  },
 766  "edge_3": {
 767      "operator": "AND",
 768      "conditions": [
 769          {"field": "meta.created_at", "operator": ">=", "value": "2023-07-01"},
 770          {"field": "meta.created_at", "operator": "<", "value": "2023-10-01"},
 771      ],
 772  },
 773  "edge_4": {
 774      "operator": "AND",
 775      "conditions": [
 776          {"field": "meta.created_at", "operator": ">=", "value": "2023-10-01"},
 777          {"field": "meta.created_at", "operator": "<", "value": "2024-01-01"},
 778      ],
 779  },
 780  }
 781  ```
 782  - `output_type`: The type of the output produced. Lists of Documents or ByteStreams can be specified.
 783  
 784  <a id="metadata_router.MetadataRouter.run"></a>
 785  
 786  #### MetadataRouter.run
 787  
 788  ```python
 789  def run(documents: Union[list[Document], list[ByteStream]])
 790  ```
 791  
 792  Routes documents or byte streams to different connections based on their metadata fields.
 793  
 794  If a document or byte stream does not match any of the rules, it's routed to a connection named "unmatched".
 795  
 796  **Arguments**:
 797  
 798  - `documents`: A list of `Document` or `ByteStream` objects to be routed based on their metadata.
 799  
 800  **Returns**:
 801  
 802  A dictionary where the keys are the names of the output connections (including `"unmatched"`)
 803  and the values are lists of `Document` or `ByteStream` objects that matched the corresponding rules.
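
How a rule selects items can be sketched with a deliberately simplified evaluator. It is not Haystack's filter implementation: it handles only comparison leaves and `AND`/`OR` groups, compares values as given (ISO date strings compare correctly as plain strings), works on bare metadata dictionaries, and assumes an item is sent to every rule it satisfies. `matches` and `route` are hypothetical names.

```python
import operator
from typing import Any

COMPARATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(rule: dict[str, Any], meta: dict[str, Any]) -> bool:
    """Evaluate a simplified filter expression against a metadata dictionary."""
    if "conditions" in rule:
        results = [matches(condition, meta) for condition in rule["conditions"]]
        return all(results) if rule["operator"] == "AND" else any(results)
    # Leaf condition: strip the "meta." prefix and compare the stored value.
    field = rule["field"].removeprefix("meta.")
    if field not in meta:
        return False
    return COMPARATORS[rule["operator"]](meta[field], rule["value"])


def route(items: list[dict[str, Any]], rules: dict[str, dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Send each metadata dict to every rule it satisfies, else to 'unmatched'."""
    routed: dict[str, list[dict[str, Any]]] = {}
    for meta in items:
        hits = [name for name, rule in rules.items() if matches(rule, meta)]
        for name in hits or ["unmatched"]:
            routed.setdefault(name, []).append(meta)
    return routed


rules = {
    "edge_1": {
        "operator": "AND",
        "conditions": [
            {"field": "meta.created_at", "operator": ">=", "value": "2023-01-01"},
            {"field": "meta.created_at", "operator": "<", "value": "2023-04-01"},
        ],
    },
}
print(route([{"created_at": "2023-02-15"}, {"created_at": "2023-08-01"}], rules))
# {'edge_1': [{'created_at': '2023-02-15'}], 'unmatched': [{'created_at': '2023-08-01'}]}
```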
 804  
 805  <a id="metadata_router.MetadataRouter.to_dict"></a>
 806  
 807  #### MetadataRouter.to\_dict
 808  
 809  ```python
 810  def to_dict() -> dict[str, Any]
 811  ```
 812  
 813  Serialize this component to a dictionary.
 814  
 815  **Returns**:
 816  
 817  The serialized component as a dictionary.
 818  
 819  <a id="metadata_router.MetadataRouter.from_dict"></a>
 820  
 821  #### MetadataRouter.from\_dict
 822  
 823  ```python
 824  @classmethod
 825  def from_dict(cls, data: dict[str, Any]) -> "MetadataRouter"
 826  ```
 827  
 828  Deserialize this component from a dictionary.
 829  
 830  **Arguments**:
 831  
 832  - `data`: The dictionary representation of this component.
 833  
 834  **Returns**:
 835  
 836  The deserialized component instance.
 837  
 838  <a id="text_language_router"></a>
 839  
 840  ## Module text\_language\_router
 841  
 842  <a id="text_language_router.TextLanguageRouter"></a>
 843  
 844  ### TextLanguageRouter
 845  
 846  Routes text strings to different output connections based on their language.
 847  
 848  Provide a list of languages during initialization. If the text doesn't match any of the
 849  specified languages, it is routed to the "unmatched" output.
 850  For routing documents based on their language, use the DocumentLanguageClassifier component,
 851  followed by the MetadataRouter.
 852  
 853  ### Usage example
 854  
 855  ```python
 856  from haystack import Pipeline, Document
 857  from haystack.components.routers import TextLanguageRouter
 858  from haystack.document_stores.in_memory import InMemoryDocumentStore
 859  from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
 860  
 861  document_store = InMemoryDocumentStore()
 862  document_store.write_documents([Document(content="Elvis Presley was an American singer and actor.")])
 863  
 864  p = Pipeline()
 865  p.add_component(instance=TextLanguageRouter(languages=["en"]), name="text_language_router")
 866  p.add_component(instance=InMemoryBM25Retriever(document_store=document_store), name="retriever")
 867  p.connect("text_language_router.en", "retriever.query")
 868  
 869  result = p.run({"text_language_router": {"text": "Who was Elvis Presley?"}})
 870  assert result["retriever"]["documents"][0].content == "Elvis Presley was an American singer and actor."
 871  
 872  result = p.run({"text_language_router": {"text": "ένα ελληνικό κείμενο"}})
 873  assert result["text_language_router"]["unmatched"] == "ένα ελληνικό κείμενο"
 874  ```
 875  
 876  <a id="text_language_router.TextLanguageRouter.__init__"></a>
 877  
 878  #### TextLanguageRouter.\_\_init\_\_
 879  
 880  ```python
 881  def __init__(languages: Optional[list[str]] = None)
 882  ```
 883  
 884  Initialize the TextLanguageRouter component.
 885  
 886  **Arguments**:
 887  
 888  - `languages`: A list of ISO language codes.
 889  See the supported languages in [`langdetect` documentation](https://github.com/Mimino666/langdetect#languages).
If not specified, defaults to `["en"]`.
 891  
 892  <a id="text_language_router.TextLanguageRouter.run"></a>
 893  
 894  #### TextLanguageRouter.run
 895  
 896  ```python
 897  def run(text: str) -> dict[str, str]
 898  ```
 899  
 900  Routes the text strings to different output connections based on their language.
 901  
If the text doesn't match any of the specified languages, it is routed to the `"unmatched"` output connection.
 903  
 904  **Arguments**:
 905  
 906  - `text`: A text string to route.
 907  
 908  **Raises**:
 909  
 910  - `TypeError`: If the input is not a string.
 911  
 912  **Returns**:
 913  
 914  A dictionary in which the key is the language (or `"unmatched"`),
 915  and the value is the text.
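
The routing rule described above can be sketched in plain Python. This is a minimal illustration, not the component's implementation: `toy_detect_language` is a hypothetical stand-in for the `langdetect`-based detection, and `route_by_language` is an illustrative name that does not exist in the library.

```python
from typing import Optional


def toy_detect_language(text: str) -> Optional[str]:
    """Hypothetical detector: guesses "el" for Greek script, "en" for ASCII letters."""
    if any("\u0370" <= ch <= "\u03ff" or "\u1f00" <= ch <= "\u1fff" for ch in text):
        return "el"
    if any(ch.isascii() and ch.isalpha() for ch in text):
        return "en"
    return None


def route_by_language(text: str, languages: Optional[list[str]] = None) -> dict[str, str]:
    """Return {language: text} if the detected language is configured, else {"unmatched": text}."""
    if not isinstance(text, str):
        raise TypeError("route_by_language expects a str as input.")
    languages = languages or ["en"]  # mirrors the documented default
    detected = toy_detect_language(text)
    key = detected if detected in languages else "unmatched"
    return {key: text}


english = route_by_language("Who was Elvis Presley?", languages=["en"])
greek = route_by_language("ένα ελληνικό κείμενο", languages=["en"])
print(english)
print(greek)
```

The output dictionary always has exactly one key, so a downstream component connected to a given language only receives input when that language is detected.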
 916  
 917  <a id="transformers_text_router"></a>
 918  
 919  ## Module transformers\_text\_router
 920  
 921  <a id="transformers_text_router.TransformersTextRouter"></a>
 922  
 923  ### TransformersTextRouter
 924  
 925  Routes the text strings to different connections based on a category label.
 926  
The labels are specific to each model and can be found in its description on Hugging Face.
 928  
 929  ### Usage example
 930  
 931  ```python
 932  from haystack.core.pipeline import Pipeline
 933  from haystack.components.routers import TransformersTextRouter
 934  from haystack.components.builders import PromptBuilder
 935  from haystack.components.generators import HuggingFaceLocalGenerator
 936  
 937  p = Pipeline()
 938  p.add_component(
 939      instance=TransformersTextRouter(model="papluca/xlm-roberta-base-language-detection"),
 940      name="text_router"
 941  )
 942  p.add_component(
 943      instance=PromptBuilder(template="Answer the question: {{query}}\nAnswer:"),
 944      name="english_prompt_builder"
 945  )
 946  p.add_component(
 947      instance=PromptBuilder(template="Beantworte die Frage: {{query}}\nAntwort:"),
 948      name="german_prompt_builder"
 949  )
 950  
 951  p.add_component(
 952      instance=HuggingFaceLocalGenerator(model="DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1"),
 953      name="german_llm"
 954  )
 955  p.add_component(
 956      instance=HuggingFaceLocalGenerator(model="microsoft/Phi-3-mini-4k-instruct"),
 957      name="english_llm"
 958  )
 959  
 960  p.connect("text_router.en", "english_prompt_builder.query")
 961  p.connect("text_router.de", "german_prompt_builder.query")
 962  p.connect("english_prompt_builder.prompt", "english_llm.prompt")
 963  p.connect("german_prompt_builder.prompt", "german_llm.prompt")
 964  
 965  # English Example
 966  print(p.run({"text_router": {"text": "What is the capital of Germany?"}}))
 967  
 968  # German Example
 969  print(p.run({"text_router": {"text": "Was ist die Hauptstadt von Deutschland?"}}))
 970  ```
 971  
 972  <a id="transformers_text_router.TransformersTextRouter.__init__"></a>
 973  
 974  #### TransformersTextRouter.\_\_init\_\_
 975  
 976  ```python
 977  def __init__(model: str,
 978               labels: Optional[list[str]] = None,
 979               device: Optional[ComponentDevice] = None,
 980               token: Optional[Secret] = Secret.from_env_var(
 981                   ["HF_API_TOKEN", "HF_TOKEN"], strict=False),
 982               huggingface_pipeline_kwargs: Optional[dict[str, Any]] = None)
 983  ```
 984  
 985  Initializes the TransformersTextRouter component.
 986  
 987  **Arguments**:
 988  
 989  - `model`: The name or path of a Hugging Face model for text classification.
 990  - `labels`: The list of labels. If not provided, the component fetches the labels
 991  from the model configuration file hosted on the Hugging Face Hub using
 992  `transformers.AutoConfig.from_pretrained`.
 993  - `device`: The device for loading the model. If `None`, automatically selects the default device.
 994  If a device or device map is specified in `huggingface_pipeline_kwargs`, it overrides this parameter.
 995  - `token`: The API token used to download private models from Hugging Face.
By default, the token is read from the `HF_API_TOKEN` or `HF_TOKEN` environment variable.
 997  To generate these tokens, run `transformers-cli login`.
 998  - `huggingface_pipeline_kwargs`: A dictionary of keyword arguments for initializing the Hugging Face
 999  text classification pipeline.
1000  
1001  <a id="transformers_text_router.TransformersTextRouter.warm_up"></a>
1002  
1003  #### TransformersTextRouter.warm\_up
1004  
1005  ```python
1006  def warm_up()
1007  ```
1008  
Loads the text classification pipeline that `run()` depends on. If it has not been called, `run()` raises a `RuntimeError`.
1010  
1011  <a id="transformers_text_router.TransformersTextRouter.to_dict"></a>
1012  
1013  #### TransformersTextRouter.to\_dict
1014  
1015  ```python
1016  def to_dict() -> dict[str, Any]
1017  ```
1018  
1019  Serializes the component to a dictionary.
1020  
1021  **Returns**:
1022  
1023  Dictionary with serialized data.
1024  
1025  <a id="transformers_text_router.TransformersTextRouter.from_dict"></a>
1026  
1027  #### TransformersTextRouter.from\_dict
1028  
1029  ```python
1030  @classmethod
1031  def from_dict(cls, data: dict[str, Any]) -> "TransformersTextRouter"
1032  ```
1033  
1034  Deserializes the component from a dictionary.
1035  
1036  **Arguments**:
1037  
1038  - `data`: Dictionary to deserialize from.
1039  
1040  **Returns**:
1041  
1042  Deserialized component.
1043  
1044  <a id="transformers_text_router.TransformersTextRouter.run"></a>
1045  
1046  #### TransformersTextRouter.run
1047  
1048  ```python
1049  def run(text: str) -> dict[str, str]
1050  ```
1051  
1052  Routes the text strings to different connections based on a category label.
1053  
1054  **Arguments**:
1055  
1056  - `text`: A string of text to route.
1057  
1058  **Raises**:
1059  
- `TypeError`: If the input is not a string.
- `RuntimeError`: If the pipeline has not been loaded because `warm_up()` was not called before `run()`.
1062  
1063  **Returns**:
1064  
1065  A dictionary with the label as key and the text as value.
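
The `warm_up()` and `run()` contract described above can be sketched without downloading a model. This is an illustration under stated assumptions, not the component's code: `fake_classifier` is a hypothetical stand-in that returns one `{"label", "score"}` dictionary per input text, and `SketchTextRouter` is an illustrative class name.

```python
from typing import Callable, Optional


def fake_classifier(texts: list[str]) -> list[dict]:
    """Hypothetical stand-in for a text classification pipeline."""
    german_markers = ("ist", "die", "der", "das")
    results = []
    for text in texts:
        words = text.lower().split()
        label = "de" if any(word in german_markers for word in words) else "en"
        results.append({"label": label, "score": 1.0})
    return results


class SketchTextRouter:
    """Illustrative router that follows the documented warm_up()/run() contract."""

    def __init__(self) -> None:
        self.pipeline: Optional[Callable[[list[str]], list[dict]]] = None

    def warm_up(self) -> None:
        # Assumption: the sketch plugs in the fake classifier where a model would be loaded.
        if self.pipeline is None:
            self.pipeline = fake_classifier

    def run(self, text: str) -> dict[str, str]:
        if self.pipeline is None:
            raise RuntimeError("The pipeline has not been loaded. Call warm_up() before run().")
        if not isinstance(text, str):
            raise TypeError("SketchTextRouter expects a str as input.")
        label = self.pipeline([text])[0]["label"]
        return {label: text}  # the predicted label becomes the output connection name


router = SketchTextRouter()
router.warm_up()
english = router.run("What is the capital of Germany?")
german = router.run("Was ist die Hauptstadt von Deutschland?")
print(english)
print(german)
```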
1066  
1067  <a id="zero_shot_text_router"></a>
1068  
1069  ## Module zero\_shot\_text\_router
1070  
1071  <a id="zero_shot_text_router.TransformersZeroShotTextRouter"></a>
1072  
1073  ### TransformersZeroShotTextRouter
1074  
1075  Routes the text strings to different connections based on a category label.
1076  
1077  Specify the set of labels for categorization when initializing the component.
1078  
1079  ### Usage example
1080  
1081  ```python
1082  from haystack import Document
1083  from haystack.document_stores.in_memory import InMemoryDocumentStore
1084  from haystack.core.pipeline import Pipeline
1085  from haystack.components.routers import TransformersZeroShotTextRouter
1086  from haystack.components.embedders import SentenceTransformersTextEmbedder, SentenceTransformersDocumentEmbedder
1087  from haystack.components.retrievers import InMemoryEmbeddingRetriever
1088  
1089  document_store = InMemoryDocumentStore()
1090  doc_embedder = SentenceTransformersDocumentEmbedder(model="intfloat/e5-base-v2")
1091  doc_embedder.warm_up()
1092  docs = [
1093      Document(
1094          content="Germany, officially the Federal Republic of Germany, is a country in the western region of "
1095          "Central Europe. The nation's capital and most populous city is Berlin and its main financial centre "
1096          "is Frankfurt; the largest urban area is the Ruhr."
1097      ),
1098      Document(
1099          content="France, officially the French Republic, is a country located primarily in Western Europe. "
1100          "France is a unitary semi-presidential republic with its capital in Paris, the country's largest city "
1101          "and main cultural and commercial centre; other major urban areas include Marseille, Lyon, Toulouse, "
1102          "Lille, Bordeaux, Strasbourg, Nantes and Nice."
1103      )
1104  ]
1105  docs_with_embeddings = doc_embedder.run(docs)
1106  document_store.write_documents(docs_with_embeddings["documents"])
1107  
1108  p = Pipeline()
1109  p.add_component(instance=TransformersZeroShotTextRouter(labels=["passage", "query"]), name="text_router")
1110  p.add_component(
1111      instance=SentenceTransformersTextEmbedder(model="intfloat/e5-base-v2", prefix="passage: "),
1112      name="passage_embedder"
1113  )
1114  p.add_component(
1115      instance=SentenceTransformersTextEmbedder(model="intfloat/e5-base-v2", prefix="query: "),
1116      name="query_embedder"
1117  )
1118  p.add_component(
1119      instance=InMemoryEmbeddingRetriever(document_store=document_store),
1120      name="query_retriever"
1121  )
1122  p.add_component(
1123      instance=InMemoryEmbeddingRetriever(document_store=document_store),
1124      name="passage_retriever"
1125  )
1126  
1127  p.connect("text_router.passage", "passage_embedder.text")
1128  p.connect("passage_embedder.embedding", "passage_retriever.query_embedding")
1129  p.connect("text_router.query", "query_embedder.text")
1130  p.connect("query_embedder.embedding", "query_retriever.query_embedding")
1131  
1132  # Query Example
1133  p.run({"text_router": {"text": "What is the capital of Germany?"}})
1134  
1135  # Passage Example
1136  p.run({
    "text_router": {
        "text": "The United Kingdom of Great Britain and Northern Ireland, commonly known as the "
        "United Kingdom (UK) or Britain, is a country in Northwestern Europe, off the north-western coast of "
        "the continental mainland."
1139      }
1140  })
1141  ```
1142  
1143  <a id="zero_shot_text_router.TransformersZeroShotTextRouter.__init__"></a>
1144  
1145  #### TransformersZeroShotTextRouter.\_\_init\_\_
1146  
1147  ```python
1148  def __init__(labels: list[str],
1149               multi_label: bool = False,
1150               model: str = "MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33",
1151               device: Optional[ComponentDevice] = None,
1152               token: Optional[Secret] = Secret.from_env_var(
1153                   ["HF_API_TOKEN", "HF_TOKEN"], strict=False),
1154               huggingface_pipeline_kwargs: Optional[dict[str, Any]] = None)
1155  ```
1156  
1157  Initializes the TransformersZeroShotTextRouter component.
1158  
1159  **Arguments**:
1160  
1161  - `labels`: The set of labels to use for classification. Can be a single label,
1162  a string of comma-separated labels, or a list of labels.
1163  - `multi_label`: Indicates if multiple labels can be true.
1164  If `False`, label scores are normalized so their sum equals 1 for each sequence.
1165  If `True`, the labels are considered independent and probabilities are normalized for each candidate by
1166  doing a softmax of the entailment score vs. the contradiction score.
1167  - `model`: The name or path of a Hugging Face model for zero-shot text classification.
1168  - `device`: The device for loading the model. If `None`, automatically selects the default device.
1169  If a device or device map is specified in `huggingface_pipeline_kwargs`, it overrides this parameter.
1170  - `token`: The API token used to download private models from Hugging Face.
By default, the token is read from the `HF_API_TOKEN` or `HF_TOKEN` environment variable.
1172  To generate these tokens, run `transformers-cli login`.
1173  - `huggingface_pipeline_kwargs`: A dictionary of keyword arguments for initializing the Hugging Face
zero-shot text classification pipeline.
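
The two normalization modes described for `multi_label` can be worked through numerically. The sketch below uses made-up entailment and contradiction logits (the values and label names are assumptions for illustration only) and shows how the same logits yield different scores in each mode.

```python
import math


def softmax(values: list[float]) -> list[float]:
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical (entailment, contradiction) logits for each candidate label.
logits = {
    "query": (2.0, -1.0),
    "passage": (1.0, 0.5),
    "greeting": (-2.0, 3.0),
}

# multi_label=False: one softmax over the entailment logits of all labels,
# so the scores compete with each other and sum to 1.
single = dict(zip(logits, softmax([ent for ent, _ in logits.values()])))

# multi_label=True: for each label, softmax of entailment vs. contradiction,
# so each score is independent of the other labels.
multi = {label: softmax([ent, con])[0] for label, (ent, con) in logits.items()}

print({label: round(score, 3) for label, score in single.items()})
print({label: round(score, 3) for label, score in multi.items()})
```

With `multi_label=True` the scores no longer sum to 1, which is why several labels can score high for the same text.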
1175  
1176  <a id="zero_shot_text_router.TransformersZeroShotTextRouter.warm_up"></a>
1177  
1178  #### TransformersZeroShotTextRouter.warm\_up
1179  
1180  ```python
1181  def warm_up()
1182  ```
1183  
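Loads the zero-shot text classification pipeline that `run()` depends on. If it has not been called, `run()` raises a `RuntimeError`.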
1185  
1186  <a id="zero_shot_text_router.TransformersZeroShotTextRouter.to_dict"></a>
1187  
1188  #### TransformersZeroShotTextRouter.to\_dict
1189  
1190  ```python
1191  def to_dict() -> dict[str, Any]
1192  ```
1193  
1194  Serializes the component to a dictionary.
1195  
1196  **Returns**:
1197  
1198  Dictionary with serialized data.
1199  
1200  <a id="zero_shot_text_router.TransformersZeroShotTextRouter.from_dict"></a>
1201  
1202  #### TransformersZeroShotTextRouter.from\_dict
1203  
1204  ```python
1205  @classmethod
1206  def from_dict(cls, data: dict[str, Any]) -> "TransformersZeroShotTextRouter"
1207  ```
1208  
1209  Deserializes the component from a dictionary.
1210  
1211  **Arguments**:
1212  
1213  - `data`: Dictionary to deserialize from.
1214  
1215  **Returns**:
1216  
1217  Deserialized component.
1218  
1219  <a id="zero_shot_text_router.TransformersZeroShotTextRouter.run"></a>
1220  
1221  #### TransformersZeroShotTextRouter.run
1222  
1223  ```python
1224  def run(text: str) -> dict[str, str]
1225  ```
1226  
1227  Routes the text strings to different connections based on a category label.
1228  
1229  **Arguments**:
1230  
1231  - `text`: A string of text to route.
1232  
1233  **Raises**:
1234  
- `TypeError`: If the input is not a string.
- `RuntimeError`: If the pipeline has not been loaded because `warm_up()` was not called before `run()`.
1237  
1238  **Returns**:
1239  
1240  A dictionary with the label as key and the text as value.
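
As a rough sketch of how a single output key can be derived from zero-shot scores, the snippet below assumes the router selects the highest-scoring label. The `prediction` dictionary mimics the `labels` and `scores` fields of a zero-shot classification result, and `route_by_top_score` is a hypothetical helper, not part of the library.

```python
def route_by_top_score(prediction: dict, text: str) -> dict[str, str]:
    """Return {best_label: text}, where best_label has the highest score."""
    scores = prediction["scores"]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return {prediction["labels"][best_index]: text}


text = "What is the capital of Germany?"
# Hypothetical prediction for the candidate labels "passage" and "query".
prediction = {"sequence": text, "labels": ["passage", "query"], "scores": [0.07, 0.93]}

routed = route_by_top_score(prediction, text)
print(routed)
```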
1241