   1  ---
   2  title: Routers
   3  id: routers-api
   4  description: Routers is a group of components that route queries or Documents to other components that can handle them best.
   5  slug: "/routers-api"
   6  ---
   7  
   8  <a id="conditional_router"></a>
   9  
  10  # Module conditional\_router
  11  
  12  <a id="conditional_router.NoRouteSelectedException"></a>
  13  
  14  ## NoRouteSelectedException
  15  
  16  Exception raised when no route is selected in ConditionalRouter.
  17  
  18  <a id="conditional_router.RouteConditionException"></a>
  19  
  20  ## RouteConditionException
  21  
  22  Exception raised when there is an error parsing or evaluating the condition expression in ConditionalRouter.
  23  
  24  <a id="conditional_router.ConditionalRouter"></a>
  25  
  26  ## ConditionalRouter
  27  
  28  Routes data based on specific conditions.
  29  
  30  You define these conditions in a list of dictionaries called `routes`.
  31  Each dictionary in this list represents a single route. Each route has these four elements:
  32  - `condition`: A Jinja2 string expression that determines if the route is selected.
  33  - `output`: A Jinja2 expression defining the route's output value.
  34  - `output_type`: The type of the output data (for example, `str`, `list[int]`).
  35  - `output_name`: The name you want to use to publish `output`. This name is used to connect
  36  the router to other components in the pipeline.
  37  
  38  ### Usage example
  39  
  40  ```python
  41  from haystack.components.routers import ConditionalRouter
  42  
  43  routes = [
  44      {
  45          "condition": "{{streams|length > 2}}",
  46          "output": "{{streams}}",
  47          "output_name": "enough_streams",
  48          "output_type": list[int],
  49      },
  50      {
  51          "condition": "{{streams|length <= 2}}",
  52          "output": "{{streams}}",
  53          "output_name": "insufficient_streams",
  54          "output_type": list[int],
  55      },
  56  ]
  57  router = ConditionalRouter(routes)
  58  # When 'streams' has more than 2 items, 'enough_streams' output will activate, emitting the list [1, 2, 3]
  59  kwargs = {"streams": [1, 2, 3], "query": "Haystack"}
  60  result = router.run(**kwargs)
  61  assert result == {"enough_streams": [1, 2, 3]}
  62  ```
  63  
  64  In this example, we configure two routes. The first route sends the 'streams' value to 'enough_streams' if the
  65  stream count exceeds two. The second route directs 'streams' to 'insufficient_streams' if there
  66  are two or fewer streams.
  67  
  68  In the pipeline setup, the Router connects to other components using the output names. For example,
  69  'enough_streams' might connect to a component that processes streams, while
  70  'insufficient_streams' might connect to a component that fetches more streams.
  71  
  72  
  73  Here is a pipeline that uses `ConditionalRouter` and routes the fetched `ByteStreams` to
  74  different components depending on the number of streams fetched:
  75  
  76  ```python
  77  from haystack import Pipeline
  78  from haystack.dataclasses import ByteStream
  79  from haystack.components.routers import ConditionalRouter
  80  
  81  routes = [
  82      {
  83          "condition": "{{streams|length > 2}}",
  84          "output": "{{streams}}",
  85          "output_name": "enough_streams",
  86          "output_type": list[ByteStream],
  87      },
  88      {
  89          "condition": "{{streams|length <= 2}}",
  90          "output": "{{streams}}",
  91          "output_name": "insufficient_streams",
  92          "output_type": list[ByteStream],
  93      },
  94  ]
  95  
router = ConditionalRouter(routes)
pipe = Pipeline()
  97  pipe.add_component("router", router)
  98  ...
  99  pipe.connect("router.enough_streams", "some_component_a.streams")
 100  pipe.connect("router.insufficient_streams", "some_component_b.streams_or_some_other_input")
 101  ...
 102  ```
 103  
 104  <a id="conditional_router.ConditionalRouter.__init__"></a>
 105  
 106  #### ConditionalRouter.\_\_init\_\_
 107  
 108  ```python
 109  def __init__(routes: list[Route],
 110               custom_filters: Optional[dict[str, Callable]] = None,
 111               unsafe: bool = False,
 112               validate_output_type: bool = False,
 113               optional_variables: Optional[list[str]] = None)
 114  ```
 115  
 116  Initializes the `ConditionalRouter` with a list of routes detailing the conditions for routing.
 117  
 118  **Arguments**:
 119  
- `routes`: A list of dictionaries, each defining a route.
Each route has these four elements:
    - `condition`: A Jinja2 string expression that determines if the route is selected.
    - `output`: A Jinja2 expression defining the route's output value.
    - `output_type`: The type of the output data (for example, `str`, `list[int]`).
    - `output_name`: The name you want to use to publish `output`. This name is used to connect
    the router to other components in the pipeline.
- `custom_filters`: A dictionary of custom Jinja2 filters used in the condition expressions.
For example, passing `{"my_filter": my_filter_fcn}` where:
    - `my_filter` is the name of the custom filter.
    - `my_filter_fcn` is a callable that takes `my_var: str` and returns `my_var[:3]`.

  `{{ my_var|my_filter }}` can then be used inside a route condition expression:
  `"condition": "{{ my_var|my_filter == 'foo' }}"`.
 133  - `unsafe`: Enable execution of arbitrary code in the Jinja template.
This should only be used if you trust the source of the template, as it can lead to remote code execution.
 135  - `validate_output_type`: Enable validation of routes' output.
If a route output doesn't match the declared type, a `ValueError` is raised at runtime.
 137  - `optional_variables`: A list of variable names that are optional in your route conditions and outputs.
 138  If these variables are not provided at runtime, they will be set to `None`.
 139  This allows you to write routes that can handle missing inputs gracefully without raising errors.
 140  
 141  Example usage with a default fallback route in a Pipeline:
 142  ```python
 143  from haystack import Pipeline
 144  from haystack.components.routers import ConditionalRouter
 145  
 146  routes = [
 147      {
 148          "condition": '{{ path == "rag" }}',
 149          "output": "{{ question }}",
 150          "output_name": "rag_route",
 151          "output_type": str
 152      },
 153      {
 154          "condition": "{{ True }}",  # fallback route
 155          "output": "{{ question }}",
 156          "output_name": "default_route",
 157          "output_type": str
 158      }
 159  ]
 160  
 161  router = ConditionalRouter(routes, optional_variables=["path"])
 162  pipe = Pipeline()
 163  pipe.add_component("router", router)
 164  
 165  # When 'path' is provided in the pipeline:
 166  result = pipe.run(data={"router": {"question": "What?", "path": "rag"}})
 167  assert result["router"] == {"rag_route": "What?"}
 168  
 169  # When 'path' is not provided, fallback route is taken:
 170  result = pipe.run(data={"router": {"question": "What?"}})
 171  assert result["router"] == {"default_route": "What?"}
 172  ```
 173  
 174  This pattern is particularly useful when:
 175  - You want to provide default/fallback behavior when certain inputs are missing
 176  - Some variables are only needed for specific routing conditions
 177  - You're building flexible pipelines where not all inputs are guaranteed to be present
 178  
 179  <a id="conditional_router.ConditionalRouter.to_dict"></a>
 180  
 181  #### ConditionalRouter.to\_dict
 182  
 183  ```python
 184  def to_dict() -> dict[str, Any]
 185  ```
 186  
 187  Serializes the component to a dictionary.
 188  
 189  **Returns**:
 190  
 191  Dictionary with serialized data.
 192  
 193  <a id="conditional_router.ConditionalRouter.from_dict"></a>
 194  
 195  #### ConditionalRouter.from\_dict
 196  
 197  ```python
 198  @classmethod
 199  def from_dict(cls, data: dict[str, Any]) -> "ConditionalRouter"
 200  ```
 201  
 202  Deserializes the component from a dictionary.
 203  
 204  **Arguments**:
 205  
 206  - `data`: The dictionary to deserialize from.
 207  
 208  **Returns**:
 209  
 210  The deserialized component.
 211  
 212  <a id="conditional_router.ConditionalRouter.run"></a>
 213  
 214  #### ConditionalRouter.run
 215  
 216  ```python
 217  def run(**kwargs)
 218  ```
 219  
 220  Executes the routing logic.
 221  
 222  Executes the routing logic by evaluating the specified boolean condition expressions for each route in the
 223  order they are listed. The method directs the flow of data to the output specified in the first route whose
 224  `condition` is True.
 225  
 226  **Arguments**:
 227  
 228  - `kwargs`: All variables used in the `condition` expressed in the routes. When the component is used in a
 229  pipeline, these variables are passed from the previous component's output.
 230  
 231  **Raises**:
 232  
- `NoRouteSelectedException`: If no `condition` in the routes is `True`.
 234  - `RouteConditionException`: If there is an error parsing or evaluating the `condition` expression in the routes.
 235  - `ValueError`: If type validation is enabled and route type doesn't match actual value type.
 236  
 237  **Returns**:
 238  
 239  A dictionary where the key is the `output_name` of the selected route and the value is the `output`
 240  of the selected route.
 241  
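The first-match behavior of `run` can be illustrated without Haystack or Jinja2. This is a minimal sketch, assuming plain Python callables in place of Jinja2 `condition` and `output` expressions. `select_route` and `NoRouteSelected` are hypothetical names used only for illustration and are not part of the Haystack API.

```python
from typing import Any, Callable


class NoRouteSelected(Exception):
    """Hypothetical stand-in for NoRouteSelectedException."""


def select_route(routes: list[dict[str, Any]], **variables: Any) -> dict[str, Any]:
    """Return {output_name: output} for the first route whose condition holds."""
    for route in routes:
        condition: Callable[..., bool] = route["condition"]
        if condition(**variables):
            # Routes are checked in list order, so later matches are never reached.
            return {route["output_name"]: route["output"](**variables)}
    raise NoRouteSelected(f"No route matched the variables: {variables}")


routes = [
    {
        "condition": lambda streams, **_: len(streams) > 2,
        "output": lambda streams, **_: streams,
        "output_name": "enough_streams",
    },
    {
        "condition": lambda streams, **_: len(streams) <= 2,
        "output": lambda streams, **_: streams,
        "output_name": "insufficient_streams",
    },
]

many = select_route(routes, streams=[1, 2, 3], query="Haystack")
few = select_route(routes, streams=[1], query="Haystack")
```

Because evaluation stops at the first match, a catch-all route is only useful when it is listed last.
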
 242  <a id="document_length_router"></a>
 243  
 244  # Module document\_length\_router
 245  
 246  <a id="document_length_router.DocumentLengthRouter"></a>
 247  
 248  ## DocumentLengthRouter
 249  
 250  Categorizes documents based on the length of the `content` field and routes them to the appropriate output.
 251  
 252  A common use case for DocumentLengthRouter is handling documents obtained from PDFs that contain non-text
 253  content, such as scanned pages or images. This component can detect empty or low-content documents and route them to
 254  components that perform OCR, generate captions, or compute image embeddings.
 255  
 256  ### Usage example
 257  
 258  ```python
 259  from haystack.components.routers import DocumentLengthRouter
 260  from haystack.dataclasses import Document
 261  
 262  docs = [
 263      Document(content="Short"),
 264      Document(content="Long document "*20),
 265  ]
 266  
 267  router = DocumentLengthRouter(threshold=10)
 268  
 269  result = router.run(documents=docs)
 270  print(result)
 271  
 272  # {
 273  #     "short_documents": [Document(content="Short", ...)],
 274  #     "long_documents": [Document(content="Long document ...", ...)],
 275  # }
 276  ```
 277  
 278  <a id="document_length_router.DocumentLengthRouter.__init__"></a>
 279  
 280  #### DocumentLengthRouter.\_\_init\_\_
 281  
 282  ```python
 283  def __init__(*, threshold: int = 10) -> None
 284  ```
 285  
 286  Initialize the DocumentLengthRouter component.
 287  
 288  **Arguments**:
 289  
 290  - `threshold`: The threshold for the number of characters in the document `content` field. Documents where `content` is
 291  None or whose character count is less than or equal to the threshold will be routed to the `short_documents`
 292  output. Otherwise, they will be routed to the `long_documents` output.
 293  To route only documents with None content to `short_documents`, set the threshold to a negative number.
 294  
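The threshold rule can be sketched in plain Python. The example below operates on raw `content` values rather than `Document` objects, and `split_by_length` is a hypothetical helper, not the component's implementation.

```python
from typing import Optional


def split_by_length(contents: list[Optional[str]], threshold: int = 10) -> dict[str, list[Optional[str]]]:
    """Split content values following the threshold rule described above."""
    result: dict[str, list[Optional[str]]] = {"short_documents": [], "long_documents": []}
    for content in contents:
        # None content, or content no longer than the threshold, counts as short.
        if content is None or len(content) <= threshold:
            result["short_documents"].append(content)
        else:
            result["long_documents"].append(content)
    return result


contents = [None, "", "Short", "Long document " * 20]
default_split = split_by_length(contents, threshold=10)
# With a negative threshold, no string can be short, so only None remains.
none_only_split = split_by_length(contents, threshold=-1)
```

With `threshold=-1`, even the empty string has a length greater than the threshold, which is why only `None` content ends up in `short_documents`.
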
 295  <a id="document_length_router.DocumentLengthRouter.run"></a>
 296  
 297  #### DocumentLengthRouter.run
 298  
 299  ```python
 300  @component.output_types(short_documents=list[Document],
 301                          long_documents=list[Document])
 302  def run(documents: list[Document]) -> dict[str, list[Document]]
 303  ```
 304  
 305  Categorize input documents into groups based on the length of the `content` field.
 306  
 307  **Arguments**:
 308  
 309  - `documents`: A list of documents to be categorized.
 310  
 311  **Returns**:
 312  
 313  A dictionary with the following keys:
 314  - `short_documents`: A list of documents where `content` is None or the length of `content` is less than or
 315     equal to the threshold.
 316  - `long_documents`: A list of documents where the length of `content` is greater than the threshold.
 317  
 318  <a id="document_type_router"></a>
 319  
 320  # Module document\_type\_router
 321  
 322  <a id="document_type_router.DocumentTypeRouter"></a>
 323  
 324  ## DocumentTypeRouter
 325  
 326  Routes documents by their MIME types.
 327  
 328  DocumentTypeRouter is used to dynamically route documents within a pipeline based on their MIME types.
 329  It supports exact MIME type matches and regex patterns.
 330  
 331  MIME types can be extracted directly from document metadata or inferred from file paths using standard or
 332  user-supplied MIME type mappings.
 333  
 334  ### Usage example
 335  
 336  ```python
 337  from haystack.components.routers import DocumentTypeRouter
 338  from haystack.dataclasses import Document
 339  
 340  docs = [
 341      Document(content="Example text", meta={"file_path": "example.txt"}),
 342      Document(content="Another document", meta={"mime_type": "application/pdf"}),
 343      Document(content="Unknown type")
 344  ]
 345  
 346  router = DocumentTypeRouter(
 347      mime_type_meta_field="mime_type",
 348      file_path_meta_field="file_path",
 349      mime_types=["text/plain", "application/pdf"]
 350  )
 351  
 352  result = router.run(documents=docs)
 353  print(result)
 354  ```
 355  
 356  Expected output:
 357  ```python
 358  {
 359      "text/plain": [Document(...)],
 360      "application/pdf": [Document(...)],
 361      "unclassified": [Document(...)]
 362  }
 363  ```
 364  
 365  <a id="document_type_router.DocumentTypeRouter.__init__"></a>
 366  
 367  #### DocumentTypeRouter.\_\_init\_\_
 368  
 369  ```python
 370  def __init__(*,
 371               mime_types: list[str],
 372               mime_type_meta_field: Optional[str] = None,
 373               file_path_meta_field: Optional[str] = None,
 374               additional_mimetypes: Optional[dict[str, str]] = None) -> None
 375  ```
 376  
 377  Initialize the DocumentTypeRouter component.
 378  
 379  **Arguments**:
 380  
 381  - `mime_types`: A list of MIME types or regex patterns to classify the input documents.
 382  (for example: `["text/plain", "audio/x-wav", "image/jpeg"]`).
 383  - `mime_type_meta_field`: Optional name of the metadata field that holds the MIME type.
 384  - `file_path_meta_field`: Optional name of the metadata field that holds the file path. Used to infer the MIME type if
 385  `mime_type_meta_field` is not provided or missing in a document.
 386  - `additional_mimetypes`: Optional dictionary mapping MIME types to file extensions to enhance or override the standard
 387  `mimetypes` module. Useful when working with uncommon or custom file types.
 388  For example: `{"application/vnd.custom-type": ".custom"}`.
 389  
 390  **Raises**:
 391  
 392  - `ValueError`: If `mime_types` is empty or if both `mime_type_meta_field` and `file_path_meta_field` are
 393  not provided.
 394  
 395  <a id="document_type_router.DocumentTypeRouter.run"></a>
 396  
 397  #### DocumentTypeRouter.run
 398  
 399  ```python
 400  def run(documents: list[Document]) -> dict[str, list[Document]]
 401  ```
 402  
 403  Categorize input documents into groups based on their MIME type.
 404  
 405  MIME types can either be directly available in document metadata or derived from file paths using the
 406  standard Python `mimetypes` module and custom mappings.
 407  
 408  **Arguments**:
 409  
 410  - `documents`: A list of documents to be categorized.
 411  
 412  **Returns**:
 413  
 414  A dictionary where the keys are MIME types (or `"unclassified"`) and the values are lists of documents.
 415  
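The lookup order described above (explicit MIME type first, then inference from the file path) can be sketched with the standard `mimetypes` module alone. This is an illustration under assumptions, not the component's code: `infer_mime_type` is a hypothetical helper, and the metadata field names `mime_type` and `file_path` are taken from the usage example.

```python
import mimetypes
from typing import Any, Optional

# A private registry keeps the custom mapping out of the process-wide defaults.
registry = mimetypes.MimeTypes()
registry.add_type("application/vnd.custom-type", ".custom")


def infer_mime_type(meta: dict[str, Any]) -> Optional[str]:
    """Prefer an explicit MIME type, then fall back to guessing from the file path."""
    explicit = meta.get("mime_type")
    if explicit:
        return explicit
    file_path = meta.get("file_path")
    if file_path:
        guessed, _encoding = registry.guess_type(file_path)
        return guessed
    # Neither field is available, so the type stays unknown.
    return None


from_path = infer_mime_type({"file_path": "example.txt"})
from_meta = infer_mime_type({"mime_type": "application/pdf"})
from_custom = infer_mime_type({"file_path": "report.custom"})
unknown = infer_mime_type({})
```

A document whose type cannot be determined this way corresponds to the `"unclassified"` output.
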
 416  <a id="file_type_router"></a>
 417  
 418  # Module file\_type\_router
 419  
 420  <a id="file_type_router.FileTypeRouter"></a>
 421  
 422  ## FileTypeRouter
 423  
 424  Categorizes files or byte streams by their MIME types, helping in context-based routing.
 425  
 426  FileTypeRouter supports both exact MIME type matching and regex patterns.
 427  
 428  For file paths, MIME types come from extensions, while byte streams use metadata.
You can use regex patterns in the `mime_types` parameter to set broad categories
(such as `audio/.*` or `text/.*`) or specific types.
 431  MIME types without regex patterns are treated as exact matches.
 432  
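To make the difference between exact and regex matching concrete, here is a minimal sketch that uses only the standard library. It assumes that a pattern must match the whole MIME type (`re.fullmatch`) and that the first matching pattern wins. `classify_by_mime` is a hypothetical helper, not part of Haystack.

```python
import mimetypes
import re
from pathlib import Path


def classify_by_mime(paths: list[Path], patterns: list[str]) -> dict[str, list[Path]]:
    """Group paths under the first pattern that fully matches their guessed MIME type."""
    compiled = [(pattern, re.compile(pattern)) for pattern in patterns]
    result: dict[str, list[Path]] = {}
    for path in paths:
        # The MIME type is guessed from the file extension only.
        mime_type, _encoding = mimetypes.guess_type(path.name)
        key = "unclassified"
        if mime_type is not None:
            for pattern, regex in compiled:
                if regex.fullmatch(mime_type):
                    key = pattern
                    break
        result.setdefault(key, []).append(path)
    return result


sources = [Path("file.txt"), Path("document.pdf"), Path("song.mp3")]
exact = classify_by_mime(sources, ["text/plain", "application/pdf"])
with_regex = classify_by_mime(sources, [r"audio/.*", r"text/plain"])
```

A pattern without regex metacharacters, such as `text/plain`, only matches that exact string under full matching, which is why it behaves as an exact match.
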
 433  ### Usage example
 434  
 435  ```python
 436  from haystack.components.routers import FileTypeRouter
 437  from pathlib import Path
 438  
 439  # For exact MIME type matching
 440  router = FileTypeRouter(mime_types=["text/plain", "application/pdf"])
 441  
 442  # For flexible matching using regex, to handle all audio types
 443  router_with_regex = FileTypeRouter(mime_types=[r"audio/.*", r"text/plain"])
 444  
 445  sources = [Path("file.txt"), Path("document.pdf"), Path("song.mp3")]
 446  print(router.run(sources=sources))
 447  print(router_with_regex.run(sources=sources))
 448  
 449  # Expected output:
# {'text/plain': [PosixPath('file.txt')], 'application/pdf': [PosixPath('document.pdf')], 'unclassified': [PosixPath('song.mp3')]}
# {'audio/.*': [PosixPath('song.mp3')], 'text/plain': [PosixPath('file.txt')], 'unclassified': [PosixPath('document.pdf')]}
 456  ```
 457  
 458  <a id="file_type_router.FileTypeRouter.__init__"></a>
 459  
 460  #### FileTypeRouter.\_\_init\_\_
 461  
 462  ```python
 463  def __init__(mime_types: list[str],
 464               additional_mimetypes: Optional[dict[str, str]] = None,
 465               raise_on_failure: bool = False)
 466  ```
 467  
 468  Initialize the FileTypeRouter component.
 469  
 470  **Arguments**:
 471  
 472  - `mime_types`: A list of MIME types or regex patterns to classify the input files or byte streams.
 473  (for example: `["text/plain", "audio/x-wav", "image/jpeg"]`).
- `additional_mimetypes`: A dictionary mapping MIME types to file extensions. The entries are added to the `mimetypes` package
so that file types it doesn't support natively are not left unclassified.
 476  (for example: `{"application/vnd.openxmlformats-officedocument.wordprocessingml.document": ".docx"}`).
 477  - `raise_on_failure`: If True, raises FileNotFoundError when a file path doesn't exist.
 478  If False (default), only emits a warning when a file path doesn't exist.
 479  
 480  <a id="file_type_router.FileTypeRouter.to_dict"></a>
 481  
 482  #### FileTypeRouter.to\_dict
 483  
 484  ```python
 485  def to_dict() -> dict[str, Any]
 486  ```
 487  
 488  Serializes the component to a dictionary.
 489  
 490  **Returns**:
 491  
 492  Dictionary with serialized data.
 493  
 494  <a id="file_type_router.FileTypeRouter.from_dict"></a>
 495  
 496  #### FileTypeRouter.from\_dict
 497  
 498  ```python
 499  @classmethod
 500  def from_dict(cls, data: dict[str, Any]) -> "FileTypeRouter"
 501  ```
 502  
 503  Deserializes the component from a dictionary.
 504  
 505  **Arguments**:
 506  
 507  - `data`: The dictionary to deserialize from.
 508  
 509  **Returns**:
 510  
 511  The deserialized component.
 512  
 513  <a id="file_type_router.FileTypeRouter.run"></a>
 514  
 515  #### FileTypeRouter.run
 516  
 517  ```python
 518  def run(
 519      sources: list[Union[str, Path, ByteStream]],
 520      meta: Optional[Union[dict[str, Any], list[dict[str, Any]]]] = None
 521  ) -> dict[str, list[Union[ByteStream, Path]]]
 522  ```
 523  
 524  Categorize files or byte streams according to their MIME types.
 525  
 526  **Arguments**:
 527  
 528  - `sources`: A list of file paths or byte streams to categorize.
 529  - `meta`: Optional metadata to attach to the sources.
 530  When provided, the sources are internally converted to ByteStream objects and the metadata is added.
 531  This value can be a list of dictionaries or a single dictionary.
 532  If it's a single dictionary, its content is added to the metadata of all ByteStream objects.
 533  If it's a list, its length must match the number of sources, as they are zipped together.
 534  
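The two accepted shapes of `meta` can be sketched as follows. `expand_meta` is a hypothetical helper written for this illustration. It only shows how a single dictionary or a list of dictionaries can be paired with the sources, and is not the component's implementation.

```python
from typing import Any, Optional, Union

Meta = Optional[Union[dict[str, Any], list[dict[str, Any]]]]


def expand_meta(meta: Meta, sources_count: int) -> list[dict[str, Any]]:
    """Return one metadata dictionary per source."""
    if meta is None:
        return [{} for _ in range(sources_count)]
    if isinstance(meta, dict):
        # A single dictionary is copied so that every source gets its own instance.
        return [dict(meta) for _ in range(sources_count)]
    if len(meta) != sources_count:
        raise ValueError("The length of the metadata list must match the number of sources.")
    return [dict(entry) for entry in meta]


sources = ["a.txt", "b.pdf"]
shared = expand_meta({"project": "demo"}, len(sources))
per_source = expand_meta([{"page": 1}, {"page": 2}], len(sources))
# zip pairs each source with its metadata, which is why the lengths must agree.
paired = list(zip(sources, per_source))
```
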
 535  **Returns**:
 536  
 537  A dictionary where the keys are MIME types and the values are lists of data sources.
 538  Two extra keys may be returned: `"unclassified"` when a source's MIME type doesn't match any pattern
 539  and `"failed"` when a source cannot be processed (for example, a file path that doesn't exist).
 540  
 541  <a id="llm_messages_router"></a>
 542  
 543  # Module llm\_messages\_router
 544  
 545  <a id="llm_messages_router.LLMMessagesRouter"></a>
 546  
 547  ## LLMMessagesRouter
 548  
 549  Routes Chat Messages to different connections using a generative Language Model to perform classification.
 550  
This component can be used with general-purpose LLMs and with specialized LLMs for moderation like Llama Guard.

### Usage example

```python
from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
from haystack.components.routers.llm_messages_router import LLMMessagesRouter
from haystack.dataclasses import ChatMessage

# initialize a Chat Generator with a generative model for moderation
chat_generator = HuggingFaceAPIChatGenerator(
    api_type="serverless_inference_api",
    api_params={"model": "meta-llama/Llama-Guard-4-12B", "provider": "groq"},
)

router = LLMMessagesRouter(chat_generator=chat_generator,
                           output_names=["unsafe", "safe"],
                           output_patterns=["unsafe", "safe"])


print(router.run([ChatMessage.from_user("How to rob a bank?")]))

# {
#     'chat_generator_text': 'unsafe\nS2',
#     'unsafe': [
#         ChatMessage(
#             _role=<ChatRole.USER: 'user'>,
#             _content=[TextContent(text='How to rob a bank?')],
#             _name=None,
#             _meta={}
#         )
#     ]
# }
```
 585  
 586  <a id="llm_messages_router.LLMMessagesRouter.__init__"></a>
 587  
 588  #### LLMMessagesRouter.\_\_init\_\_
 589  
 590  ```python
 591  def __init__(chat_generator: ChatGenerator,
 592               output_names: list[str],
 593               output_patterns: list[str],
 594               system_prompt: Optional[str] = None)
 595  ```
 596  
 597  Initialize the LLMMessagesRouter component.
 598  
 599  **Arguments**:
 600  
 601  - `chat_generator`: A ChatGenerator instance which represents the LLM.
 602  - `output_names`: A list of output connection names. These can be used to connect the router to other
 603  components.
 604  - `output_patterns`: A list of regular expressions to be matched against the output of the LLM. Each pattern
 605  corresponds to an output name. Patterns are evaluated in order.
 606  When using moderation models, refer to the model card to understand the expected outputs.
 607  - `system_prompt`: An optional system prompt to customize the behavior of the LLM.
 608  For moderation models, refer to the model card for supported customization options.
 609  
 610  **Raises**:
 611  
- `ValueError`: If `output_names` or `output_patterns` is empty, or if the two lists differ in length.
 613  
 614  <a id="llm_messages_router.LLMMessagesRouter.warm_up"></a>
 615  
 616  #### LLMMessagesRouter.warm\_up
 617  
 618  ```python
 619  def warm_up()
 620  ```
 621  
 622  Warm up the underlying LLM.
 623  
 624  <a id="llm_messages_router.LLMMessagesRouter.run"></a>
 625  
 626  #### LLMMessagesRouter.run
 627  
 628  ```python
 629  def run(messages: list[ChatMessage]
 630          ) -> dict[str, Union[str, list[ChatMessage]]]
 631  ```
 632  
 633  Classify the messages based on LLM output and route them to the appropriate output connection.
 634  
 635  **Arguments**:
 636  
 637  - `messages`: A list of ChatMessages to be routed. Only user and assistant messages are supported.
 638  
 639  **Raises**:
 640  
 641  - `ValueError`: If messages is an empty list or contains messages with unsupported roles.
 642  - `RuntimeError`: If the component is not warmed up and the ChatGenerator has a warm_up method.
 643  
 644  **Returns**:
 645  
 646  A dictionary with the following keys:
 647  - "chat_generator_text": The text output of the LLM, useful for debugging.
- Each name listed in `output_names`: The list of messages that matched the corresponding pattern.
 649  - "unmatched": The messages that did not match any of the output patterns.
 650  
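The effect of pattern order on the returned dictionary can be sketched without calling an LLM. The example below assumes that each pattern is searched anywhere in the LLM text (`re.search`) and that the first matching pattern decides the route. `route_by_llm_output` is a hypothetical helper, and plain strings stand in for `ChatMessage` objects.

```python
import re
from typing import Union


def route_by_llm_output(
    llm_text: str,
    messages: list[str],
    output_names: list[str],
    output_patterns: list[str],
) -> dict[str, Union[str, list[str]]]:
    """Route messages to the output whose pattern first matches the LLM text."""
    result: dict[str, Union[str, list[str]]] = {"chat_generator_text": llm_text}
    for name, pattern in zip(output_names, output_patterns):
        if re.search(pattern, llm_text):
            result[name] = messages
            return result
    # No pattern matched the LLM text.
    result["unmatched"] = messages
    return result


messages = ["How to rob a bank?"]
flagged = route_by_llm_output("unsafe\nS2", messages, ["unsafe", "safe"], ["unsafe", "safe"])
# With the order reversed, "safe" is found inside "unsafe" and wins.
reversed_order = route_by_llm_output("unsafe\nS2", messages, ["safe", "unsafe"], ["safe", "unsafe"])
no_match = route_by_llm_output("I cannot tell", messages, ["unsafe", "safe"], ["unsafe", "safe"])
```

Under these assumptions, overlapping patterns such as `unsafe` and `safe` need the more specific one listed first, or anchors such as `^safe`.
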
 651  <a id="llm_messages_router.LLMMessagesRouter.to_dict"></a>
 652  
 653  #### LLMMessagesRouter.to\_dict
 654  
 655  ```python
 656  def to_dict() -> dict[str, Any]
 657  ```
 658  
 659  Serialize this component to a dictionary.
 660  
 661  **Returns**:
 662  
 663  The serialized component as a dictionary.
 664  
 665  <a id="llm_messages_router.LLMMessagesRouter.from_dict"></a>
 666  
 667  #### LLMMessagesRouter.from\_dict
 668  
 669  ```python
 670  @classmethod
 671  def from_dict(cls, data: dict[str, Any]) -> "LLMMessagesRouter"
 672  ```
 673  
 674  Deserialize this component from a dictionary.
 675  
 676  **Arguments**:
 677  
 678  - `data`: The dictionary representation of this component.
 679  
 680  **Returns**:
 681  
 682  The deserialized component instance.
 683  
 684  <a id="metadata_router"></a>
 685  
 686  # Module metadata\_router
 687  
 688  <a id="metadata_router.MetadataRouter"></a>
 689  
 690  ## MetadataRouter
 691  
 692  Routes documents or byte streams to different connections based on their metadata fields.
 693  
Specify the routing rules in the `__init__` method.
 695  If a document or byte stream does not match any of the rules, it's routed to a connection named "unmatched".
 696  
 697  
 698  ### Usage examples
 699  
 700  **Routing Documents by metadata:**
 701  ```python
 702  from haystack import Document
 703  from haystack.components.routers import MetadataRouter
 704  
 705  docs = [Document(content="Paris is the capital of France.", meta={"language": "en"}),
        Document(content="Berlin ist die Hauptstadt von Deutschland.", meta={"language": "de"})]

router = MetadataRouter(rules={"en": {"field": "meta.language", "operator": "==", "value": "en"}})

print(router.run(documents=docs))
# {'en': [Document(id=..., content: 'Paris is the capital of France.', meta: {'language': 'en'})],
# 'unmatched': [Document(id=..., content: 'Berlin ist die Hauptstadt von Deutschland.', meta: {'language': 'de'})]}
 713  ```
 714  
 715  **Routing ByteStreams by metadata:**
 716  ```python
 717  from haystack.dataclasses import ByteStream
 718  from haystack.components.routers import MetadataRouter
 719  
 720  streams = [
 721      ByteStream.from_string("Hello world", meta={"language": "en"}),
 722      ByteStream.from_string("Bonjour le monde", meta={"language": "fr"})
 723  ]
 724  
 725  router = MetadataRouter(
 726      rules={"english": {"field": "meta.language", "operator": "==", "value": "en"}},
 727      output_type=list[ByteStream]
 728  )
 729  
 730  result = router.run(documents=streams)
 731  # {'english': [ByteStream(...)], 'unmatched': [ByteStream(...)]}
 732  ```
 733  
 734  <a id="metadata_router.MetadataRouter.__init__"></a>
 735  
 736  #### MetadataRouter.\_\_init\_\_
 737  
 738  ```python
 739  def __init__(rules: dict[str, dict],
 740               output_type: type = list[Document]) -> None
 741  ```
 742  
 743  Initializes the MetadataRouter component.
 744  
 745  **Arguments**:
 746  
 747  - `rules`: A dictionary defining how to route documents or byte streams to output connections based on their
 748  metadata. Keys are output connection names, and values are dictionaries of
 749  [filtering expressions](https://docs.haystack.deepset.ai/docs/metadata-filtering) in Haystack.
 750  For example:
 751  ```python
 752  {
 753  "edge_1": {
 754      "operator": "AND",
 755      "conditions": [
 756          {"field": "meta.created_at", "operator": ">=", "value": "2023-01-01"},
 757          {"field": "meta.created_at", "operator": "<", "value": "2023-04-01"},
 758      ],
 759  },
 760  "edge_2": {
 761      "operator": "AND",
 762      "conditions": [
 763          {"field": "meta.created_at", "operator": ">=", "value": "2023-04-01"},
 764          {"field": "meta.created_at", "operator": "<", "value": "2023-07-01"},
 765      ],
 766  },
 767  "edge_3": {
 768      "operator": "AND",
 769      "conditions": [
 770          {"field": "meta.created_at", "operator": ">=", "value": "2023-07-01"},
 771          {"field": "meta.created_at", "operator": "<", "value": "2023-10-01"},
 772      ],
 773  },
 774  "edge_4": {
 775      "operator": "AND",
 776      "conditions": [
 777          {"field": "meta.created_at", "operator": ">=", "value": "2023-10-01"},
 778          {"field": "meta.created_at", "operator": "<", "value": "2024-01-01"},
 779      ],
 780  },
 781  }
 782  ```
- `output_type`: The type of the output produced. Lists of Documents or ByteStreams can be specified.
 784  
 785  <a id="metadata_router.MetadataRouter.run"></a>
 786  
 787  #### MetadataRouter.run
 788  
 789  ```python
 790  def run(documents: Union[list[Document], list[ByteStream]])
 791  ```
 792  
 793  Routes documents or byte streams to different connections based on their metadata fields.
 794  
 795  If a document or byte stream does not match any of the rules, it's routed to a connection named "unmatched".
 796  
 797  **Arguments**:
 798  
 799  - `documents`: A list of `Document` or `ByteStream` objects to be routed based on their metadata.
 800  
 801  **Returns**:
 802  
 803  A dictionary where the keys are the names of the output connections (including `"unmatched"`)
 804  and the values are lists of `Document` or `ByteStream` objects that matched the corresponding rules.
 805  
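How a rule dictionary selects items can be sketched with a small evaluator. This is a simplified illustration covering only comparison rules and the `AND`/`OR` logical operators. It works on plain metadata dictionaries, compares dates as ISO-formatted strings, and assumes that an item is sent to every rule it matches. `matches` and `route_by_meta` are hypothetical helpers, not Haystack's filter implementation.

```python
import operator
from typing import Any

COMPARATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(rule: dict[str, Any], meta: dict[str, Any]) -> bool:
    """Evaluate a comparison rule or a nested AND/OR rule against metadata."""
    if "conditions" in rule:
        results = [matches(condition, meta) for condition in rule["conditions"]]
        if rule["operator"] == "AND":
            return all(results)
        if rule["operator"] == "OR":
            return any(results)
        raise ValueError(f"Unsupported logical operator: {rule['operator']}")
    key = rule["field"].removeprefix("meta.")
    if key not in meta:
        return False
    # ISO dates in the same format compare correctly as strings.
    return COMPARATORS[rule["operator"]](meta[key], rule["value"])


def route_by_meta(items: list[dict[str, Any]], rules: dict[str, dict]) -> dict[str, list]:
    """Send each item to every matching rule, or to "unmatched" if none match."""
    routed: dict[str, list] = {name: [] for name in rules}
    routed["unmatched"] = []
    for item in items:
        matched_names = [name for name, rule in rules.items() if matches(rule, item)]
        for name in matched_names:
            routed[name].append(item)
        if not matched_names:
            routed["unmatched"].append(item)
    return routed


rules = {
    "edge_1": {
        "operator": "AND",
        "conditions": [
            {"field": "meta.created_at", "operator": ">=", "value": "2023-01-01"},
            {"field": "meta.created_at", "operator": "<", "value": "2023-04-01"},
        ],
    },
    "edge_2": {"field": "meta.language", "operator": "==", "value": "en"},
}
items = [{"created_at": "2023-02-15"}, {"language": "en"}, {"language": "de"}]
routed = route_by_meta(items, rules)
```
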
 806  <a id="metadata_router.MetadataRouter.to_dict"></a>
 807  
 808  #### MetadataRouter.to\_dict
 809  
 810  ```python
 811  def to_dict() -> dict[str, Any]
 812  ```
 813  
 814  Serialize this component to a dictionary.
 815  
 816  **Returns**:
 817  
 818  The serialized component as a dictionary.
 819  
 820  <a id="metadata_router.MetadataRouter.from_dict"></a>
 821  
 822  #### MetadataRouter.from\_dict
 823  
 824  ```python
 825  @classmethod
 826  def from_dict(cls, data: dict[str, Any]) -> "MetadataRouter"
 827  ```
 828  
 829  Deserialize this component from a dictionary.
 830  
 831  **Arguments**:
 832  
 833  - `data`: The dictionary representation of this component.
 834  
 835  **Returns**:
 836  
 837  The deserialized component instance.
 838  
 839  <a id="text_language_router"></a>
 840  
 841  # Module text\_language\_router
 842  
 843  <a id="text_language_router.TextLanguageRouter"></a>
 844  
 845  ## TextLanguageRouter
 846  
 847  Routes text strings to different output connections based on their language.
 848  
Provide a list of languages during initialization. If the text doesn't match any of the
specified languages, it is routed to the "unmatched" output connection.
To route documents based on their language, use the DocumentLanguageClassifier component,
followed by the MetadataRouter.
 853  
### Usage example

```python
from haystack import Pipeline, Document
from haystack.components.routers import TextLanguageRouter
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever

document_store = InMemoryDocumentStore()
document_store.write_documents([Document(content="Elvis Presley was an American singer and actor.")])

p = Pipeline()
p.add_component(instance=TextLanguageRouter(languages=["en"]), name="text_language_router")
p.add_component(instance=InMemoryBM25Retriever(document_store=document_store), name="retriever")
p.connect("text_language_router.en", "retriever.query")

result = p.run({"text_language_router": {"text": "Who was Elvis Presley?"}})
assert result["retriever"]["documents"][0].content == "Elvis Presley was an American singer and actor."

result = p.run({"text_language_router": {"text": "ένα ελληνικό κείμενο"}})
assert result["text_language_router"]["unmatched"] == "ένα ελληνικό κείμενο"
```

<a id="text_language_router.TextLanguageRouter.__init__"></a>

#### TextLanguageRouter.\_\_init\_\_

```python
def __init__(languages: Optional[list[str]] = None)
```

Initialize the TextLanguageRouter component.

**Arguments**:

- `languages`: A list of ISO language codes.
See the supported languages in [`langdetect` documentation](https://github.com/Mimino666/langdetect#languages).
If not specified, defaults to `["en"]`.

<a id="text_language_router.TextLanguageRouter.run"></a>

#### TextLanguageRouter.run

```python
def run(text: str) -> dict[str, str]
```

Routes the text string to an output connection based on its language.

If the text doesn't match any of the specified languages, it's routed to a connection named "unmatched".

**Arguments**:

- `text`: A text string to route.

**Raises**:

- `TypeError`: If the input is not a string.

**Returns**:

A dictionary in which the key is the language (or `"unmatched"`),
and the value is the text.

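
The contract described above (one key per call, `"unmatched"` as the fallback, `TypeError` for non-string input) can be sketched without the library. This is an illustration under stated assumptions, not the component's code: `route_text` and `toy_detect` are hypothetical names, and the toy detector only distinguishes Greek script from everything else.

```python
from typing import Callable, Optional


def toy_detect(text: str) -> Optional[str]:
    """Hypothetical detector: 'el' if any Greek character is present, else 'en'."""
    if any("\u0370" <= ch <= "\u03ff" for ch in text):
        return "el"
    return "en"


def route_text(text: str, languages: list[str],
               detect: Callable[[str], Optional[str]] = toy_detect) -> dict[str, str]:
    """Return {language: text} for a configured language, else {"unmatched": text}."""
    if not isinstance(text, str):
        raise TypeError("route_text expects a str as input.")
    detected = detect(text)
    if detected in languages:
        return {detected: text}
    return {"unmatched": text}


english = route_text("Who was Elvis Presley?", languages=["en"])
greek = route_text("ένα ελληνικό κείμενο", languages=["en"])
```

With `languages=["en"]`, the English question is returned under the `"en"` key and the Greek text under `"unmatched"`, mirroring the usage example for this component.
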
<a id="transformers_text_router"></a>

# Module transformers\_text\_router

<a id="transformers_text_router.TransformersTextRouter"></a>

## TransformersTextRouter

Routes the text strings to different connections based on a category label.

The labels are specific to each model and can be found in its description on Hugging Face.

### Usage example

```python
from haystack.core.pipeline import Pipeline
from haystack.components.routers import TransformersTextRouter
from haystack.components.builders import PromptBuilder
from haystack.components.generators import HuggingFaceLocalGenerator

p = Pipeline()
p.add_component(
    instance=TransformersTextRouter(model="papluca/xlm-roberta-base-language-detection"),
    name="text_router"
)
p.add_component(
    instance=PromptBuilder(template="Answer the question: {{query}}\nAnswer:"),
    name="english_prompt_builder"
)
p.add_component(
    instance=PromptBuilder(template="Beantworte die Frage: {{query}}\nAntwort:"),
    name="german_prompt_builder"
)

p.add_component(
    instance=HuggingFaceLocalGenerator(model="DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1"),
    name="german_llm"
)
p.add_component(
    instance=HuggingFaceLocalGenerator(model="microsoft/Phi-3-mini-4k-instruct"),
    name="english_llm"
)

p.connect("text_router.en", "english_prompt_builder.query")
p.connect("text_router.de", "german_prompt_builder.query")
p.connect("english_prompt_builder.prompt", "english_llm.prompt")
p.connect("german_prompt_builder.prompt", "german_llm.prompt")

# English Example
print(p.run({"text_router": {"text": "What is the capital of Germany?"}}))

# German Example
print(p.run({"text_router": {"text": "Was ist die Hauptstadt von Deutschland?"}}))
```

<a id="transformers_text_router.TransformersTextRouter.__init__"></a>

#### TransformersTextRouter.\_\_init\_\_

```python
def __init__(model: str,
             labels: Optional[list[str]] = None,
             device: Optional[ComponentDevice] = None,
             token: Optional[Secret] = Secret.from_env_var(
                 ["HF_API_TOKEN", "HF_TOKEN"], strict=False),
             huggingface_pipeline_kwargs: Optional[dict[str, Any]] = None)
```

Initializes the TransformersTextRouter component.

**Arguments**:

- `model`: The name or path of a Hugging Face model for text classification.
- `labels`: The list of labels. If not provided, the component fetches the labels
from the model configuration file hosted on the Hugging Face Hub using
`transformers.AutoConfig.from_pretrained`.
- `device`: The device for loading the model. If `None`, automatically selects the default device.
If a device or device map is specified in `huggingface_pipeline_kwargs`, it overrides this parameter.
- `token`: The API token used to download private models from Hugging Face.
By default, it is read from the `HF_API_TOKEN` or `HF_TOKEN` environment variable.
To generate a token, run `transformers-cli login`.
- `huggingface_pipeline_kwargs`: A dictionary of keyword arguments for initializing the Hugging Face
text classification pipeline.

<a id="transformers_text_router.TransformersTextRouter.warm_up"></a>

#### TransformersTextRouter.warm\_up

```python
def warm_up()
```

Initializes the component by loading the Hugging Face text classification pipeline.
`run` raises a `RuntimeError` if `warm_up` has not been called.

<a id="transformers_text_router.TransformersTextRouter.to_dict"></a>

#### TransformersTextRouter.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Returns**:

Dictionary with serialized data.

<a id="transformers_text_router.TransformersTextRouter.from_dict"></a>

#### TransformersTextRouter.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "TransformersTextRouter"
```

Deserializes the component from a dictionary.

**Arguments**:

- `data`: Dictionary to deserialize from.

**Returns**:

Deserialized component.

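
A round trip through `to_dict` and `from_dict` is expected to yield an equivalent component. The sketch below illustrates that pattern with a hypothetical `ToyRouter` class. The `type` and `init_parameters` keys are an assumption about how the serialized dictionary is laid out, not something this reference specifies, and the model name is a placeholder.

```python
import json
from typing import Any, Optional


class ToyRouter:
    """Hypothetical stand-in for a router; not part of the library."""

    def __init__(self, model: str, labels: Optional[list[str]] = None):
        self.model = model
        self.labels = labels

    def to_dict(self) -> dict[str, Any]:
        # Assumed layout: the class path plus the constructor arguments.
        return {
            "type": f"{type(self).__module__}.{type(self).__name__}",
            "init_parameters": {"model": self.model, "labels": self.labels},
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ToyRouter":
        # Rebuild the instance from the stored constructor arguments.
        return cls(**data["init_parameters"])


original = ToyRouter(model="some-org/some-model", labels=["en", "de"])
payload = json.dumps(original.to_dict())          # for example, to store in a file
restored = ToyRouter.from_dict(json.loads(payload))
```

Keeping only constructor arguments in the dictionary means the result is JSON-friendly and that anything loaded at warm-up time is not part of the serialized state in this sketch.
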
<a id="transformers_text_router.TransformersTextRouter.run"></a>

#### TransformersTextRouter.run

```python
def run(text: str) -> dict[str, str]
```

Routes the text string to a connection based on its category label.

**Arguments**:

- `text`: A string of text to route.

**Raises**:

- `TypeError`: If the input is not a `str`.
- `RuntimeError`: If the pipeline has not been loaded because `warm_up()` was not called first.

**Returns**:

A dictionary with the label as key and the text as value.

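
The interplay between `warm_up` and `run` can be sketched as follows. This is a simplified illustration, not the component's source: `ToyTextRouter` and `fake_classifier` are hypothetical, and the classifier is a keyword check that imitates the list-of-dicts output (`label`, `score`) of a text classification pipeline.

```python
from typing import Callable, Optional

Classifier = Callable[[list[str]], list[dict]]


def fake_classifier(texts: list[str]) -> list[dict]:
    """Hypothetical model: labels a text 'de' if it contains the word 'ist'."""
    results = []
    for text in texts:
        label = "de" if " ist " in f" {text.lower()} " else "en"
        results.append({"label": label, "score": 1.0})
    return results


class ToyTextRouter:
    def __init__(self) -> None:
        self._classifier: Optional[Classifier] = None  # nothing loaded yet

    def warm_up(self) -> None:
        # Load the (fake) model once; repeated calls are no-ops.
        if self._classifier is None:
            self._classifier = fake_classifier

    def run(self, text: str) -> dict[str, str]:
        if self._classifier is None:
            raise RuntimeError("The component was not warmed up. Call warm_up() first.")
        if not isinstance(text, str):
            raise TypeError("ToyTextRouter expects a str as input.")
        prediction = self._classifier([text])
        return {prediction[0]["label"]: text}


router = ToyTextRouter()
router.warm_up()
german = router.run("Was ist die Hauptstadt von Deutschland?")
english = router.run("What is the capital of Germany?")
```

Each call produces a dictionary with exactly one key, so only the connection named after the predicted label receives the text.
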
<a id="zero_shot_text_router"></a>

# Module zero\_shot\_text\_router

<a id="zero_shot_text_router.TransformersZeroShotTextRouter"></a>

## TransformersZeroShotTextRouter

Routes the text strings to different connections based on a category label.

Specify the set of labels for categorization when initializing the component.

### Usage example

```python
from haystack import Document
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.core.pipeline import Pipeline
from haystack.components.routers import TransformersZeroShotTextRouter
from haystack.components.embedders import SentenceTransformersTextEmbedder, SentenceTransformersDocumentEmbedder
from haystack.components.retrievers import InMemoryEmbeddingRetriever

document_store = InMemoryDocumentStore()
doc_embedder = SentenceTransformersDocumentEmbedder(model="intfloat/e5-base-v2")
doc_embedder.warm_up()
docs = [
    Document(
        content="Germany, officially the Federal Republic of Germany, is a country in the western region of "
        "Central Europe. The nation's capital and most populous city is Berlin and its main financial centre "
        "is Frankfurt; the largest urban area is the Ruhr."
    ),
    Document(
        content="France, officially the French Republic, is a country located primarily in Western Europe. "
        "France is a unitary semi-presidential republic with its capital in Paris, the country's largest city "
        "and main cultural and commercial centre; other major urban areas include Marseille, Lyon, Toulouse, "
        "Lille, Bordeaux, Strasbourg, Nantes and Nice."
    )
]
docs_with_embeddings = doc_embedder.run(docs)
document_store.write_documents(docs_with_embeddings["documents"])

p = Pipeline()
p.add_component(instance=TransformersZeroShotTextRouter(labels=["passage", "query"]), name="text_router")
p.add_component(
    instance=SentenceTransformersTextEmbedder(model="intfloat/e5-base-v2", prefix="passage: "),
    name="passage_embedder"
)
p.add_component(
    instance=SentenceTransformersTextEmbedder(model="intfloat/e5-base-v2", prefix="query: "),
    name="query_embedder"
)
p.add_component(
    instance=InMemoryEmbeddingRetriever(document_store=document_store),
    name="query_retriever"
)
p.add_component(
    instance=InMemoryEmbeddingRetriever(document_store=document_store),
    name="passage_retriever"
)

p.connect("text_router.passage", "passage_embedder.text")
p.connect("passage_embedder.embedding", "passage_retriever.query_embedding")
p.connect("text_router.query", "query_embedder.text")
p.connect("query_embedder.embedding", "query_retriever.query_embedding")

# Query Example
p.run({"text_router": {"text": "What is the capital of Germany?"}})

# Passage Example
p.run({
    "text_router": {
        "text": "The United Kingdom of Great Britain and Northern Ireland, commonly known as the "
        "United Kingdom (UK) or Britain, is a country in Northwestern Europe, off the north-western coast of "
        "the continental mainland."
    }
})
```

<a id="zero_shot_text_router.TransformersZeroShotTextRouter.__init__"></a>

#### TransformersZeroShotTextRouter.\_\_init\_\_

```python
def __init__(labels: list[str],
             multi_label: bool = False,
             model: str = "MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33",
             device: Optional[ComponentDevice] = None,
             token: Optional[Secret] = Secret.from_env_var(
                 ["HF_API_TOKEN", "HF_TOKEN"], strict=False),
             huggingface_pipeline_kwargs: Optional[dict[str, Any]] = None)
```

Initializes the TransformersZeroShotTextRouter component.

**Arguments**:

- `labels`: The set of labels to use for classification. Can be a single label,
a string of comma-separated labels, or a list of labels.
- `multi_label`: Indicates if multiple labels can be true.
If `False`, label scores are normalized so their sum equals 1 for each sequence.
If `True`, the labels are considered independent and probabilities are normalized for each candidate by
doing a softmax of the entailment score vs. the contradiction score.
- `model`: The name or path of a Hugging Face model for zero-shot text classification.
- `device`: The device for loading the model. If `None`, automatically selects the default device.
If a device or device map is specified in `huggingface_pipeline_kwargs`, it overrides this parameter.
- `token`: The API token used to download private models from Hugging Face.
By default, it is read from the `HF_API_TOKEN` or `HF_TOKEN` environment variable.
To generate a token, run `transformers-cli login`.
- `huggingface_pipeline_kwargs`: A dictionary of keyword arguments for initializing the Hugging Face
zero-shot text classification pipeline.

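
The two normalization modes described for `multi_label` can be made concrete with a little arithmetic. The sketch below is an illustration only: the logits are made-up numbers, and `single_label_scores` and `multi_label_scores` are hypothetical helpers that follow the description above rather than the library's code.

```python
import math


def softmax(values: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up entailment/contradiction logits for two candidate labels.
logits = {
    "passage": {"entailment": 2.0, "contradiction": -1.0},
    "query": {"entailment": 0.5, "contradiction": 0.5},
}


def single_label_scores(logits: dict) -> dict[str, float]:
    """multi_label=False: softmax of entailment logits across labels (sums to 1)."""
    labels = list(logits)
    probs = softmax([logits[label]["entailment"] for label in labels])
    return dict(zip(labels, probs))


def multi_label_scores(logits: dict) -> dict[str, float]:
    """multi_label=True: per label, softmax of contradiction vs. entailment."""
    return {
        label: softmax([scores["contradiction"], scores["entailment"]])[1]
        for label, scores in logits.items()
    }


single = single_label_scores(logits)
multi = multi_label_scores(logits)
```

In the single-label case the scores compete with each other and always sum to 1. In the multi-label case each label is judged on its own, so the scores need not sum to 1; with these numbers, `"query"` gets exactly 0.5 because its two logits are equal.
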
<a id="zero_shot_text_router.TransformersZeroShotTextRouter.warm_up"></a>

#### TransformersZeroShotTextRouter.warm\_up

```python
def warm_up()
```

Initializes the component by loading the Hugging Face zero-shot text classification pipeline.
`run` raises a `RuntimeError` if `warm_up` has not been called.

<a id="zero_shot_text_router.TransformersZeroShotTextRouter.to_dict"></a>

#### TransformersZeroShotTextRouter.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Returns**:

Dictionary with serialized data.

<a id="zero_shot_text_router.TransformersZeroShotTextRouter.from_dict"></a>

#### TransformersZeroShotTextRouter.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "TransformersZeroShotTextRouter"
```

Deserializes the component from a dictionary.

**Arguments**:

- `data`: Dictionary to deserialize from.

**Returns**:

Deserialized component.

<a id="zero_shot_text_router.TransformersZeroShotTextRouter.run"></a>

#### TransformersZeroShotTextRouter.run

```python
def run(text: str) -> dict[str, str]
```

Routes the text string to a connection based on its category label.

**Arguments**:

- `text`: A string of text to route.

**Raises**:

- `TypeError`: If the input is not a `str`.
- `RuntimeError`: If the pipeline has not been loaded because `warm_up()` was not called first.

**Returns**:

A dictionary with the label as key and the text as value.