/ docs-website / reference / experiments-api / experimental_preprocessors_api.md
experimental_preprocessors_api.md
 1  ---
 2  title: "Preprocessors"
 3  id: experimental-preprocessors-api
 4  description: "Pipelines wrapped as components."
 5  slug: "/experimental-preprocessors-api"
 6  ---
 7  
 8  <a id="haystack_experimental.components.preprocessors.md_header_level_inferrer"></a>
 9  
10  ## Module haystack\_experimental.components.preprocessors.md\_header\_level\_inferrer
11  
12  <a id="haystack_experimental.components.preprocessors.md_header_level_inferrer.MarkdownHeaderLevelInferrer"></a>
13  
14  ### MarkdownHeaderLevelInferrer
15  
16  Infers and rewrites header levels in Markdown text to normalize hierarchy.
17  
18      First header → Always becomes level 1 (#)
19      Subsequent headers → Level increases if no content between headers, stays same if content exists
20      Maximum level → Capped at 6 (######)
21  
22      ### Usage example
23      ```python
24      from haystack import Document
25      from haystack_experimental.components.preprocessors import MarkdownHeaderLevelInferrer
26  
27      # Create a document with uniform header levels
28      text = "## Title
29  ## Subheader
30  Section
31  ## Subheader
32  More Content"
33      doc = Document(content=text)
34  
35      # Initialize the inferrer and process the document
36      inferrer = MarkdownHeaderLevelInferrer()
37      result = inferrer.run([doc])
38  
39      # The headers are now normalized with proper hierarchy
40      print(result["documents"][0].content)
41      > # Title
42  ## Subheader
43  Section
44  ## Subheader
45  More Content
46      ```
47  
48  <a id="haystack_experimental.components.preprocessors.md_header_level_inferrer.MarkdownHeaderLevelInferrer.__init__"></a>
49  
50  #### MarkdownHeaderLevelInferrer.\_\_init\_\_
51  
52  ```python
53  def __init__()
54  ```
55  
56  Initializes the MarkdownHeaderLevelInferrer.
57  
58  <a id="haystack_experimental.components.preprocessors.md_header_level_inferrer.MarkdownHeaderLevelInferrer.run"></a>
59  
60  #### MarkdownHeaderLevelInferrer.run
61  
62  ```python
63  @component.output_types(documents=list[Document])
64  def run(documents: list[Document]) -> dict
65  ```
66  
67  Infers and rewrites the header levels in the content for documents that use uniform header levels.
68  
69  **Arguments**:
70  
71  - `documents`: list of Document objects to process.
72  
73  **Returns**:
74  
75  dict: a dictionary with the key 'documents' containing the processed Document objects.
76