---
title: "Markitdown"
id: integrations-markitdown
description: "Markitdown integration for Haystack"
slug: "/integrations-markitdown"
---


## haystack_integrations.components.converters.markitdown.markitdown_converter

### MarkItDownConverter

Converts files to Haystack Documents using [MarkItDown](https://github.com/microsoft/markitdown).

MarkItDown is a Microsoft library that converts many file formats to Markdown,
including PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx), HTML, images,
audio, and more. All processing is performed locally.

### Usage example

```python
from haystack_integrations.components.converters.markitdown import MarkItDownConverter

converter = MarkItDownConverter()
result = converter.run(sources=["document.pdf", "report.docx"])
documents = result["documents"]
```

#### __init__

```python
__init__(store_full_path: bool = False) -> None
```

Initializes the MarkItDownConverter.

**Parameters:**

- **store_full_path** (<code>bool</code>) – If `True`, the full file path is stored in the Document metadata.
  If `False`, only the file name is stored. Defaults to `False`.
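
The effect of `store_full_path` can be illustrated with a small standalone sketch. It does not call the converter: `file_path_meta` is a hypothetical helper, and the `file_path` metadata key is an assumption made for illustration rather than something this reference confirms.

```python
from __future__ import annotations

from pathlib import Path


def file_path_meta(source: str | Path, store_full_path: bool = False) -> dict[str, str]:
    """Hypothetical helper that mimics the documented effect of `store_full_path`."""
    path = Path(source)
    # Keep the whole path when requested, otherwise only the final component.
    # The "file_path" key name is an assumption, not taken from the reference.
    return {"file_path": str(path) if store_full_path else path.name}


# With the default, only the file name is kept.
name_only = file_path_meta("reports/2024/summary.pdf")

# With store_full_path=True, the path is kept as given.
full_path = file_path_meta("reports/2024/summary.pdf", store_full_path=True)

print(name_only)
print(full_path)
```

Keeping only the file name is the default. It avoids leaking local directory layouts into Document metadata, while `store_full_path=True` may be preferable when files in different folders share a name.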

#### run

```python
run(
    sources: list[str | Path | ByteStream],
    meta: dict[str, Any] | list[dict[str, Any]] | None = None,
) -> dict[str, list[Document]]
```

Converts files to Documents using MarkItDown.

**Parameters:**

- **sources** (<code>list\[str | Path | ByteStream\]</code>) – List of file paths or ByteStream objects to convert.
- **meta** (<code>dict\[str, Any\] | list\[dict\[str, Any\]\] | None</code>) – Optional metadata to attach to the Documents. Can be a single dict
  applied to all Documents, or a list of dicts aligned with `sources`.

**Returns:**

- <code>dict\[str, list\[Document\]\]</code> – A dictionary with key `documents` containing the converted Documents.
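
The two accepted shapes of `meta` can be sketched in plain Python as follows. This is an illustration of the documented semantics, not the converter's actual code: `align_meta` is a hypothetical helper, and raising `ValueError` when the list length differs from the number of sources is an assumption.

```python
from __future__ import annotations

from typing import Any


def align_meta(
    meta: dict[str, Any] | list[dict[str, Any]] | None,
    sources_count: int,
) -> list[dict[str, Any]]:
    """Hypothetical helper: expand `meta` into one dict per source."""
    if meta is None:
        # No metadata given: every Document starts with an empty dict.
        return [{} for _ in range(sources_count)]
    if isinstance(meta, dict):
        # A single dict applies to all sources; copy it so the dicts are independent.
        return [dict(meta) for _ in range(sources_count)]
    if len(meta) != sources_count:
        # Assumed behavior: a list must line up one-to-one with `sources`.
        raise ValueError("The length of the meta list must match the number of sources.")
    return [dict(m) for m in meta]


sources = ["document.pdf", "report.docx"]

# One dict shared by every source.
shared = align_meta({"project": "demo"}, len(sources))

# One dict per source, in the same order as `sources`.
per_source = align_meta([{"lang": "en"}, {"lang": "de"}], len(sources))

print(shared)
print(per_source)
```

In a call to `run`, the first form corresponds to `meta={"project": "demo"}` and the second to `meta=[{"lang": "en"}, {"lang": "de"}]` with two entries in `sources`.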