---
title: "Image Converters"
id: image-converters-api
description: "Various converters to transform image data from one format to another."
slug: "/image-converters-api"
---

<a id="document_to_image"></a>

## Module document\_to\_image

<a id="document_to_image.DocumentToImageContent"></a>

### DocumentToImageContent

Converts documents sourced from PDF and image files into ImageContents.

This component processes a list of documents and extracts visual content from supported file formats, converting
them into ImageContents that can be used for multimodal AI tasks. It handles both direct image files and PDF
documents by extracting specific pages as images.

Documents are expected to have metadata containing:
- The `file_path_meta_field` key with a valid file path that exists when combined with `root_path`
- A supported image format (MIME type must be one of the supported image types)
- For PDF files, a `page_number` key specifying which page to extract

### Usage example
```python
from haystack import Document
from haystack.components.converters.image.document_to_image import DocumentToImageContent

converter = DocumentToImageContent(
    file_path_meta_field="file_path",
    root_path="/data/files",
    detail="high",
    size=(800, 600)
)

documents = [
    Document(content="Optional description of image.jpg", meta={"file_path": "image.jpg"}),
    Document(content="Text content of page 1 of doc.pdf", meta={"file_path": "doc.pdf", "page_number": 1})
]

result = converter.run(documents)
image_contents = result["image_contents"]
# [ImageContent(
#    base64_image='/9j/4A...', mime_type='image/jpeg', detail='high', meta={'file_path': 'image.jpg'}
#  ),
#  ImageContent(
#    base64_image='/9j/4A...', mime_type='image/jpeg', detail='high',
#    meta={'page_number': 1, 'file_path': 'doc.pdf'}
#  )]
```

<a id="document_to_image.DocumentToImageContent.__init__"></a>

#### DocumentToImageContent.\_\_init\_\_

```python
def __init__(*,
             file_path_meta_field: str = "file_path",
             root_path: Optional[str] = None,
             detail: Optional[Literal["auto", "high", "low"]] = None,
             size: Optional[tuple[int, int]] = None)
```

Initialize the DocumentToImageContent component.

**Arguments**:

- `file_path_meta_field`: The metadata field in the Document that contains the file path to the image or PDF.
- `root_path`: The root directory path where document files are located. If provided, file paths in
document metadata will be resolved relative to this path. If None, file paths are treated as absolute paths.
- `detail`: Optional detail level of the image (only supported by OpenAI). Can be "auto", "high", or "low".
This will be passed to the created ImageContent objects.
- `size`: If provided, resizes the image to fit within the specified dimensions (width, height) while
maintaining aspect ratio. This reduces file size, memory usage, and processing time, which is beneficial
when working with models that have resolution constraints or when transmitting images to remote services.

<a id="document_to_image.DocumentToImageContent.run"></a>

#### DocumentToImageContent.run

```python
@component.output_types(image_contents=list[Optional[ImageContent]])
def run(documents: list[Document]) -> dict[str, list[Optional[ImageContent]]]
```

Convert documents with image or PDF sources into ImageContent objects.

This method processes the input documents, extracting images from supported file formats and converting them
into ImageContent objects.

**Arguments**:

- `documents`: A list of documents to process. Each document's metadata must contain at minimum
the key named by `file_path_meta_field`. PDF documents additionally require a `page_number` key to specify which
page to convert.
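
The metadata requirements above can be checked before calling `run`, so that problems show up as a readable list instead of an exception partway through a pipeline. The sketch below is a hypothetical pre-flight helper and not part of Haystack: the name `metadata_problems`, the extension-based MIME guess via the standard `mimetypes` module, and the message texts are all assumptions made for illustration.

```python
import mimetypes
from pathlib import Path
from typing import Any, Optional


def metadata_problems(
    meta: dict[str, Any],
    file_path_meta_field: str = "file_path",
    root_path: Optional[str] = None,
) -> list[str]:
    """Return human-readable problems; an empty list means the metadata looks usable."""
    file_path = meta.get(file_path_meta_field)
    if file_path is None:
        return [f"missing '{file_path_meta_field}'"]

    problems = []
    # Resolve relative to root_path when one is given, as the constructor docs describe.
    resolved = Path(root_path or "") / file_path
    if not resolved.is_file():
        problems.append(f"file not found: {resolved}")

    # Assumption of this sketch: the MIME type is guessed from the file extension only.
    mime_type, _ = mimetypes.guess_type(resolved.name)
    if mime_type == "application/pdf" and "page_number" not in meta:
        problems.append("PDF source without 'page_number'")
    return problems


print(metadata_problems({}))
print(metadata_problems({"file_path": "no_such_dir/doc.pdf"}))
```

The component itself may apply further checks, for example against its list of supported image MIME types, that this sketch does not reproduce.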

**Raises**:

- `ValueError`: If any document is missing the required metadata keys, has an invalid file path, or has an unsupported
MIME type. The error message will specify which document and what information is missing or incorrect.

**Returns**:

Dictionary containing one key:
- "image_contents": ImageContents created from the processed documents. These contain base64-encoded image
data and metadata. The order corresponds to the order of the input documents.

<a id="file_to_document"></a>

## Module file\_to\_document

<a id="file_to_document.ImageFileToDocument"></a>

### ImageFileToDocument

Converts image file references into empty Document objects with associated metadata.

This component is useful in pipelines where image file paths need to be wrapped in `Document` objects to be
processed by downstream components such as the `SentenceTransformersImageDocumentEmbedder`.

It does **not** extract any content from the image files. Instead, it creates `Document` objects with `None` as
their content and attaches metadata such as file path and any user-provided values.

### Usage example
```python
from haystack.components.converters.image import ImageFileToDocument

converter = ImageFileToDocument()

sources = ["image.jpg", "another_image.png"]

result = converter.run(sources=sources)
documents = result["documents"]

print(documents)

# [Document(id=..., meta: {'file_path': 'image.jpg'}),
# Document(id=..., meta: {'file_path': 'another_image.png'})]
```

<a id="file_to_document.ImageFileToDocument.__init__"></a>

#### ImageFileToDocument.\_\_init\_\_

```python
def __init__(*, store_full_path: bool = False)
```

Initialize the ImageFileToDocument component.

**Arguments**:

- `store_full_path`: If True, the full path of the file is stored in the metadata of the document.
If False, only the file name is stored.

<a id="file_to_document.ImageFileToDocument.run"></a>

#### ImageFileToDocument.run

```python
@component.output_types(documents=list[Document])
def run(
    *,
    sources: list[Union[str, Path, ByteStream]],
    meta: Optional[Union[dict[str, Any], list[dict[str, Any]]]] = None
) -> dict[str, list[Document]]
```

Convert image files into empty Document objects with metadata.

This method accepts image file references (as file paths or ByteStreams) and creates `Document` objects
without content. These documents are enriched with metadata derived from the input source and optional
user-provided metadata.

**Arguments**:

- `sources`: List of file paths or ByteStream objects to convert.
- `meta`: Optional metadata to attach to the documents.
This value can be a list of dictionaries or a single dictionary.
If it's a single dictionary, its content is added to the metadata of all produced documents.
If it's a list, its length must match the number of sources, as they are zipped together.
For ByteStream objects, their `meta` is added to the output documents.

**Returns**:

A dictionary containing:
- `documents`: A list of `Document` objects with empty content and associated metadata.

<a id="file_to_image"></a>

## Module file\_to\_image

<a id="file_to_image.ImageFileToImageContent"></a>

### ImageFileToImageContent

Converts image files to ImageContent objects.

### Usage example
```python
from haystack.components.converters.image import ImageFileToImageContent

converter = ImageFileToImageContent()

sources = ["image.jpg", "another_image.png"]

image_contents = converter.run(sources=sources)["image_contents"]
print(image_contents)

# [ImageContent(base64_image='...',
#               mime_type='image/jpeg',
#               detail=None,
#               meta={'file_path': 'image.jpg'}),
#  ...]
```

<a id="file_to_image.ImageFileToImageContent.__init__"></a>

#### ImageFileToImageContent.\_\_init\_\_

```python
def __init__(*,
             detail: Optional[Literal["auto", "high", "low"]] = None,
             size: Optional[tuple[int, int]] = None)
```

Create the ImageFileToImageContent component.

**Arguments**:

- `detail`: Optional detail level of the image (only supported by OpenAI). One of "auto", "high", or "low".
This will be passed to the created ImageContent objects.
- `size`: If provided, resizes the image to fit within the specified dimensions (width, height) while
maintaining aspect ratio. This reduces file size, memory usage, and processing time, which is beneficial
when working with models that have resolution constraints or when transmitting images to remote services.

<a id="file_to_image.ImageFileToImageContent.run"></a>

#### ImageFileToImageContent.run

```python
@component.output_types(image_contents=list[ImageContent])
def run(sources: list[Union[str, Path, ByteStream]],
        meta: Optional[Union[dict[str, Any], list[dict[str, Any]]]] = None,
        *,
        detail: Optional[Literal["auto", "high", "low"]] = None,
        size: Optional[tuple[int, int]] = None) -> dict[str, list[ImageContent]]
```

Converts files to ImageContent objects.

**Arguments**:

- `sources`: List of file paths or ByteStream objects to convert.
- `meta`: Optional metadata to attach to the ImageContent objects.
This value can be a list of dictionaries or a single dictionary.
If it's a single dictionary, its content is added to the metadata of all produced ImageContent objects.
If it's a list, its length must match the number of sources as they're zipped together.
For ByteStream objects, their `meta` is added to the output ImageContent objects.
- `detail`: Optional detail level of the image (only supported by OpenAI). One of "auto", "high", or "low".
This will be passed to the created ImageContent objects.
If not provided, the detail level will be the one set in the constructor.
- `size`: If provided, resizes the image to fit within the specified dimensions (width, height) while
maintaining aspect ratio. This reduces file size, memory usage, and processing time, which is beneficial
when working with models that have resolution constraints or when transmitting images to remote services.
If not provided, the size value will be the one set in the constructor.

**Returns**:

A dictionary with the following keys:
- `image_contents`: A list of ImageContent objects.

<a id="pdf_to_image"></a>

## Module pdf\_to\_image

<a id="pdf_to_image.PDFToImageContent"></a>

### PDFToImageContent

Converts PDF files to ImageContent objects.

### Usage example
```python
from haystack.components.converters.image import PDFToImageContent

converter = PDFToImageContent()

sources = ["file.pdf", "another_file.pdf"]

image_contents = converter.run(sources=sources)["image_contents"]
print(image_contents)

# [ImageContent(base64_image='...',
#               mime_type='application/pdf',
#               detail=None,
#               meta={'file_path': 'file.pdf', 'page_number': 1}),
#  ...]
```

<a id="pdf_to_image.PDFToImageContent.__init__"></a>

#### PDFToImageContent.\_\_init\_\_

```python
def __init__(*,
             detail: Optional[Literal["auto", "high", "low"]] = None,
             size: Optional[tuple[int, int]] = None,
             page_range: Optional[list[Union[str, int]]] = None)
```

Create the PDFToImageContent component.

**Arguments**:

- `detail`: Optional detail level of the image (only supported by OpenAI). One of "auto", "high", or "low".
This will be passed to the created ImageContent objects.
- `size`: If provided, resizes the image to fit within the specified dimensions (width, height) while
maintaining aspect ratio. This reduces file size, memory usage, and processing time, which is beneficial
when working with models that have resolution constraints or when transmitting images to remote services.
- `page_range`: List of page numbers and/or page ranges to convert to images. Page numbers start at 1.
If None, all pages in the PDF will be converted. Pages outside the valid range (1 to number of pages)
will be skipped with a warning. For example, page_range=[1, 3] will convert only the first and third
pages of the document. It also accepts printable range strings, e.g.: ['1-3', '5', '8', '10-12']
will convert pages 1, 2, 3, 5, 8, 10, 11, 12.

<a id="pdf_to_image.PDFToImageContent.run"></a>

#### PDFToImageContent.run

```python
@component.output_types(image_contents=list[ImageContent])
def run(
    sources: list[Union[str, Path, ByteStream]],
    meta: Optional[Union[dict[str, Any], list[dict[str, Any]]]] = None,
    *,
    detail: Optional[Literal["auto", "high", "low"]] = None,
    size: Optional[tuple[int, int]] = None,
    page_range: Optional[list[Union[str, int]]] = None
) -> dict[str, list[ImageContent]]
```

Converts files to ImageContent objects.

**Arguments**:

- `sources`: List of file paths or ByteStream objects to convert.
- `meta`: Optional metadata to attach to the ImageContent objects.
This value can be a list of dictionaries or a single dictionary.
If it's a single dictionary, its content is added to the metadata of all produced ImageContent objects.
If it's a list, its length must match the number of sources as they're zipped together.
For ByteStream objects, their `meta` is added to the output ImageContent objects.
- `detail`: Optional detail level of the image (only supported by OpenAI). One of "auto", "high", or "low".
This will be passed to the created ImageContent objects.
If not provided, the detail level will be the one set in the constructor.
- `size`: If provided, resizes the image to fit within the specified dimensions (width, height) while
maintaining aspect ratio. This reduces file size, memory usage, and processing time, which is beneficial
when working with models that have resolution constraints or when transmitting images to remote services.
If not provided, the size value will be the one set in the constructor.
- `page_range`: List of page numbers and/or page ranges to convert to images. Page numbers start at 1.
If None, all pages in the PDF will be converted. Pages outside the valid range (1 to number of pages)
will be skipped with a warning. For example, page_range=[1, 3] will convert only the first and third
pages of the document. It also accepts printable range strings, e.g.: ['1-3', '5', '8', '10-12']
will convert pages 1, 2, 3, 5, 8, 10, 11, 12.
If not provided, the page_range value will be the one set in the constructor.

**Returns**:

A dictionary with the following keys:
- `image_contents`: A list of ImageContent objects.
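
The `page_range` formats described above can be illustrated with a small expansion routine. This is a minimal sketch of how mixed integers and range strings might be expanded into page numbers, not the parser Haystack uses. The function name `expand_page_range` and its `num_pages` parameter are hypothetical and introduced only for this example.

```python
import logging
from typing import Union

logger = logging.getLogger(__name__)


def expand_page_range(page_range: list[Union[str, int]], num_pages: int) -> list[int]:
    """Expand entries such as 3 or '10-12' into sorted, de-duplicated 1-based page numbers."""
    pages: set[int] = set()
    for entry in page_range:
        text = str(entry).strip()
        if "-" in text:
            # A range string such as '1-3' covers both endpoints.
            start_text, end_text = text.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(text)
        for page in range(start, end + 1):
            if 1 <= page <= num_pages:
                pages.add(page)
            else:
                # Out-of-range pages are skipped with a warning, mirroring the documented behavior.
                logger.warning("Page %d is outside 1-%d and will be skipped.", page, num_pages)
    return sorted(pages)


print(expand_page_range(["1-3", "5", "8", "10-12"], num_pages=12))
# [1, 2, 3, 5, 8, 10, 11, 12]
print(expand_page_range([1, 3, 20], num_pages=4))
# [1, 3]  (page 20 is skipped with a warning)
```

Returning a sorted list without duplicates is a design choice of this sketch. It keeps overlapping entries such as `['1-3', 2]` from producing the same page twice.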