---
title: "CSVToDocument"
id: csvtodocument
slug: "/csvtodocument"
description: "Converts CSV files to documents."
---

# CSVToDocument

Converts CSV files to documents.

<div className="key-value-table">

|  |  |
| --- | --- |
| **Most common position in a pipeline** | Before [PreProcessors](../preprocessors.mdx), or right at the beginning of an indexing pipeline |
| **Mandatory run variables** | `sources`: A list of file paths or [ByteStream](../../concepts/data-classes.mdx#bytestream) objects |
| **Output variables** | `documents`: A list of documents |
| **API reference** | [Converters](/reference/converters-api) |
| **GitHub link** | https://github.com/deepset-ai/haystack/blob/main/haystack/components/converters/csv.py |

</div>

## Overview

`CSVToDocument` converts one or more CSV files into text documents.

The component uses UTF-8 encoding by default, but you can specify a different encoding during initialization.
You can optionally attach metadata to each document with the `meta` parameter when running the component.
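
To illustrate why the encoding setting matters, here is a minimal standard-library sketch. It is not Haystack code: `decode_csv` is a hypothetical stand-in that only mimics the idea of decoding raw CSV bytes and attaching metadata, and the sample data and `export.csv` name are made up for the example.

```python
from typing import Optional


def decode_csv(raw: bytes, encoding: str = "utf-8", meta: Optional[dict] = None) -> dict:
    """Hypothetical stand-in for a conversion step: decode bytes and attach metadata."""
    return {"content": raw.decode(encoding), "meta": dict(meta or {})}


# Sample CSV bytes written with Latin-1 instead of UTF-8 (assumed data).
raw = "name,city\nZoé,Besançon\n".encode("latin-1")

# Decoding with the UTF-8 default fails on the accented characters.
try:
    decode_csv(raw)
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Passing the matching encoding recovers the text as written.
record = decode_csv(raw, encoding="latin-1", meta={"source": "export.csv"})
print(utf8_ok)
print(record["content"])
```

If your CSV files come from a tool that does not write UTF-8, setting the matching encoding when you initialize the converter avoids this kind of decoding error.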

## Usage

### On its own

```python
from datetime import datetime

from haystack.components.converters.csv import CSVToDocument

converter = CSVToDocument()
results = converter.run(
    sources=["sample.csv"],
    meta={"date_added": datetime.now().isoformat()},
)
documents = results["documents"]

print(documents[0].content)
# 'col1,col2\nrow1,row1\nrow2,row2\n'
```

### In a pipeline

```python
from haystack import Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.converters import CSVToDocument
from haystack.components.preprocessors import DocumentCleaner
from haystack.components.preprocessors import DocumentSplitter
from haystack.components.writers import DocumentWriter

document_store = InMemoryDocumentStore()

# Build the indexing pipeline: convert, clean, split, then write.
pipeline = Pipeline()
pipeline.add_component("converter", CSVToDocument())
pipeline.add_component("cleaner", DocumentCleaner())
pipeline.add_component(
    "splitter",
    DocumentSplitter(split_by="sentence", split_length=5),
)
pipeline.add_component("writer", DocumentWriter(document_store=document_store))
pipeline.connect("converter", "cleaner")
pipeline.connect("cleaner", "splitter")
pipeline.connect("splitter", "writer")

# Placeholder list of CSV paths; replace with your own files.
file_names = ["sample.csv"]

pipeline.run({"converter": {"sources": file_names}})
```
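
The pipeline example passes a `file_names` list as `sources`. One possible way to build such a list is to collect every `.csv` file in a folder with `pathlib`. This is a sketch under assumptions: `collect_csv_paths` is a hypothetical helper, and the temporary folder and its files exist only to make the example self-contained.

```python
import tempfile
from pathlib import Path


def collect_csv_paths(folder: Path) -> list[str]:
    """Hypothetical helper: return the CSV file paths in `folder`, sorted for reproducibility."""
    return sorted(str(p) for p in folder.glob("*.csv"))


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Create a few sample files; only the .csv files should be picked up.
    (folder / "b.csv").write_text("col1,col2\n1,2\n", encoding="utf-8")
    (folder / "a.csv").write_text("col1,col2\n3,4\n", encoding="utf-8")
    (folder / "notes.txt").write_text("not a CSV", encoding="utf-8")

    file_names = collect_csv_paths(folder)
    names_only = [Path(p).name for p in file_names]

print(names_only)
```

The resulting list of path strings can then be passed to the converter as `sources`.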