document_writers_api.md
---
title: Document Writers
id: document-writers-api
description: Writes Documents to a DocumentStore.
slug: "/document-writers-api"
---

<a id="document_writer"></a>

# Module document\_writer

<a id="document_writer.DocumentWriter"></a>

## DocumentWriter

Writes documents to a DocumentStore.

### Usage example

```python
from haystack import Document
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore

docs = [
    Document(content="Python is a popular programming language"),
]
doc_store = InMemoryDocumentStore()
writer = DocumentWriter(document_store=doc_store)
writer.run(docs)
```

<a id="document_writer.DocumentWriter.__init__"></a>

#### DocumentWriter.\_\_init\_\_

```python
def __init__(document_store: DocumentStore,
             policy: DuplicatePolicy = DuplicatePolicy.NONE)
```

Create a DocumentWriter component.

**Arguments**:

- `document_store`: The instance of the document store where you want to store your documents.
- `policy`: The policy to apply when a Document with the same ID already exists in the DocumentStore.
  - `DuplicatePolicy.NONE`: Default policy, relies on the DocumentStore settings.
  - `DuplicatePolicy.SKIP`: Skips documents with the same ID and doesn't write them to the DocumentStore.
  - `DuplicatePolicy.OVERWRITE`: Overwrites documents with the same ID.
  - `DuplicatePolicy.FAIL`: Raises an error if a Document with the same ID is already in the DocumentStore.

<a id="document_writer.DocumentWriter.to_dict"></a>

#### DocumentWriter.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Returns**:

Dictionary with serialized data.

<a id="document_writer.DocumentWriter.from_dict"></a>

#### DocumentWriter.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "DocumentWriter"
```

Deserializes the component from a dictionary.

**Arguments**:

- `data`: The dictionary to deserialize from.

**Raises**:

- `DeserializationError`: If the document store is not properly specified in the serialization data or its type cannot be imported.

**Returns**:

The deserialized component.

<a id="document_writer.DocumentWriter.run"></a>

#### DocumentWriter.run

```python
@component.output_types(documents_written=int)
def run(documents: list[Document], policy: Optional[DuplicatePolicy] = None)
```

Run the DocumentWriter on the given input data.

**Arguments**:

- `documents`: A list of documents to write to the document store.
- `policy`: The policy to use when encountering duplicate documents.

**Raises**:

- `ValueError`: If the specified document store is not found.

**Returns**:

Number of documents written to the document store.

<a id="document_writer.DocumentWriter.run_async"></a>

#### DocumentWriter.run\_async

```python
@component.output_types(documents_written=int)
async def run_async(documents: list[Document],
                    policy: Optional[DuplicatePolicy] = None)
```

Asynchronously run the DocumentWriter on the given input data.

This is the asynchronous version of the `run` method. It has the same parameters and return values
but can be used with `await` in async code.

**Arguments**:

- `documents`: A list of documents to write to the document store.
- `policy`: The policy to use when encountering duplicate documents.

**Raises**:

- `ValueError`: If the specified document store is not found.
- `TypeError`: If the specified document store does not implement `write_documents_async`.

**Returns**:

Number of documents written to the document store.