---
title: "QdrantEmbeddingRetriever"
id: qdrantembeddingretriever
slug: "/qdrantembeddingretriever"
description: "An embedding-based Retriever compatible with the Qdrant Document Store."
---

# QdrantEmbeddingRetriever

An embedding-based Retriever compatible with the Qdrant Document Store.

<div className="key-value-table">

| | |
| --- | --- |
| **Most common position in a pipeline** | 1\. After a Text Embedder and before a [`PromptBuilder`](../builders/promptbuilder.mdx) in a RAG Pipeline <br /> <br />2. The last component in the semantic search pipeline <br />3. After a Text Embedder and before an [`ExtractiveReader`](../readers/extractivereader.mdx) in an extractive QA pipeline |
| **Mandatory init variables** | `document_store`: An instance of a [QdrantDocumentStore](../../document-stores/qdrant-document-store.mdx) |
| **Mandatory run variables** | `query_embedding`: A vector representing the query (a list of floats) |
| **Output variables** | `documents`: A list of documents |
| **API reference** | [Qdrant](/reference/integrations-qdrant) |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/qdrant |

</div>

## Overview

The `QdrantEmbeddingRetriever` is an embedding-based Retriever compatible with the `QdrantDocumentStore`. It compares the query and Document embeddings and fetches the Documents most relevant to the query from the `QdrantDocumentStore` based on the outcome.

When using the `QdrantEmbeddingRetriever` in your NLP system, make sure it has the query and Document embeddings available. You can add a Document Embedder to your indexing Pipeline and a Text Embedder to your query Pipeline.

In addition to the `query_embedding`, the `QdrantEmbeddingRetriever` accepts other optional parameters, including `top_k` (the maximum number of Documents to retrieve) and `filters` to narrow down the search space.
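
To make the roles of `query_embedding`, `top_k`, and `filters` concrete, here is a minimal, library-free sketch of embedding retrieval. It is a conceptual illustration only, not how Qdrant works internally: it scores every Document by brute force, whereas Qdrant relies on an HNSW index. The `retrieve` helper, the `meta_filter` argument, the dictionary-based documents, and the three-dimensional embeddings are all hypothetical and chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_embedding, documents, top_k=10, meta_filter=None):
    """Hypothetical helper: filter first, then rank by similarity and cut at top_k."""
    # Keep only documents whose metadata matches every key/value in the filter.
    candidates = [
        doc
        for doc in documents
        if meta_filter is None
        or all(doc["meta"].get(key) == value for key, value in meta_filter.items())
    ]
    # Rank the remaining documents from most to least similar to the query.
    ranked = sorted(
        candidates,
        key=lambda doc: cosine_similarity(query_embedding, doc["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy documents with hand-written embeddings (illustrative values only).
documents = [
    {"content": "languages", "embedding": [0.9, 0.1, 0.0], "meta": {"topic": "linguistics"}},
    {"content": "elephants", "embedding": [0.1, 0.9, 0.0], "meta": {"topic": "animals"}},
    {"content": "waves", "embedding": [0.0, 0.2, 0.9], "meta": {"topic": "nature"}},
]

query_embedding = [1.0, 0.0, 0.0]
top = retrieve(query_embedding, documents, top_k=2)
filtered = retrieve(query_embedding, documents, top_k=2, meta_filter={"topic": "animals"})

print([doc["content"] for doc in top])
print([doc["content"] for doc in filtered])
```

In this sketch the filter shrinks the candidate set before ranking, so a Document that is less similar to the query can still be returned when the closer ones are filtered out.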

Some parameters that affect embedding retrieval must be set when the corresponding `QdrantDocumentStore` is initialized. These include the embedding dimension (`embedding_dim`), the `similarity` function used to compare embeddings, and the HNSW configuration (`hnsw_config`).

### Installation

To start using Qdrant with Haystack, first install the package with:

```shell
pip install qdrant-haystack
```

### Usage

#### On its own

This Retriever needs the `QdrantDocumentStore` and indexed Documents to run.

```python
from haystack_integrations.components.retrievers.qdrant import QdrantEmbeddingRetriever
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore

document_store = QdrantDocumentStore(
    ":memory:",
    recreate_index=True,
    return_embedding=True,
    wait_result_from_api=True,
)
retriever = QdrantEmbeddingRetriever(document_store=document_store)

# using a fake vector to keep the example simple
retriever.run(query_embedding=[0.1] * 768)
```

#### In a Pipeline

```python
from haystack.document_stores.types import DuplicatePolicy
from haystack import Document
from haystack import Pipeline
from haystack.components.embedders import (
    SentenceTransformersTextEmbedder,
    SentenceTransformersDocumentEmbedder,
)

from haystack_integrations.components.retrievers.qdrant import QdrantEmbeddingRetriever
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore

document_store = QdrantDocumentStore(
    ":memory:",
    recreate_index=True,
    return_embedding=True,
    wait_result_from_api=True,
)

documents = [
    Document(content="There are over 7,000 languages spoken around the world today."),
    Document(
        content="Elephants have been observed to behave in a way that indicates a high level of self-awareness, such as recognizing themselves in mirrors.",
    ),
    Document(
        content="In certain parts of the world, like the Maldives, Puerto Rico, and San Diego, you can witness the phenomenon of bioluminescent waves.",
    ),
]

# Embed the Documents before writing them to the Document Store
document_embedder = SentenceTransformersDocumentEmbedder()
# Load the model explicitly; a component used outside a Pipeline may require this
document_embedder.warm_up()
documents_with_embeddings = document_embedder.run(documents)

document_store.write_documents(
    documents_with_embeddings.get("documents"),
    policy=DuplicatePolicy.OVERWRITE,
)

# The query Pipeline embeds the query text and passes the vector to the Retriever
query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder())
query_pipeline.add_component(
    "retriever",
    QdrantEmbeddingRetriever(document_store=document_store),
)
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")

query = "How many languages are there?"

result = query_pipeline.run({"text_embedder": {"text": query}})

print(result["retriever"]["documents"][0])
```