---
title: "FAISSEmbeddingRetriever"
id: faissembeddingretriever
slug: "/faissembeddingretriever"
description: "An embedding-based Retriever compatible with the FAISSDocumentStore."
---

# FAISSEmbeddingRetriever

An embedding-based Retriever compatible with the FAISSDocumentStore.

<div className="key-value-table">

| | |
| --- | --- |
| **Most common position in a pipeline** | 1. After a Text Embedder and before a [`PromptBuilder`](../builders/promptbuilder.mdx) in a RAG pipeline 2. The last component in a semantic search pipeline 3. After a Text Embedder and before an [`ExtractiveReader`](../readers/extractivereader.mdx) in an extractive QA pipeline |
| **Mandatory init variables** | `document_store`: An instance of a [`FAISSDocumentStore`](../../document-stores/faissdocumentstore.mdx) |
| **Mandatory run variables** | `query_embedding`: A vector representing the query (a list of floats) |
| **Output variables** | `documents`: A list of documents |
| **API reference** | [FAISS](/reference/integrations-faiss) |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/faiss |

</div>

## Overview

The `FAISSEmbeddingRetriever` is an embedding-based Retriever that queries a `FAISSDocumentStore`. It compares the query embedding to document embeddings stored in FAISS and returns the most similar documents.

This Retriever expects precomputed embeddings in the Document Store and a query embedding at runtime. You can generate them with a Document Embedder in your indexing pipeline and a Text Embedder in your query pipeline.

In addition to `query_embedding`, you can pass:

- `top_k`: The maximum number of documents to return.
- `filters`: Metadata filters to restrict retrieved documents.

You can also configure default filters and `filter_policy` at initialization.
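
To make the retrieval step concrete, here is a minimal, library-free sketch of what an embedding-based retriever does conceptually: it scores each stored embedding against the query embedding, optionally drops documents whose metadata does not match, and keeps the `top_k` best. The function names `cosine_similarity` and `retrieve`, the dictionary-based document layout, the exact-match filter, and the choice of cosine similarity are all assumptions made for illustration. They are not the `FAISSEmbeddingRetriever` API, and FAISS relies on optimized index structures rather than a Python loop.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_embedding, documents, top_k=10, filters=None):
    """Return up to top_k documents ranked by similarity to the query.

    `filters` is a hypothetical exact-match dict applied to each document's "meta".
    """
    # Keep only documents whose metadata matches every filter entry.
    candidates = [
        doc
        for doc in documents
        if not filters or all(doc["meta"].get(k) == v for k, v in filters.items())
    ]
    # Rank the remaining documents from most to least similar.
    ranked = sorted(
        candidates,
        key=lambda doc: cosine_similarity(query_embedding, doc["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy 3-dimensional embeddings; real embeddings come from an embedder model.
documents = [
    {"content": "languages", "embedding": [1.0, 0.0, 0.0], "meta": {"topic": "culture"}},
    {"content": "elephants", "embedding": [0.0, 1.0, 0.0], "meta": {"topic": "nature"}},
    {"content": "bioluminescence", "embedding": [0.6, 0.8, 0.0], "meta": {"topic": "nature"}},
]

query_embedding = [0.9, 0.1, 0.0]
top_two = retrieve(query_embedding, documents, top_k=2)
nature_only = retrieve(query_embedding, documents, top_k=2, filters={"topic": "nature"})

print([doc["content"] for doc in top_two])
print([doc["content"] for doc in nature_only])
```

With the real component, `top_k` and `filters` play the same roles, but filters are written in Haystack's metadata filter syntax rather than the simplified dictionary used in this sketch.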

## Usage

### On its own

```python
from haystack_integrations.document_stores.faiss import FAISSDocumentStore
from haystack_integrations.components.retrievers.faiss import FAISSEmbeddingRetriever

document_store = FAISSDocumentStore(embedding_dim=768)
retriever = FAISSEmbeddingRetriever(document_store=document_store, top_k=5)

# Example query embedding
result = retriever.run(query_embedding=[0.1] * 768)
print(result["documents"])
```

### In a pipeline

```python
from haystack import Document, Pipeline
from haystack.components.embedders import (
    SentenceTransformersDocumentEmbedder,
    SentenceTransformersTextEmbedder,
)
from haystack.document_stores.types import DuplicatePolicy
from haystack_integrations.document_stores.faiss import FAISSDocumentStore
from haystack_integrations.components.retrievers.faiss import FAISSEmbeddingRetriever

document_store = FAISSDocumentStore(embedding_dim=768)

documents = [
    Document(content="There are over 7,000 languages spoken around the world today."),
    Document(
        content="Elephants have been observed to behave in a way that indicates a high level of intelligence.",
    ),
    Document(
        content="In certain places, you can witness the phenomenon of bioluminescent waves.",
    ),
]

document_embedder = SentenceTransformersDocumentEmbedder()
documents_with_embeddings = document_embedder.run(documents)["documents"]
document_store.write_documents(
    documents_with_embeddings,
    policy=DuplicatePolicy.OVERWRITE,
)

query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder())
query_pipeline.add_component(
    "retriever",
    FAISSEmbeddingRetriever(document_store=document_store),
)
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")

query = "How many languages are there?"
result = query_pipeline.run({"text_embedder": {"text": query}})

print(result["retriever"]["documents"][0])
```