---
title: "FastembedRanker"
id: fastembedranker
slug: "/fastembedranker"
description: "Use this component to rank documents based on their similarity to the query using cross-encoder models supported by FastEmbed."
---

# FastembedRanker

Use this component to rank documents based on their similarity to the query using cross-encoder models supported by FastEmbed.

<div className="key-value-table">

|  |  |
| --- | --- |
| **Most common position in a pipeline** | In a query pipeline, after a component that returns a list of documents, such as a [Retriever](../retrievers.mdx) |
| **Mandatory run variables** | `documents`: A list of documents <br /> <br />`query`: A query string |
| **Output variables** | `documents`: A list of documents |
| **API reference** | [FastEmbed](/reference/fastembed-embedders) |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/fastembed |

</div>

## Overview

`FastembedRanker` ranks documents based on how similar they are to the query. It uses [cross-encoder models supported by FastEmbed](https://qdrant.github.io/fastembed/examples/Supported_Models/). Built on ONNX Runtime, FastEmbed runs fast on standard CPU machines.

`FastembedRanker` is most useful in query pipelines, such as a retrieval-augmented generation (RAG) pipeline or a document search pipeline, to ensure the retrieved documents are ordered by relevance. You can use it after a Retriever (such as the [`InMemoryEmbeddingRetriever`](../retrievers/inmemoryembeddingretriever.mdx)) to improve the search results. When using `FastembedRanker` with a Retriever, consider setting the Retriever's `top_k` to a small number: the Ranker then has fewer documents to process, which can make your pipeline faster.

By default, this component uses the `Xenova/ms-marco-MiniLM-L-6-v2` model, but you can switch to a different model through the `model` parameter when initializing the Ranker. For details on the initialization settings, check out the [API reference](/reference/fastembed-embedders) page.
 32  
### Compatible Models

You can find the compatible models in the [FastEmbed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).

### Installation

To start using this integration with Haystack, install the package with:

```shell
pip install fastembed-haystack
```

### Parameters

You can set the path to the cache directory where the model is stored, as well as the number of threads a single `onnxruntime` session can use:

```python
from haystack_integrations.components.rankers.fastembed import FastembedRanker

cache_dir = "/your_cacheDirectory"
ranker = FastembedRanker(
    model="Xenova/ms-marco-MiniLM-L-6-v2",
    cache_dir=cache_dir,
    threads=2,
)
```

If you want to use data-parallel encoding, set the `parallel` and `batch_size` parameters:

- If `parallel` > 1, data-parallel encoding is used. This is recommended for offline encoding of large datasets.
- If `parallel` is 0, all available cores are used.
- If `parallel` is `None`, data-parallel processing is not used; the default `onnxruntime` threading is used instead.
 63  
## Usage

### On its own

This example uses `FastembedRanker` to rank two simple documents. To run the Ranker, pass a `query`, provide the `documents`, and set the number of documents to return in the `top_k` parameter.

```python
from haystack import Document
from haystack_integrations.components.rankers.fastembed import FastembedRanker

docs = [Document(content="Paris"), Document(content="Berlin")]

ranker = FastembedRanker()
ranker.warm_up()

ranker.run(query="City in France", documents=docs, top_k=1)
```

### In a pipeline

Below is an example of a pipeline that retrieves documents from an `InMemoryDocumentStore` based on keyword search using `InMemoryBM25Retriever`. It then uses `FastembedRanker` to rank the retrieved documents by their similarity to the query. The pipeline uses the Ranker's default settings.

```python
from haystack import Document, Pipeline
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.rankers.fastembed import FastembedRanker

docs = [
    Document(content="Paris is in France"),
    Document(content="Berlin is in Germany"),
    Document(content="Lyon is in France"),
]
document_store = InMemoryDocumentStore()
document_store.write_documents(docs)

retriever = InMemoryBM25Retriever(document_store=document_store)
ranker = FastembedRanker()

document_ranker_pipeline = Pipeline()
document_ranker_pipeline.add_component(instance=retriever, name="retriever")
document_ranker_pipeline.add_component(instance=ranker, name="ranker")

document_ranker_pipeline.connect("retriever.documents", "ranker.documents")

query = "Cities in France"
res = document_ranker_pipeline.run(
    data={
        "retriever": {"query": query, "top_k": 3},
        "ranker": {"query": query, "top_k": 2},
    },
)
```