---
title: "JinaRanker"
id: jinaranker
slug: "/jinaranker"
description: "Use this component to rank documents based on their similarity to the query using Jina AI models."
---

# JinaRanker

Use this component to rank documents based on their similarity to the query using Jina AI models.

<div className="key-value-table">

|  |  |
| --- | --- |
| **Most common position in a pipeline** | In a query pipeline, after a component that returns a list of documents (such as a [Retriever](../retrievers.mdx)) |
| **Mandatory init variables** | `api_key`: The Jina API key. Can be set with `JINA_API_KEY` env var. |
| **Mandatory run variables** | `query`: A query string  <br /> <br />`documents`: A list of documents |
| **Output variables** | `documents`: A list of documents |
| **API reference** | [Jina](/reference/integrations-jina) |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/jina |

</div>

## Overview

`JinaRanker` ranks the given documents based on how similar they are to the given query. It uses Jina AI ranking models; see the full list on Jina AI’s [website](https://jina.ai/reranker/). The default model for this Ranker is `jina-reranker-v1-base-en`.

Additionally, you can use the optional `top_k` and `score_threshold` parameters with `JinaRanker`:

- The Ranker's `top_k` is the number of documents it returns (if it's the last component in the pipeline) or forwards to the next component.
- If you set the `score_threshold` for the Ranker, it only returns documents with a similarity score (computed by the Jina AI model) above this threshold.
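
To make the interaction of the two parameters concrete, here is a minimal offline sketch of the selection logic described above. It does not call the Jina API: `ScoredDoc` and `select_documents` are hypothetical names introduced for illustration, the scores are invented, and the strict comparison against the threshold is an assumption based on the word "above" (the real component may treat a score equal to the threshold differently).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScoredDoc:
    """Hypothetical stand-in for a document that already has a relevance score."""
    content: str
    score: float


def select_documents(
    docs: List[ScoredDoc],
    top_k: Optional[int] = None,
    score_threshold: Optional[float] = None,
) -> List[ScoredDoc]:
    """Illustrative only: mimic how top_k and score_threshold narrow a ranked list."""
    # Highest score first, as a ranker would return them.
    ranked = sorted(docs, key=lambda d: d.score, reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]
    if score_threshold is not None:
        # Assumption: keep only scores strictly above the threshold.
        ranked = [d for d in ranked if d.score > score_threshold]
    return ranked


# Invented scores, for illustration only.
scored = [
    ScoredDoc("Berlin is in Germany", 0.12),
    ScoredDoc("Paris is in France", 0.91),
    ScoredDoc("Lyon is in France", 0.78),
]
selected = select_documents(scored, top_k=2, score_threshold=0.8)
print([d.content for d in selected])
```

With both parameters set, the result can contain fewer than `top_k` documents, because the threshold applies in addition to the cut-off.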

### Installation

To start using this integration with Haystack, install the package with:

```shell
pip install jina-haystack
```

### Authorization

By default, the component reads the API key from the `JINA_API_KEY` environment variable. Alternatively, you can pass a Jina API key at initialization with `api_key`:

```python
from haystack.utils import Secret
ranker = JinaRanker(api_key=Secret.from_token("<your-api-key>"))
```

To get your API key, head to Jina AI’s [website](https://jina.ai/reranker/).
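
If you rely on the environment variable, it can help to check that it is set before you build the component, so that a missing key produces a readable error message. The following is a minimal sketch using only the standard library; `require_env_var` is a hypothetical helper and not part of Haystack or the Jina integration.

```python
import os


def require_env_var(name: str = "JINA_API_KEY") -> str:
    """Hypothetical helper: return the variable's value or raise a readable error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            "export it or pass api_key explicitly."
        )
    return value


# Report the status without exposing the key itself.
try:
    require_env_var()
    print("JINA_API_KEY is set")
except RuntimeError as err:
    print(err)
```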

## Usage

### On its own

You can use `JinaRanker` outside of a pipeline to order documents based on your query.

To run the Ranker, pass a query, provide the documents, and set the number of documents to return in the `top_k` parameter.

```python
from haystack import Document
from haystack_integrations.components.rankers.jina import JinaRanker

docs = [Document(content="Paris"), Document(content="Berlin")]

# Reads the API key from the JINA_API_KEY environment variable
ranker = JinaRanker()

# Returns the ranked documents under the "documents" key
ranker.run(query="City in France", documents=docs, top_k=1)
```

### In a pipeline

This is an example of a pipeline that retrieves documents from an `InMemoryDocumentStore` based on keyword search (using `InMemoryBM25Retriever`). It then uses the `JinaRanker` to rank the retrieved documents according to their similarity to the query.

```python
from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack_integrations.components.rankers.jina import JinaRanker

docs = [
    Document(content="Paris is in France"),
    Document(content="Berlin is in Germany"),
    Document(content="Lyon is in France"),
]
document_store = InMemoryDocumentStore()
document_store.write_documents(docs)

retriever = InMemoryBM25Retriever(document_store=document_store)
ranker = JinaRanker()

ranker_pipeline = Pipeline()
ranker_pipeline.add_component(instance=retriever, name="retriever")
ranker_pipeline.add_component(instance=ranker, name="ranker")

# Feed the retrieved documents into the Ranker
ranker_pipeline.connect("retriever.documents", "ranker.documents")

query = "Cities in France"
# Both components receive the same query; the Ranker keeps the top 2 of the 3 retrieved documents
ranker_pipeline.run(
    data={
        "retriever": {"query": query, "top_k": 3},
        "ranker": {"query": query, "top_k": 2},
    },
)
```
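
`Pipeline.run` returns a dictionary keyed by component name, so the ranked documents from the pipeline above are found under the `"ranker"` key. The sketch below shows one way to print them. `format_ranking` is a hypothetical helper, and the stand-in result built with `SimpleNamespace` (with invented scores) only mimics the `content` and `score` attributes of Haystack `Document` objects so that the snippet runs without an API key.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def format_ranking(result: Dict[str, Any], component: str = "ranker") -> List[str]:
    """Hypothetical helper: one line per ranked document, best match first."""
    documents = result[component]["documents"]
    return [
        f"{rank}. {doc.content} (score={doc.score:.2f})"
        for rank, doc in enumerate(documents, start=1)
    ]


# Stand-in for the value returned by ranker_pipeline.run(...); scores are invented.
fake_result = {
    "ranker": {
        "documents": [
            SimpleNamespace(content="Paris is in France", score=0.91),
            SimpleNamespace(content="Lyon is in France", score=0.78),
        ]
    }
}

for line in format_ranking(fake_result):
    print(line)
```

When you run the Ranker on its own, the documents are under the top-level `"documents"` key of the returned dictionary instead.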