---
title: "LostInTheMiddleRanker"
id: lostinthemiddleranker
slug: "/lostinthemiddleranker"
description: "This Ranker positions the most relevant documents at the beginning and at the end of the resulting list while placing the least relevant Documents in the middle."
---

# LostInTheMiddleRanker

This Ranker places the most relevant documents at the beginning and end of the resulting list and the least relevant documents in the middle.

<div className="key-value-table">

|  |  |
| --- | --- |
| **Most common position in a pipeline** | In a query pipeline, after a component that returns a list of documents (such as a [Retriever](../retrievers.mdx)) |
| **Mandatory run variables**            | `documents`: A list of documents |
| **Output variables**                   | `documents`: A list of documents |
| **API reference**                      | [Rankers](/reference/rankers-api) |
| **GitHub link**                        | https://github.com/deepset-ai/haystack/blob/main/haystack/components/rankers/lost_in_the_middle.py |

</div>

## Overview

The `LostInTheMiddleRanker` reorders documents into the "Lost in the Middle" order described in the research paper ["Lost in the Middle: How Language Models Use Long Contexts"](https://arxiv.org/abs/2307.03172). It arranges paragraphs in the LLM context so that the most relevant ones sit at the beginning or end of the input, while the least relevant ones sit in the middle. This reordering helps when very long contexts are sent to an LLM, because current models pay more attention to the start and end of long input contexts.

Unlike other Rankers, `LostInTheMiddleRanker` assumes that the input documents are already sorted by relevance, and it doesn’t require a query as input. It is typically the last component before prompt building, where it prepares the input context for the LLM.

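To make the layout concrete, here is a minimal sketch of the ordering in plain Python. It assumes the input is already sorted from most to least relevant. The helper name `lost_in_the_middle_order` is hypothetical and only illustrates the idea; it is not the component's actual implementation, which may differ in detail.

```python
from typing import List, TypeVar

T = TypeVar("T")


def lost_in_the_middle_order(ranked: List[T]) -> List[T]:
    """Reorder items that are already sorted from most to least relevant.

    Hypothetical helper for illustration only, not part of Haystack.
    """
    # Ranks 1, 3, 5, ... fill the list from the front.
    front = ranked[0::2]
    # Ranks 2, 4, 6, ... fill the list from the back, so reverse them.
    back = ranked[1::2][::-1]
    return front + back


# Items are labeled by relevance rank: 1 is the most relevant.
print(lost_in_the_middle_order([1, 2, 3, 4, 5]))  # [1, 3, 5, 4, 2]
```

In this sketch, the two most relevant items (1 and 2) end up at the two edges of the list, and the least relevant item (5) lands in the middle.
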
### Parameters

If you specify `word_count_threshold` when running the component, the Ranker keeps adding documents until one of them pushes the total word count over the threshold. That document is still included in the resulting list, but all documents after it are discarded.

You can also specify the `top_k` parameter to set the maximum number of documents to return.

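The `word_count_threshold` cutoff described in this section can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the component's code: the helper `truncate_by_word_count` is hypothetical, it works on plain strings instead of `Document` objects, it counts words by splitting on whitespace, and it treats reaching the threshold exactly the same as exceeding it.

```python
from typing import List


def truncate_by_word_count(texts: List[str], word_count_threshold: int) -> List[str]:
    """Keep texts until the running word count reaches the threshold.

    Hypothetical helper for illustration only, not part of Haystack.
    """
    kept: List[str] = []
    word_count = 0
    for text in texts:
        # Assumption: words are counted by splitting on whitespace.
        word_count += len(text.split())
        # The text that crosses the threshold is still kept ...
        kept.append(text)
        # ... but everything after it is dropped.
        # Assumption: reaching the threshold exactly also stops the loop.
        if word_count >= word_count_threshold:
            break
    return kept


texts = ["one two three", "four five six", "seven eight nine"]
print(truncate_by_word_count(texts, word_count_threshold=4))
# ['one two three', 'four five six']
```

The second text pushes the running total from 3 to 6 words, which crosses the threshold of 4, so it is kept and the third text is discarded.
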
## Usage

### On its own

```python
from haystack import Document
from haystack.components.rankers import LostInTheMiddleRanker

ranker = LostInTheMiddleRanker()

# The ranker assumes the input is already sorted from most to least relevant.
docs = [
    Document(content="Paris"),
    Document(content="Berlin"),
    Document(content="Madrid"),
]
result = ranker.run(documents=docs)

# Print the reordered documents.
for doc in result["documents"]:
    print(doc.content)
```

### In a pipeline

This example requires an OpenAI API key, set in the `OPENAI_API_KEY` environment variable.

```python
from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.rankers import LostInTheMiddleRanker
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

# Define the prompt template
prompt_template = [
    ChatMessage.from_system("You are a helpful assistant."),
    ChatMessage.from_user(
        "Given these documents, answer the question.\nDocuments:\n"
        "{% for doc in documents %}{{ doc.content }}{% endfor %}\n"
        "Question: {{query}}\nAnswer:",
    ),
]

# Define the documents
docs = [
    Document(content="Paris is in France..."),
    Document(content="Berlin is in Germany..."),
    Document(content="Lyon is in France..."),
]

# Index the documents in an in-memory store
document_store = InMemoryDocumentStore()
document_store.write_documents(docs)

# Create the components
retriever = InMemoryBM25Retriever(document_store=document_store)
ranker = LostInTheMiddleRanker(word_count_threshold=1024)
prompt_builder = ChatPromptBuilder(
    template=prompt_template,
    required_variables={"query", "documents"},
)
generator = OpenAIChatGenerator()

# Build the pipeline: retriever -> ranker -> prompt builder -> LLM
p = Pipeline()
p.add_component(instance=retriever, name="retriever")
p.add_component(instance=ranker, name="ranker")
p.add_component(instance=prompt_builder, name="prompt_builder")
p.add_component(instance=generator, name="llm")

p.connect("retriever.documents", "ranker.documents")
p.connect("ranker.documents", "prompt_builder.documents")
p.connect("prompt_builder.messages", "llm.messages")

# Run the pipeline and print the generated answer
query = "What cities are in France?"
result = p.run(
    {
        "retriever": {"query": query, "top_k": 3},
        "prompt_builder": {"query": query},
    },
)
print(result["llm"]["replies"][0].text)
```