---
title: "Ragas"
id: integrations-ragas
description: "Ragas integration for Haystack"
slug: "/integrations-ragas"
---

## haystack_integrations.components.evaluators.ragas.evaluator

### RagasEvaluator

A component that uses the Ragas framework to evaluate inputs against specified Ragas metrics.

See the [Ragas framework](https://docs.ragas.io/) for more details.

This component supports the modern Ragas metrics API (`ragas.metrics.collections`).
Each metric must be a `SimpleBaseMetric` instance with its LLM configured at construction time.

Usage example:

```python
from openai import AsyncOpenAI
from ragas.llms import llm_factory
from ragas.metrics.collections import Faithfulness
from haystack_integrations.components.evaluators.ragas import RagasEvaluator

client = AsyncOpenAI()
llm = llm_factory("gpt-4o-mini", client=client)

evaluator = RagasEvaluator(
    ragas_metrics=[Faithfulness(llm=llm)],
)
output = evaluator.run(
    query="Which is the most popular global sport?",
    documents=[
        "Football is undoubtedly the world's most popular sport with"
        " major events like the FIFA World Cup and sports personalities"
        " like Ronaldo and Messi, drawing a followership of more than 4"
        " billion people."
    ],
    reference="Football is the most popular sport with around 4 billion"
              " followers worldwide",
)

output["result"]
```

#### __init__

```python
__init__(ragas_metrics: list[SimpleBaseMetric]) -> None
```

Constructs a new Ragas evaluator.

**Parameters:**

- **ragas_metrics** (<code>list\[SimpleBaseMetric\]</code>) – A list of modern Ragas metrics from `ragas.metrics.collections`.
  Each metric must be fully configured (including its LLM) at construction time.
  Available metrics can be found in the
  [Ragas documentation](https://docs.ragas.io/en/stable/concepts/metrics/available_metrics/).

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serialize this component to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary with serialized data.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> RagasEvaluator
```

Deserialize this component from a dictionary.

Metrics are reconstructed from their stored class path and LLM/embedding
configuration. Only the `openai` provider is supported for automatic
deserialization; the API key is read from the `OPENAI_API_KEY` environment
variable at load time.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.

**Returns:**

- <code>RagasEvaluator</code> – Deserialized component.
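
Haystack components conventionally serialize to a dictionary with `type` and `init_parameters` keys. The sketch below shows a hypothetical serialized shape for orientation only; the metric sub-structure (key names such as `class_path`, `llm`, `provider`, `model`) is an assumption, not the guaranteed schema — inspect the output of `to_dict()` on a real instance for the exact format.

```python
# Hypothetical serialized form of a RagasEvaluator. The key names inside
# "ragas_metrics" are illustrative assumptions, not a documented schema.
data = {
    "type": (
        "haystack_integrations.components.evaluators."
        "ragas.evaluator.RagasEvaluator"
    ),
    "init_parameters": {
        "ragas_metrics": [
            {
                # Stored class path used to reconstruct the metric.
                "class_path": "ragas.metrics.collections.Faithfulness",
                # Only the `openai` provider is supported; the API key is
                # read from OPENAI_API_KEY at load time, not stored here.
                "llm": {"provider": "openai", "model": "gpt-4o-mini"},
            }
        ]
    },
}

# evaluator = RagasEvaluator.from_dict(data)  # requires OPENAI_API_KEY
```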

#### run

```python
run(
    query: str | None = None,
    response: list[ChatMessage] | str | None = None,
    documents: list[Document | str] | None = None,
    reference_contexts: list[str] | None = None,
    multi_responses: list[str] | None = None,
    reference: str | None = None,
    rubrics: dict[str, str] | None = None,
) -> dict[str, dict[str, MetricResult]]
```

Evaluates the provided inputs against each metric and returns the results.

**Parameters:**

- **query** (<code>str | None</code>) – The input query from the user.
- **response** (<code>list\[ChatMessage\] | str | None</code>) – The generated response, either as a list of ChatMessage objects (typically from a language model or agent) or as a plain string.
- **documents** (<code>list\[Document | str\] | None</code>) – A list of Haystack Documents or strings that were retrieved for the query.
- **reference_contexts** (<code>list\[str\] | None</code>) – A list of reference contexts that should have been retrieved for the query.
- **multi_responses** (<code>list\[str\] | None</code>) – A list of multiple responses generated for the query.
- **reference** (<code>str | None</code>) – A reference answer for the query.
- **rubrics** (<code>dict\[str, str\] | None</code>) – A dictionary of evaluation rubrics, where keys represent the score
  and values represent the corresponding evaluation criteria.

**Returns:**

- <code>dict\[str, dict\[str, MetricResult\]\]</code> – A dictionary with key `result` mapping metric names to their `MetricResult`.
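
The snippet below sketches how that returned structure might be consumed, using a stand-in dataclass in place of ragas' real `MetricResult` (whose exact fields should be checked in the Ragas documentation — a `value` attribute is assumed here):

```python
from dataclasses import dataclass


# Stand-in for ragas' MetricResult, used only to illustrate the output
# shape; the real class comes from the ragas package and may carry
# additional fields (e.g. an explanation of the score).
@dataclass
class MetricResult:
    value: float


# `run()` returns {"result": {metric_name: MetricResult, ...}}.
output = {"result": {"faithfulness": MetricResult(value=1.0)}}

# Collect a plain name -> score mapping for logging or comparison.
scores = {name: result.value for name, result in output["result"].items()}
print(scores)  # {'faithfulness': 1.0}
```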