---
title: "DeepEvalEvaluator"
id: deepevalevaluator
slug: "/deepevalevaluator"
description: "The DeepEvalEvaluator evaluates Haystack pipelines using LLM-based metrics. It supports metrics like answer relevancy, faithfulness, contextual relevance, and more."
---

# DeepEvalEvaluator

The `DeepEvalEvaluator` evaluates Haystack pipelines using LLM-based metrics. It supports metrics like answer relevancy, faithfulness, contextual relevance, and more.

<div className="key-value-table">

|  |  |
| --- | --- |
| **Most common position in a pipeline** | On its own or in an evaluation pipeline. To be used after a separate pipeline has generated the inputs for the Evaluator. |
| **Mandatory init variables** | `metric`: One of the DeepEval metrics to use for evaluation |
| **Mandatory run variables** | `**inputs`: A keyword arguments dictionary containing the expected inputs. The expected inputs change based on the metric you are evaluating. See below for more details. |
| **Output variables** | `results`: A nested list of metric results. There can be one or more results, depending on the metric. Each result is a dictionary containing:  <br /> <br />- `name` - The name of the metric  <br />- `score` - The score of the metric  <br />- `explanation` - An optional explanation of the score |
| **API reference** | [DeepEval](/reference/integrations-deepeval) |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/deepeval |

</div>

DeepEval is an evaluation framework that provides a number of LLM-based evaluation metrics. You can use the `DeepEvalEvaluator` component to evaluate a Haystack pipeline, such as a retrieval-augmented generation (RAG) pipeline, against one of the metrics provided by DeepEval.

## Supported Metrics

DeepEval supports a number of metrics, which we expose through the [`DeepEvalMetric` enumeration](/reference/integrations-deepeval#deepevalmetric). When initializing the [`DeepEvalEvaluator`](/reference/integrations-deepeval#deepevalevaluator), pass the `metric_params` expected by your chosen metric. Many metrics use OpenAI models and require you to set the `OPENAI_API_KEY` environment variable. For a complete guide to these metrics, visit the [DeepEval documentation](https://docs.confident-ai.com/docs/getting-started).

## Parameters Overview

To initialize a `DeepEvalEvaluator`, provide the following parameters:

- `metric`: A `DeepEvalMetric`.
- `metric_params`: Optionally, any additional parameters the chosen metric requires.

## Usage

To use the `DeepEvalEvaluator`, you first need to install the integration:

```bash
pip install deepeval-haystack
```

To use the `DeepEvalEvaluator`, follow these steps:

1. Initialize the `DeepEvalEvaluator` with the correct `metric_params` for the metric you are using.
2. Run the `DeepEvalEvaluator` on its own or in a pipeline, providing the expected input for the metric you are using.

### Examples

**Evaluate Faithfulness**

To create a faithfulness evaluation pipeline:

```python
from haystack import Pipeline
from haystack_integrations.components.evaluators.deepeval import (
    DeepEvalEvaluator,
    DeepEvalMetric,
)

pipeline = Pipeline()
evaluator = DeepEvalEvaluator(
    metric=DeepEvalMetric.FAITHFULNESS,
    metric_params={"model": "gpt-4"},
)
pipeline.add_component("evaluator", evaluator)
```


To run the evaluation pipeline, you should have the _expected inputs_ for the metric ready at hand. The faithfulness metric expects lists of `questions`, `contexts`, and `responses`. These should come from the results of the pipeline you want to evaluate.

 87  ```python
 88  results = pipeline.run(
 89      {
 90          "evaluator": {
 91              "questions": [
 92                  "When was the Rhodes Statue built?",
 93                  "Where is the Pyramid of Giza?",
 94              ],
 95              "contexts": [["Context for question 1"], ["Context for question 2"]],
 96              "responses": ["Response for question 1", "response for question 2"],
 97          },
 98      },
 99  )
100  ```
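
Once the pipeline has run, you can read the scores out of the evaluator's `results` output. Below is a minimal sketch of consuming that structure, using hand-written stand-in values shaped after the Output variables table above (`name`, `score`, optional `explanation`); the assumption that there is one inner list per evaluated question is illustrative, and real scores come from the LLM-backed metric, not from this sketch.

```python
# Illustrative stand-in for the dictionary returned by pipeline.run,
# shaped after the documented output: a nested list of metric results,
# each a dict with "name", "score", and an optional "explanation".
results = {
    "evaluator": {
        "results": [
            [{"name": "faithfulness", "score": 1.0, "explanation": "..."}],
            [{"name": "faithfulness", "score": 0.5, "explanation": "..."}],
        ]
    }
}

# One inner list per evaluated question; each entry is one metric result.
for question_results in results["evaluator"]["results"]:
    for metric_result in question_results:
        print(f'{metric_result["name"]}: {metric_result["score"]}')
```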

## Additional References

🧑‍🍳 Cookbook: [RAG Pipeline Evaluation Using DeepEval](https://haystack.deepset.ai/cookbook/rag_eval_deep_eval)