---
title: "DeepEvalEvaluator"
id: deepevalevaluator
slug: "/deepevalevaluator"
description: "The DeepEvalEvaluator evaluates Haystack pipelines using LLM-based metrics. It supports metrics like answer relevancy, faithfulness, contextual relevance, and more."
---

# DeepEvalEvaluator

The DeepEvalEvaluator evaluates Haystack pipelines using LLM-based metrics. It supports metrics like answer relevancy, faithfulness, contextual relevance, and more.

<div className="key-value-table">

| | |
| --- | --- |
| **Most common position in a pipeline** | On its own or in an evaluation pipeline, after a separate pipeline has generated the inputs for the Evaluator. |
| **Mandatory init variables** | `metric`: One of the DeepEval metrics to use for evaluation |
| **Mandatory run variables** | `**inputs`: A keyword arguments dictionary containing the expected inputs. The expected inputs change based on the metric you are evaluating. See below for details. |
| **Output variables** | `results`: A nested list of metric results. There can be one or more results, depending on the metric. Each result is a dictionary containing: <br /> <br />- `name` - The name of the metric <br />- `score` - The score of the metric <br />- `explanation` - An optional explanation of the score |
| **API reference** | [DeepEval](/reference/integrations-deepeval) |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/deepeval |

</div>

DeepEval is an evaluation framework that provides a number of LLM-based evaluation metrics. You can use the `DeepEvalEvaluator` component to evaluate a Haystack pipeline, such as a retrieval-augmented generation (RAG) pipeline, against one of the metrics provided by DeepEval.
## Supported Metrics

DeepEval supports a number of metrics, which we expose through the [DeepEval metric enumeration](/reference/integrations-deepeval#deepevalmetric). When initializing the [`DeepEvalEvaluator`](/reference/integrations-deepeval#deepevalevaluator) in Haystack, provide the `metric_params` expected by the metric you choose. Many metrics use OpenAI models and require you to set the `OPENAI_API_KEY` environment variable. For a complete guide on these metrics, visit the [DeepEval documentation](https://docs.confident-ai.com/docs/getting-started).

## Parameters Overview

To initialize a `DeepEvalEvaluator`, you need to provide the following parameters:

- `metric`: A `DeepEvalMetric`.
- `metric_params`: Optionally, if the metric calls for any additional parameters, provide them here.
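As an illustration of the call shape, the two init parameters can be modeled as plain Python data. This is a minimal sketch only; in real code `metric` is a `DeepEvalMetric` enum member (not a string), and the values below are taken from the faithfulness example in the Usage section:

```python
# Hedged sketch: DeepEvalEvaluator init arguments modeled as a plain dict.
# In real code this corresponds to:
#   DeepEvalEvaluator(metric=DeepEvalMetric.FAITHFULNESS,
#                     metric_params={"model": "gpt-4"})
init_args = {
    "metric": "FAITHFULNESS",             # stand-in for DeepEvalMetric.FAITHFULNESS
    "metric_params": {"model": "gpt-4"},  # extra parameters the chosen metric needs
}

# `metric_params` is optional; drop it for metrics that need no extra parameters.
print(init_args["metric"], init_args["metric_params"])
```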
## Usage

To use the `DeepEvalEvaluator`, you first need to install the integration:

```bash
pip install deepeval-haystack
```

To use the `DeepEvalEvaluator`, follow these steps:

1. Initialize the `DeepEvalEvaluator` while providing the correct `metric_params` for the metric you are using.
2. Run the `DeepEvalEvaluator`, on its own or in a pipeline, by providing the expected input for the metric you are using.

### Examples

**Evaluate Faithfulness**

To create a faithfulness evaluation pipeline:

```python
from haystack import Pipeline
from haystack_integrations.components.evaluators.deepeval import (
    DeepEvalEvaluator,
    DeepEvalMetric,
)

pipeline = Pipeline()
evaluator = DeepEvalEvaluator(
    metric=DeepEvalMetric.FAITHFULNESS,
    metric_params={"model": "gpt-4"},
)
pipeline.add_component("evaluator", evaluator)
```

To run the evaluation pipeline, you should have the _expected inputs_ for the metric ready at hand. This metric expects lists of `questions`, `contexts`, and `responses`, which should come from the results of the pipeline you want to evaluate.

```python
results = pipeline.run(
    {
        "evaluator": {
            "questions": [
                "When was the Rhodes Statue built?",
                "Where is the Pyramid of Giza?",
            ],
            "contexts": [["Context for question 1"], ["Context for question 2"]],
            "responses": ["Response for question 1", "Response for question 2"],
        },
    },
)
```

## Additional References

🧑‍🍳 Cookbook: [RAG Pipeline Evaluation Using DeepEval](https://haystack.deepset.ai/cookbook/rag_eval_deep_eval)
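The `results` output described earlier is a nested list: one inner list of metric-result dictionaries per evaluated input. A minimal sketch of unpacking it, using hypothetical sample data in the documented shape (`name`, `score`, `explanation` are the documented keys; the values below are made up for illustration):

```python
# Hypothetical sample in the documented output shape: one inner list of
# metric results per evaluated input.
results = [
    [{"name": "faithfulness", "score": 0.9, "explanation": "Claims are supported."}],
    [{"name": "faithfulness", "score": 0.6, "explanation": None}],
]

for per_input in results:
    for metric_result in per_input:
        # `explanation` is optional and may be absent or None
        note = metric_result["explanation"] or "no explanation"
        print(f'{metric_result["name"]}: {metric_result["score"]:.2f} ({note})')
```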