---
title: "DeepEval"
id: integrations-deepeval
description: "DeepEval integration for Haystack"
slug: "/integrations-deepeval"
---

<a id="haystack_integrations.components.evaluators.deepeval.evaluator"></a>

## Module haystack\_integrations.components.evaluators.deepeval.evaluator

<a id="haystack_integrations.components.evaluators.deepeval.evaluator.DeepEvalEvaluator"></a>

### DeepEvalEvaluator

A component that uses the [DeepEval framework](https://docs.confident-ai.com/docs/evaluation-introduction)
to evaluate inputs against a specific metric. Supported metrics are defined by `DeepEvalMetric`.

Usage example:
```python
from haystack_integrations.components.evaluators.deepeval import DeepEvalEvaluator, DeepEvalMetric

evaluator = DeepEvalEvaluator(
    metric=DeepEvalMetric.FAITHFULNESS,
    metric_params={"model": "gpt-4"},
)
output = evaluator.run(
    questions=["Which is the most popular global sport?"],
    contexts=[
        [
            "Football is undoubtedly the world's most popular sport with "
            "major events like the FIFA World Cup and sports personalities "
            "like Ronaldo and Messi, drawing a followership of more than 4 "
            "billion people."
        ]
    ],
    responses=["Football is the most popular sport with around 4 billion followers worldwide"],
)
print(output["results"])
```

<a id="haystack_integrations.components.evaluators.deepeval.evaluator.DeepEvalEvaluator.__init__"></a>

#### DeepEvalEvaluator.\_\_init\_\_

```python
def __init__(metric: str | DeepEvalMetric,
             metric_params: dict[str, Any] | None = None)
```

Construct a new DeepEval evaluator.

**Arguments**:

- `metric`: The metric to use for evaluation.
- `metric_params`: Parameters to pass to the metric's constructor.
Refer to the `DeepEvalMetric` class for more details
on required parameters.

<a id="haystack_integrations.components.evaluators.deepeval.evaluator.DeepEvalEvaluator.run"></a>

#### DeepEvalEvaluator.run

```python
@component.output_types(results=list[list[dict[str, Any]]])
def run(**inputs: Any) -> dict[str, Any]
```

Run the DeepEval evaluator on the provided inputs.

**Arguments**:

- `inputs`: The inputs to evaluate. These are determined by the
metric being calculated. See `DeepEvalMetric` for more
information.

**Returns**:

A dictionary with a single `results` entry that contains
a nested list of metric results. Each input can have one or more
results, depending on the metric. Each result is a dictionary
containing the following keys and values:
- `name` - The name of the metric.
- `score` - The score of the metric.
- `explanation` - An optional explanation of the score.

<a id="haystack_integrations.components.evaluators.deepeval.evaluator.DeepEvalEvaluator.to_dict"></a>

#### DeepEvalEvaluator.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Raises**:

- `DeserializationError`: If the component cannot be serialized.

**Returns**:

Dictionary with serialized data.

<a id="haystack_integrations.components.evaluators.deepeval.evaluator.DeepEvalEvaluator.from_dict"></a>

#### DeepEvalEvaluator.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "DeepEvalEvaluator"
```

Deserializes the component from a dictionary.

**Arguments**:

- `data`: Dictionary to deserialize from.

**Returns**:

Deserialized component.
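
The nested `results` structure returned by `run` can be consumed with a simple double loop. The sample data below is illustrative only (the score and explanation are made up, not real evaluator output):

```python
# Illustrative sample of the nested `results` payload returned by `run`:
# one inner list per evaluated input, one dict per metric result.
sample_output = {
    "results": [
        [
            {
                "name": "faithfulness",
                "score": 0.9,
                "explanation": "The response is supported by the context.",
            },
        ],
    ]
}

# Iterate over inputs, then over the metric results for each input.
for input_idx, metric_results in enumerate(sample_output["results"]):
    for result in metric_results:
        print(f"input {input_idx}: {result['name']} = {result['score']}")
```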

<a id="haystack_integrations.components.evaluators.deepeval.metrics"></a>

## Module haystack\_integrations.components.evaluators.deepeval.metrics

<a id="haystack_integrations.components.evaluators.deepeval.metrics.DeepEvalMetric"></a>

### DeepEvalMetric

Metrics supported by DeepEval.

All metrics require a `model` parameter, which specifies
the model to use for evaluation. Refer to the DeepEval
documentation for information on the supported models.

<a id="haystack_integrations.components.evaluators.deepeval.metrics.DeepEvalMetric.ANSWER_RELEVANCY"></a>

#### ANSWER\_RELEVANCY

Answer relevancy.\
Inputs - `questions: List[str], contexts: List[List[str]], responses: List[str]`

<a id="haystack_integrations.components.evaluators.deepeval.metrics.DeepEvalMetric.FAITHFULNESS"></a>

#### FAITHFULNESS

Faithfulness.\
Inputs - `questions: List[str], contexts: List[List[str]], responses: List[str]`

<a id="haystack_integrations.components.evaluators.deepeval.metrics.DeepEvalMetric.CONTEXTUAL_PRECISION"></a>

#### CONTEXTUAL\_PRECISION

Contextual precision.\
Inputs - `questions: List[str], contexts: List[List[str]], responses: List[str], ground_truths: List[str]`\
The ground truth is the expected response.

<a id="haystack_integrations.components.evaluators.deepeval.metrics.DeepEvalMetric.CONTEXTUAL_RECALL"></a>

#### CONTEXTUAL\_RECALL

Contextual recall.\
Inputs - `questions: List[str], contexts: List[List[str]], responses: List[str], ground_truths: List[str]`\
The ground truth is the expected response.

<a id="haystack_integrations.components.evaluators.deepeval.metrics.DeepEvalMetric.from_str"></a>

#### DeepEvalMetric.from\_str

```python
@classmethod
def from_str(cls, string: str) -> "DeepEvalMetric"
```

Create a metric type from a string.

**Arguments**:

- `string`: The string to convert.

**Returns**:

The metric.
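
Since each metric accepts a different set of inputs, it can help to check keyword arguments against the signatures listed above before calling `run`. The following sketch is a hypothetical helper, not part of the integration; the metric name strings and the `missing_inputs` function are assumptions made for illustration, with the input sets copied from the documentation above:

```python
# Hypothetical lookup table mirroring the per-metric input signatures
# documented above. The string keys are illustrative, not the actual
# DeepEvalMetric enum values.
REQUIRED_INPUTS: dict[str, set[str]] = {
    "answer_relevancy": {"questions", "contexts", "responses"},
    "faithfulness": {"questions", "contexts", "responses"},
    "contextual_precision": {"questions", "contexts", "responses", "ground_truths"},
    "contextual_recall": {"questions", "contexts", "responses", "ground_truths"},
    "contextual_relevance": {"questions", "contexts", "responses"},
}


def missing_inputs(metric: str, **inputs) -> set[str]:
    """Return the input names required by `metric` but absent from `inputs`."""
    return REQUIRED_INPUTS[metric] - inputs.keys()


# Contextual precision also needs ground truths, so this reports them missing.
print(missing_inputs("contextual_precision", questions=[], contexts=[], responses=[]))
# → {'ground_truths'}
```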