DeepEvalEx.Result (DeepEvalEx v0.1.0)
Represents the result of evaluating a test case against a metric.
Fields
:metric - Name of the metric that produced this result
:score - Numeric score from 0.0 to 1.0
:success - Whether the score meets the threshold
:reason - Explanation for the score (from LLM-based metrics)
:threshold - The threshold used for pass/fail determination
:metadata - Additional metric-specific data
:evaluation_cost - Cost of the LLM calls for this evaluation
:latency_ms - Time taken for the evaluation in milliseconds
Examples
%DeepEvalEx.Result{
metric: "Faithfulness",
score: 0.85,
success: true,
reason: "4 out of 5 claims are supported by the retrieval context.",
threshold: 0.5,
metadata: %{
claims: ["claim1", "claim2", "claim3", "claim4", "claim5"],
verdicts: [:yes, :yes, :yes, :yes, :no]
},
evaluation_cost: 0.002,
latency_ms: 1250
}
Summary
Functions
Creates a new result struct.
Checks if the result is successful (score >= threshold).
Returns a human-readable summary of the result.
Functions
Creates a new result struct.
Options
:metric - Name of the metric (required)
:score - Numeric score 0.0-1.0 (required)
:threshold - Pass/fail threshold (default: 0.5)
:reason - Explanation for the score
:metadata - Additional data
:evaluation_cost - LLM API cost
:latency_ms - Evaluation time
Examples
DeepEvalEx.Result.new(
metric: "GEval",
score: 0.8,
threshold: 0.5,
reason: "The response is accurate and relevant."
)
Checks if the result is successful (score >= threshold).
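The check itself is a single comparison. A minimal sketch, assuming the pass/fail rule is exactly `score >= threshold` as documented; the module name and the plain-map shape here are illustrative, not the library's source:

```elixir
defmodule ResultCheckSketch do
  # Sketch: a result passes when its score meets or exceeds its threshold.
  # Mirrors the documented rule `score >= threshold`; not the library implementation.
  def success?(%{score: score, threshold: threshold}) do
    score >= threshold
  end
end

ResultCheckSketch.success?(%{score: 0.85, threshold: 0.5})
# => true
```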
Returns a human-readable summary of the result.
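A sketch of what such a summary might look like, built from the documented fields; the module name and the exact output format are assumptions for illustration, not the library's actual formatting:

```elixir
defmodule ResultSummarySketch do
  # Sketch: format a one-line, human-readable summary from the documented
  # result fields (:metric, :score, :success, :threshold). Illustrative only.
  def summary(%{metric: metric, score: score, threshold: threshold} = result) do
    status = if result[:success], do: "PASS", else: "FAIL"
    "#{metric}: #{status} (score #{score}, threshold #{threshold})"
  end
end

ResultSummarySketch.summary(%{
  metric: "Faithfulness",
  score: 0.85,
  success: true,
  threshold: 0.5
})
# => "Faithfulness: PASS (score 0.85, threshold 0.5)"
```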