DeepEvalEx.TestCase (DeepEvalEx v0.1.0)
Represents a test case for LLM evaluation.
A test case contains the input, actual output, and optional context for evaluating LLM responses.
Fields
:input - The input prompt sent to the LLM (required)
:actual_output - The LLM's response to evaluate (required for most metrics)
:expected_output - The expected/ground truth output (optional)
:retrieval_context - List of retrieved context chunks for RAG evaluation
:context - Alias for :retrieval_context (for compatibility)
:tools_called - List of tool calls made by the LLM
:expected_tools - Expected tool calls for tool use evaluation
:metadata - Additional metadata for the test case
Examples
# Basic test case
test_case = %DeepEvalEx.TestCase{
  input: "What is the capital of France?",
  actual_output: "The capital of France is Paris."
}

# RAG evaluation test case
test_case = %DeepEvalEx.TestCase{
  input: "What are the benefits of exercise?",
  actual_output: "Exercise improves cardiovascular health and mood.",
  retrieval_context: [
    "Regular exercise strengthens the heart and improves circulation.",
    "Physical activity releases endorphins, improving mental well-being."
  ]
}

# With expected output
test_case = %DeepEvalEx.TestCase{
  input: "Summarize: The quick brown fox jumps over the lazy dog.",
  actual_output: "A fox jumped over a dog.",
  expected_output: "A fox leaps over a resting dog."
}
Types
@type t() :: %DeepEvalEx.TestCase{
  actual_output: String.t() | nil,
  context: [String.t()] | nil,
  expected_output: String.t() | nil,
  expected_tools: [DeepEvalEx.Schemas.ToolCall.t()] | nil,
  input: String.t(),
  metadata: map() | nil,
  name: String.t() | nil,
  retrieval_context: [String.t()] | nil,
  tags: [String.t()] | nil,
  tools_called: [DeepEvalEx.Schemas.ToolCall.t()] | nil
}
Functions
Returns the effective retrieval context, preferring :retrieval_context over :context.
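The preference rule described above can be sketched in a few lines. This is an illustration only, not the library's implementation: `ContextSketch` and `effective_context/1` are hypothetical names, and the sketch assumes that "unset" means `nil` (whether the library also treats an empty list as unset is not stated here).

```elixir
defmodule ContextSketch do
  # Hypothetical stand-in for DeepEvalEx.TestCase, reduced to the fields involved.
  defstruct [:input, :retrieval_context, :context]

  # Hypothetical helper name. Falls back to :context only when
  # :retrieval_context is nil (assumption: nil means "not provided").
  def effective_context(%__MODULE__{retrieval_context: nil, context: context}), do: context
  def effective_context(%__MODULE__{retrieval_context: chunks}), do: chunks
end

# struct!/2 builds the struct at runtime, so this also runs as a single-file script.
both = struct!(ContextSketch, input: "q", retrieval_context: ["primary"], context: ["alias"])
alias_only = struct!(ContextSketch, input: "q", context: ["alias"])

IO.inspect(ContextSketch.effective_context(both))       # prints ["primary"]
IO.inspect(ContextSketch.effective_context(alias_only)) # prints ["alias"]
```

Matching on `retrieval_context: nil` in the first clause keeps the fallback explicit: `:context` is consulted only when the primary field is absent.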
@spec new(keyword() | map()) :: {:ok, t()} | {:error, Ecto.Changeset.t()}
Creates a new test case struct.
Options
:input - The input prompt (required)
:actual_output - The LLM's response
:expected_output - Expected output for comparison
:retrieval_context - List of retrieved context strings
:context - Alias for :retrieval_context
:tools_called - List of tool calls made
:expected_tools - Expected tool calls
:metadata - Additional metadata map
Creates a new test case struct, raising on error.
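The two constructors follow the common Elixir pairing of a function that returns a tagged tuple and a variant that raises. The sketch below shows that pattern in a self-contained form. It is not the library's code: `NewSketch` is a hypothetical module, the name `new!/1` is assumed from convention, and errors are a plain keyword list here, whereas the documented `new/1` returns an `Ecto.Changeset` on failure.

```elixir
defmodule NewSketch do
  # Hypothetical stand-in struct; the real DeepEvalEx.TestCase has more fields.
  defstruct [:input, :actual_output, :expected_output]

  # Accepts a keyword list or a map, mirroring the documented spec of new/1.
  def new(attrs) when is_list(attrs), do: attrs |> Map.new() |> new()

  def new(%{} = attrs) do
    case Map.get(attrs, :input) do
      # :input is the only required field in this sketch.
      input when is_binary(input) and input != "" ->
        {:ok, struct(__MODULE__, attrs)}

      # Assumption: a keyword list of errors instead of an Ecto.Changeset.
      _ ->
        {:error, [input: "can't be blank"]}
    end
  end

  # Raising variant: unwraps {:ok, _} and raises ArgumentError otherwise.
  def new!(attrs) do
    case new(attrs) do
      {:ok, test_case} -> test_case
      {:error, errors} -> raise ArgumentError, "invalid test case: #{inspect(errors)}"
    end
  end
end

{:ok, test_case} = NewSketch.new(input: "What is the capital of France?", actual_output: "Paris")
IO.inspect(test_case.actual_output)

# A missing :input yields an error tuple rather than an exception.
IO.inspect(NewSketch.new(%{actual_output: "Paris"}))
```

The tuple-returning form suits code that handles invalid input, such as loading test cases from a file, while the raising form keeps hand-written fixtures short.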
Validates that the test case has the required parameters for a given metric.
Handles aliases:
:context and :retrieval_context are interchangeable
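One way to picture alias-aware validation is a lookup that treats the two context fields as the same requirement. The sketch below is hypothetical throughout: `ValidateSketch`, `validate/2`, the list-of-required-fields argument, and the `{:error, {:missing_params, ...}}` return shape are all assumptions made for illustration, and plain maps stand in for the struct.

```elixir
defmodule ValidateSketch do
  # Hypothetical signature: a test case (map or struct) plus the field names
  # a metric requires. Returns :ok or the list of missing fields.
  def validate(test_case, required) do
    case Enum.filter(required, fn key -> is_nil(fetch(test_case, key)) end) do
      [] -> :ok
      missing -> {:error, {:missing_params, missing}}
    end
  end

  # :context and :retrieval_context are interchangeable: either one satisfies
  # a requirement for the other.
  defp fetch(test_case, key) when key in [:context, :retrieval_context] do
    Map.get(test_case, :retrieval_context) || Map.get(test_case, :context)
  end

  defp fetch(test_case, key), do: Map.get(test_case, key)
end

rag_case = %{input: "q", actual_output: "a", context: ["chunk"]}

# The metric asks for :retrieval_context, and the :context alias satisfies it.
IO.inspect(ValidateSketch.validate(rag_case, [:input, :actual_output, :retrieval_context]))

# :actual_output is absent, so it is reported as missing.
IO.inspect(ValidateSketch.validate(%{input: "q"}, [:input, :actual_output]))
```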