Tribunal.TestCase (Tribunal v1.3.6)


Represents a single evaluation test case.

Fields

  • input - The user query/prompt (required)
  • actual_output - The LLM response to evaluate (required for evaluation)
  • expected_output - Golden/ideal answer for comparison (optional)
  • context - Ground truth context for faithfulness checks (optional)
  • retrieval_context - Actual retrieved docs from RAG (optional)
  • metadata - Additional info like latency, tokens, cost (optional)

Example

test_case = %Tribunal.TestCase{
  input: "What's the return policy?",
  actual_output: "You can return items within 30 days.",
  context: ["Returns accepted within 30 days with receipt."],
  expected_output: "Items can be returned within 30 days with a receipt."
}

Summary

Functions

new(attrs)
Creates a new test case from a map or keyword list.

with_metadata(test_case, metadata)
Adds metadata (latency, tokens, cost, etc.).

with_output(test_case, output)
Sets the actual output on an existing test case. Useful when the dataset provides input/context but output comes from your LLM.

with_retrieval_context(test_case, context)
Sets the retrieval context from your RAG pipeline.

Types

t()

@type t() :: %Tribunal.TestCase{
  actual_output: String.t() | nil,
  context: [String.t()] | String.t() | nil,
  expected_output: String.t() | nil,
  input: String.t(),
  metadata: map() | nil,
  retrieval_context: [String.t()] | nil
}

Functions

new(attrs)

Creates a new test case from a map or keyword list.

Examples

Tribunal.TestCase.new(input: "Hello", actual_output: "Hi there!")
Tribunal.TestCase.new(%{"input" => "Hello", "actual_output" => "Hi!"})

with_metadata(test_case, metadata)

Adds metadata (latency, tokens, cost, etc.).
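A sketch of attaching metadata to an existing test case. The metadata keys below (latency_ms, tokens, cost) are illustrative — the field is typed as a plain map(), so any keys should work:

```elixir
test_case = %Tribunal.TestCase{
  input: "What's the return policy?",
  actual_output: "You can return items within 30 days."
}

# Illustrative keys; the metadata field accepts any map.
test_case =
  Tribunal.TestCase.with_metadata(test_case, %{
    latency_ms: 412,
    tokens: 18,
    cost: 0.0003
  })
```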

with_output(test_case, output)

Sets the actual output on an existing test case. Useful when the dataset provides input/context but output comes from your LLM.
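For example, when a golden dataset supplies the input and context and the output is generated at evaluation time. `my_llm_call/1` below is a placeholder for your own generation function, not part of Tribunal:

```elixir
# The dataset supplies input and context; the output is generated at eval time.
test_case = %Tribunal.TestCase{
  input: "What's the return policy?",
  context: ["Returns accepted within 30 days with receipt."]
}

output = my_llm_call(test_case.input)  # placeholder for your LLM call
test_case = Tribunal.TestCase.with_output(test_case, output)
```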

with_retrieval_context(test_case, context)

Sets the retrieval context from your RAG pipeline.
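A sketch of recording what a RAG pipeline actually retrieved, so it can be compared against the ground-truth context. `retrieve_docs/1` is a placeholder for your own retriever and is assumed to return a list of strings, matching the [String.t()] type of retrieval_context:

```elixir
test_case = %Tribunal.TestCase{input: "What's the return policy?"}

# Documents actually returned by the RAG pipeline for this query.
docs = retrieve_docs(test_case.input)  # placeholder; returns a list of strings
test_case = Tribunal.TestCase.with_retrieval_context(test_case, docs)
```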