Nous.Eval.Result (nous v0.13.3)
Result of a single test case evaluation.
Contains the actual output, evaluation score, metrics, and any errors.
Fields
:test_case_id - ID of the test case
:test_case_name - Display name of the test case
:passed - Whether the test passed
:score - Numeric score (0.0 to 1.0)
:actual_output - The output from the agent
:expected_output - The expected output
:evaluation_details - Details from the evaluator
:metrics - Collected metrics (tokens, latency, etc.)
:error - Error if the test failed to run
:duration_ms - Total test duration in milliseconds
:run_at - When the test was run
Summary
Functions
Create an error result (test failed to run).
Create a failed result.
Check if the result has an error (test didn't complete).
Create a successful result.
Types
@type t() :: %Nous.Eval.Result{
  actual_output: term(),
  agent_result: map() | nil,
  duration_ms: non_neg_integer(),
  error: term() | nil,
  evaluation_details: map(),
  expected_output: term(),
  metrics: Nous.Eval.Metrics.t() | nil,
  passed: boolean(),
  run_at: DateTime.t(),
  score: float(),
  test_case_id: String.t(),
  test_case_name: String.t()
}
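To illustrate how a struct of this shape might be consumed, here is a minimal sketch. It does not call the nous library: ResultSketch is a hypothetical stand-in module that only copies the field names from the type above, and status/1 and tally/1 are illustrative helpers, not functions of Nous.Eval.Result. Checking :error before :passed is an assumption based on the field description ("Error if the test failed to run").

```elixir
# Hypothetical stand-in mirroring the documented fields of Nous.Eval.Result.
# Defined here only so the sketch runs without the nous dependency.
defmodule ResultSketch do
  defstruct test_case_id: nil,
            test_case_name: nil,
            passed: false,
            score: 0.0,
            actual_output: nil,
            expected_output: nil,
            evaluation_details: %{},
            metrics: nil,
            error: nil,
            duration_ms: 0,
            run_at: nil,
            agent_result: nil

  # Classify a result by pattern matching on its fields.
  # Assumption: a non-nil :error means the test never completed,
  # so that clause is checked before :passed.
  def status(%__MODULE__{error: error}) when not is_nil(error), do: :error
  def status(%__MODULE__{passed: true}), do: :passed
  def status(%__MODULE__{passed: false}), do: :failed

  # Count how many results fall into each status.
  def tally(results), do: Enum.frequencies_by(results, &status/1)
end

# Sample data invented for illustration only.
results = [
  struct!(ResultSketch, test_case_id: "t1", passed: true, score: 1.0, duration_ms: 120),
  struct!(ResultSketch, test_case_id: "t2", passed: false, score: 0.4, duration_ms: 95),
  struct!(ResultSketch, test_case_id: "t3", error: :timeout, duration_ms: 30_000)
]

IO.inspect(ResultSketch.tally(results))
```

Matching on the struct rather than reading fields one by one keeps the "did not run" case separate from the "ran and failed" case, which is the distinction the :error and :passed fields appear to encode.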