Dsxir.EvaluationResult
(dsxir v0.1.0)
Result of a single Dsxir.Evaluate.run/2 invocation.
- :score is avg(metric_value) * 100, rounded to 1 decimal place.
- :results holds one row per devset entry, in input order. Successful rows carry metric: float(); errored rows carry metric: nil, error: %Exception{}.
- :errors is %{count: non_neg_integer(), by_class: %{atom() => non_neg_integer()}}. The :by_class keys are the Splode error class atoms (:adapter, :lm, :invalid, :halted, :framework, :unknown).
Subscribers branch on nil vs. populated; the :errors map is always
present, even when zero errors occurred (it is then %{count: 0, by_class: %{}}).
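As an illustration of consuming a result, here is a minimal sketch that separates successful rows from errored ones by checking the :error field. It uses plain maps shaped like the documented fields (only :results and :errors are filled in) rather than a real %Dsxir.EvaluationResult{}. The ResultSketch module and its function names are hypothetical and not part of Dsxir.

```elixir
defmodule ResultSketch do
  # Hypothetical helper, not part of Dsxir.
  # Splits rows into {successful, errored} based on whether :error is nil.
  def split_rows(%{results: rows}) do
    Enum.split_with(rows, fn row -> is_nil(row.error) end)
  end

  # True when the always-present :errors map reports zero errors.
  def clean?(%{errors: %{count: count}}), do: count == 0
end

# Hand-built stand-in for a result; the atoms used for :example and
# :prediction are placeholders for Dsxir.Example and Dsxir.Prediction values.
result = %{
  results: [
    %{example: :ex1, prediction: :pred1, metric: 1.0, error: nil},
    %{example: :ex2, prediction: nil, metric: nil, error: %RuntimeError{message: "boom"}}
  ],
  errors: %{count: 1, by_class: %{unknown: 1}}
}

{ok_rows, error_rows} = ResultSketch.split_rows(result)
IO.inspect(length(ok_rows), label: "ok rows")
IO.inspect(length(error_rows), label: "errored rows")
IO.inspect(ResultSketch.clean?(result), label: "clean?")
```

Because :errors is always present, clean?/1 can match on count directly without guarding against a missing key.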
Types
@type row() :: %{
  example: Dsxir.Example.t(),
  prediction: nil | Dsxir.Prediction.t(),
  metric: nil | float(),
  error: nil | Exception.t()
}
@type t() :: %Dsxir.EvaluationResult{
  errors: %{
    count: non_neg_integer(),
    by_class: %{required(atom()) => non_neg_integer()}
  },
  results: [row()],
  score: float()
}
Functions
@spec error_row(Dsxir.Example.t(), Exception.t()) :: row()
Build an errored row, attaching the caught exception in place of a prediction.
@spec ok_row(Dsxir.Example.t(), Dsxir.Prediction.t(), float()) :: row()
Build a successful row with its example, prediction, and numeric metric.
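The two row shapes implied by the row() type can be sketched as follows. This is an assumption-based illustration, not the library's implementation: the RowSketch module is hypothetical, and the atoms passed as example and prediction stand in for real Dsxir.Example and Dsxir.Prediction values.

```elixir
defmodule RowSketch do
  # Hypothetical stand-ins mirroring the documented specs; not Dsxir code.

  # Successful row: prediction and metric are set, :error is nil.
  def ok_row(example, prediction, metric) when is_float(metric) do
    %{example: example, prediction: prediction, metric: metric, error: nil}
  end

  # Errored row: the exception is stored, prediction and metric are nil.
  def error_row(example, exception) when is_exception(exception) do
    %{example: example, prediction: nil, metric: nil, error: exception}
  end
end

ok = RowSketch.ok_row(:example_1, :prediction_1, 0.75)
failed = RowSketch.error_row(:example_2, %RuntimeError{message: "timeout"})

IO.inspect(ok)
IO.inspect(failed)
```

Both rows share the same four keys, so downstream code can rely on every key being present and test values against nil.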
Aggregate values into a 0..100 score by averaging and scaling, rounded to
one decimal place. An empty list returns 0.0.
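The aggregation described above (average, scale by 100, round to one decimal place, 0.0 for an empty list) can be sketched as below. The ScoreSketch module and its score/1 function are hypothetical names, and the sketch assumes the input is a list of numeric metric values with no nil entries; how the library treats errored rows is not stated here.

```elixir
defmodule ScoreSketch do
  # Hypothetical helper illustrating the documented arithmetic; not Dsxir code.

  # An empty list yields 0.0, as documented.
  def score([]), do: 0.0

  # Average the values, scale to 0..100, round to one decimal place.
  def score(values) when is_list(values) do
    Float.round(Enum.sum(values) / length(values) * 100, 1)
  end
end

IO.inspect(ScoreSketch.score([1.0, 1.0, 0.0]))
IO.inspect(ScoreSketch.score([]))
```

For [1.0, 1.0, 0.0] the mean is 2/3, which scales to roughly 66.67 and rounds to 66.7.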