Dsxir.EvaluationResult (dsxir v0.1.0)

Result of a single Dsxir.Evaluate.run/2 invocation.

  • :score is avg(metric_value) * 100, rounded to 1 decimal place.
  • :results holds one row per devset entry, in input order. Successful rows carry metric: float(); errored rows carry metric: nil and the caught exception under error:.
  • :errors has the shape %{count: non_neg_integer(), by_class: %{atom() => non_neg_integer()}}. The :by_class keys are the Splode error class atoms (:adapter, :lm, :invalid, :halted, :framework, :unknown).

Subscribers branch on nil vs. populated; the :errors map is always present, even when no errors occurred (in that case count: 0 and by_class: %{}).
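
Because :errors is always present, a consumer can pattern-match on errors.count directly instead of guarding against nil. The sketch below illustrates this with a hypothetical ResultSummary module (not part of dsxir) and a plain map standing in for a %Dsxir.EvaluationResult{}; the field values are made up and the rows are omitted for brevity.

```elixir
defmodule ResultSummary do
  # Hypothetical helper, not part of dsxir. Accepts any map or struct that has
  # the :score and :errors fields described above.

  # No errors: :errors is still a map, with count: 0.
  def summarize(%{score: score, errors: %{count: 0}}) do
    "score #{score}, no errors"
  end

  # One or more errors: list the per-class counts in a stable (sorted) order.
  def summarize(%{score: score, errors: %{count: count, by_class: by_class}}) do
    classes =
      by_class
      |> Enum.sort()
      |> Enum.map_join(", ", fn {class, n} -> "#{class}=#{n}" end)

    "score #{score}, #{count} errored (#{classes})"
  end
end

# Plain map standing in for a result struct; values are illustrative only.
result = %{score: 33.3, results: [], errors: %{count: 2, by_class: %{lm: 1, adapter: 1}}}

IO.puts(ResultSummary.summarize(result))
# prints "score 33.3, 2 errored (adapter=1, lm=1)"
```

The same clauses match a real struct, since Elixir structs are maps and the patterns only name the fields they need.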

Types

row()

@type row() :: %{
  example: Dsxir.Example.t(),
  prediction: nil | Dsxir.Prediction.t(),
  metric: nil | float(),
  error: nil | Exception.t()
}
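
Since a row is a plain map in which :error is nil on success, rows can be separated with ordinary Enum functions. The following is a minimal sketch under that assumption; RowSplit is a hypothetical module, and the atoms used for :example and :prediction are stand-ins for real Dsxir.Example and Dsxir.Prediction structs.

```elixir
defmodule RowSplit do
  # Hypothetical helper, not part of dsxir.
  # Returns {ok_rows, errored_rows}; a row is treated as errored when its
  # :error field is not nil. Input order is preserved within each list.
  def split(rows) do
    Enum.split_with(rows, fn row -> is_nil(row.error) end)
  end
end

# Stand-in rows; :ex1, :pred1 and :ex2 replace real structs for illustration.
rows = [
  %{example: :ex1, prediction: :pred1, metric: 1.0, error: nil},
  %{example: :ex2, prediction: nil, metric: nil, error: %RuntimeError{message: "boom"}}
]

{ok_rows, errored_rows} = RowSplit.split(rows)

IO.inspect(length(ok_rows), label: "ok")
IO.inspect(Enum.map(errored_rows, &Exception.message(&1.error)), label: "errors")
```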

t()

@type t() :: %Dsxir.EvaluationResult{
  errors: %{
    count: non_neg_integer(),
    by_class: %{required(atom()) => non_neg_integer()}
  },
  results: [row()],
  score: float()
}

Functions

error_row(ex, err)

@spec error_row(Dsxir.Example.t(), Exception.t()) :: row()

Build an errored row, attaching the caught exception in place of a prediction.

ok_row(ex, pred, metric)

@spec ok_row(Dsxir.Example.t(), Dsxir.Prediction.t(), float()) :: row()

Build a successful row with its example, prediction, and numeric metric.

score_from(values)

@spec score_from([float()]) :: float()

Aggregate values into a 0..100 score by averaging and scaling, rounded to one decimal place. An empty list returns 0.0.
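
The arithmetic described here can be sketched as a standalone function. This is an illustrative re-implementation based only on the description above, not the library's source: ScoreSketch is a hypothetical module, and the use of Float.round/2 for the rounding step is an assumption.

```elixir
defmodule ScoreSketch do
  # Hypothetical re-implementation for illustration; not dsxir's own code.

  # An empty list yields 0.0, as documented.
  def score_from([]), do: 0.0

  # Average the values, scale to 0..100, and round to one decimal place.
  def score_from(values) do
    mean = Enum.sum(values) / length(values)
    Float.round(mean * 100, 1)
  end
end

IO.inspect(ScoreSketch.score_from([1.0, 0.0, 1.0]))
# prints 66.7
IO.inspect(ScoreSketch.score_from([0.5, 1.0]))
# prints 75.0
IO.inspect(ScoreSketch.score_from([]))
# prints 0.0
```

Whether errored rows contribute a value to the list passed in is not stated on this page, so the sketch simply averages whatever it is given.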