Anvil.Agreement (Anvil v0.1.1)

Inter-rater agreement metrics for measuring labeler consistency.

Automatically selects the appropriate metric (:cohen, :fleiss, or :krippendorff) based on the data, unless a specific metric is forced via options.

Summary

Functions

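compute(labels, opts \\ [])

Computes an agreement metric, automatically selecting the appropriate algorithm.

compute_all_dimensions(labels, schema, opts \\ [])

Computes agreement for all dimensions in the schema.

compute_for_field(labels, field_name, opts \\ [])

Computes agreement for a specific dimension/field.

recompute_all(queue_id, opts \\ [])

Batch recomputes agreement for all samples in a queue.

summary(labels, schema, opts \\ [])

Returns a comprehensive agreement summary with per-dimension breakdown.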

Functions

compute(labels, opts \\ [])

@spec compute(
  [Anvil.Label.t()],
  keyword()
) :: {:ok, float()} | {:error, term()}

Computes an agreement metric for the given labels, automatically selecting the appropriate algorithm unless :metric is set.

Options

  • :metric - Forces a specific metric (:cohen, :fleiss, or :krippendorff) instead of automatic selection
  • :field - Name of the field to compute agreement for (default: all fields)
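
For readers unfamiliar with the metrics named by :metric, the following is a minimal, self-contained sketch of Cohen's kappa for two raters. It is an illustration only, not Anvil's implementation: the module name KappaSketch, the function cohen/2, and the :undefined error atom are all hypothetical, and the inputs are plain lists of ratings rather than Anvil.Label structs.

```elixir
defmodule KappaSketch do
  # Hypothetical helper, not part of Anvil.
  # ratings_a and ratings_b hold each rater's label for the same samples, in the same order.
  def cohen(ratings_a, ratings_b)
      when length(ratings_a) == length(ratings_b) and ratings_a != [] do
    n = length(ratings_a)

    # Observed agreement: share of samples where both raters chose the same category.
    agreements =
      ratings_a
      |> Enum.zip(ratings_b)
      |> Enum.count(fn {a, b} -> a == b end)

    p_o = agreements / n

    # Expected chance agreement, from each rater's marginal category frequencies.
    freq_a = Enum.frequencies(ratings_a)
    freq_b = Enum.frequencies(ratings_b)

    p_e =
      freq_a
      |> Enum.map(fn {category, count_a} ->
        count_a / n * (Map.get(freq_b, category, 0) / n)
      end)
      |> Enum.sum()

    # Kappa is undefined when chance agreement is already total
    # (both raters used a single, identical category).
    if p_e == 1.0 do
      {:error, :undefined}
    else
      {:ok, (p_o - p_e) / (1 - p_e)}
    end
  end
end

# Raters agree on 3 of 4 samples; chance agreement is 0.5.
KappaSketch.cohen([1, 1, 2, 2], [1, 1, 2, 1])
# => {:ok, 0.5}
```

Cohen's kappa applies to exactly two raters. Fleiss' kappa and Krippendorff's alpha generalize to more raters, which is presumably why the module selects among them based on the data.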

compute_all_dimensions(labels, schema, opts \\ [])

@spec compute_all_dimensions([map()], map(), keyword()) :: map()

Computes agreement for all dimensions in the schema.

Returns a map keyed by dimension, where each value is that dimension's agreement result (for example {:ok, 0.72}).

Examples

iex> labels = [...]
iex> schema = %{fields: ["coherence", "grounded", "balance"]}
iex> Agreement.compute_all_dimensions(labels, schema)
%{
  coherence: {:ok, 0.72},
  grounded: {:ok, 0.85},
  balance: {:ok, 0.45}
}
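
A per-dimension map like the one above is often used to flag dimensions where labelers disagree. The sketch below assumes only the result shape shown in the example; the module DimensionReport, the function below_threshold/2, and the 0.6 cutoff are hypothetical and not part of Anvil.

```elixir
defmodule DimensionReport do
  # Hypothetical helper, not part of Anvil.
  # scores: a map shaped like the example result, e.g. %{coherence: {:ok, 0.72}}.
  # threshold: a cutoff chosen by the caller.
  def below_threshold(scores, threshold) do
    # Entries whose value does not match {:ok, score} are skipped by the generator pattern.
    for {dimension, {:ok, score}} <- scores, score < threshold, do: dimension
  end
end

scores = %{coherence: {:ok, 0.72}, grounded: {:ok, 0.85}, balance: {:ok, 0.45}}
DimensionReport.below_threshold(scores, 0.6)
# => [:balance]
```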

compute_for_field(labels, field_name, opts \\ [])

@spec compute_for_field([map()], String.t(), keyword()) ::
  {:ok, float()} | {:error, term()}

Computes agreement for a specific dimension/field.

Examples

iex> labels = [
...>   %{labeler_id: "l1", values: %{"coherence" => 4, "grounded" => 3}},
...>   %{labeler_id: "l2", values: %{"coherence" => 4, "grounded" => 5}}
...> ]
iex> Agreement.compute_for_field(labels, "coherence")
{:ok, 1.0}  # Perfect agreement on coherence
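
To make the input shape concrete, the sketch below pulls one field's value per labeler out of label maps shaped like those in the example. It only illustrates the data layout; the module FieldValues and the function by_labeler/2 are hypothetical, and this is not how Anvil computes the score.

```elixir
defmodule FieldValues do
  # Hypothetical helper, not part of Anvil.
  # Collects one field's value per labeler; labels that lack the field are skipped.
  def by_labeler(labels, field_name) do
    for %{labeler_id: id, values: %{^field_name => value}} <- labels,
        into: %{},
        do: {id, value}
  end
end

labels = [
  %{labeler_id: "l1", values: %{"coherence" => 4, "grounded" => 3}},
  %{labeler_id: "l2", values: %{"coherence" => 4, "grounded" => 5}}
]

FieldValues.by_labeler(labels, "coherence")
# => %{"l1" => 4, "l2" => 4}
```

Both labelers gave the same "coherence" value, while their "grounded" values (3 and 5) differ.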

recompute_all(queue_id, opts \\ [])

@spec recompute_all(
  binary(),
  keyword()
) :: {:ok, map()} | {:error, term()}

Batch recomputes agreement for all samples in a queue.

This is useful for full recalculation after schema migrations or data changes.

Options

  • :batch_size - Number of samples to process per batch (default: 100)
  • :metric - Force a specific metric for all computations
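
Since recompute_all/2 returns either {:ok, map} or {:error, term}, callers will usually pattern match on the result. The sketch below handles both shapes; the module RecomputeReport, the function describe/1, and the :timeout reason are hypothetical, and because the keys of the result map are not documented here, the map is only inspected rather than destructured.

```elixir
defmodule RecomputeReport do
  # Hypothetical helper, not part of Anvil.
  # Turns the two documented return shapes into a log-friendly string.
  def describe({:ok, result}) when is_map(result) do
    "recompute finished: " <> inspect(result)
  end

  def describe({:error, reason}) do
    "recompute failed: " <> inspect(reason)
  end
end

# With Anvil available and a real queue id bound to queue_id, the call might look like:
#   queue_id
#   |> Anvil.Agreement.recompute_all(batch_size: 50, metric: :krippendorff)
#   |> RecomputeReport.describe()

# :timeout is a made-up reason, used only to exercise the error branch.
RecomputeReport.describe({:error, :timeout})
# => "recompute failed: :timeout"
```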

summary(labels, schema, opts \\ [])

@spec summary([map()], map(), keyword()) :: map()

Returns a comprehensive agreement summary with per-dimension breakdown.

Examples

iex> labels = [...]
iex> schema = %{fields: ["coherence", "grounded"]}
iex> Agreement.summary(labels, schema)
%{
  overall: {:ok, 0.78},
  by_dimension: %{
    coherence: {:ok, 0.72},
    grounded: {:ok, 0.85}
  },
  sample_count: 50,
  labeler_count: 3
}