Anvil.Agreement (Anvil v0.1.1)

Inter-rater agreement metrics for measuring labeler consistency.

Automatically selects the appropriate metric based on the data.

Summary

Functions

compute(labels, opts \\ [])

Computes an agreement metric, automatically selecting the appropriate algorithm.

compute_all_dimensions(labels, schema, opts \\ [])

Computes agreement for all dimensions in the schema.

compute_for_field(labels, field_name, opts \\ [])

Computes agreement for a specific dimension/field.

recompute_all(queue_id, opts \\ [])

Batch recomputes agreement for all samples in a queue.

summary(labels, schema, opts \\ [])

Returns a comprehensive agreement summary with per-dimension breakdown.

Functions

compute(labels, opts \\ [])

@spec compute(
  [Anvil.Label.t()],
  keyword()
) :: {:ok, float()} | {:error, term()}

Computes an agreement metric, automatically selecting the appropriate algorithm unless one is forced with the :metric option.

Options

:metric - Force a specific metric (:cohen, :fleiss, :krippendorff)
:field - Field name to compute agreement for (default: uses all fields)
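For readers unfamiliar with the metrics named above, the following is a minimal, self-contained sketch of the textbook Cohen's kappa formula for two raters. It is an illustration only, not Anvil's implementation: the module name `KappaSketch`, the plain-list input format, and the error atoms are all assumptions made for this example.

```elixir
defmodule KappaSketch do
  @moduledoc false
  # Hypothetical helper, NOT part of Anvil. Inputs are two equal-length lists
  # of category labels, one per rater, aligned by item.

  def cohen(rater_a, rater_b) when length(rater_a) != length(rater_b),
    do: {:error, :length_mismatch}

  def cohen([], []), do: {:error, :no_items}

  def cohen(rater_a, rater_b) do
    n = length(rater_a)

    # Observed agreement: fraction of items on which both raters chose the same category.
    agreements =
      rater_a
      |> Enum.zip(rater_b)
      |> Enum.count(fn {a, b} -> a == b end)

    p_o = agreements / n

    # Expected chance agreement: for each category, multiply the two raters'
    # marginal proportions, then sum over categories.
    freq_a = Enum.frequencies(rater_a)
    freq_b = Enum.frequencies(rater_b)

    p_e =
      freq_a
      |> Enum.map(fn {category, count_a} ->
        count_a / n * (Map.get(freq_b, category, 0) / n)
      end)
      |> Enum.sum()

    if p_e == 1.0 do
      # Both raters used a single identical category, so kappa is 0/0.
      # This sketch reports that as an error; a library may choose otherwise.
      {:error, :undefined}
    else
      {:ok, (p_o - p_e) / (1 - p_e)}
    end
  end
end

KappaSketch.cohen([:yes, :yes, :no, :no], [:yes, :no, :no, :no])
# => {:ok, 0.5}
```

In the call above the raters agree on 3 of 4 items (observed agreement 0.75), chance agreement is 0.5, and kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5.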

compute_all_dimensions(labels, schema, opts \\ [])

@spec compute_all_dimensions([map()], map(), keyword()) :: map()

Computes agreement for all dimensions in the schema.

Returns a map with per-dimension agreement scores.

Examples

iex> labels = [...]
iex> schema = %{fields: ["coherence", "grounded", "balance"]}
iex> Agreement.compute_all_dimensions(labels, schema)
%{
  coherence: {:ok, 0.72},
  grounded: {:ok, 0.85},
  balance: {:ok, 0.45}
}
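Because each dimension maps to its own result tuple, callers will typically pattern-match on the values. The sketch below assumes the result shape shown in the example above and picks out the dimensions whose score falls below a cutoff. The module name `AgreementReport` and the 0.8 threshold are hypothetical, chosen only for illustration.

```elixir
defmodule AgreementReport do
  @moduledoc false
  # Hypothetical helper, NOT part of Anvil.
  # Returns {dimension, score} pairs scoring below `threshold`, lowest first.
  # Dimensions whose computation returned {:error, _} are skipped.
  def low_agreement(results, threshold) do
    results
    |> Enum.filter(fn
      {_dimension, {:ok, score}} -> score < threshold
      {_dimension, {:error, _reason}} -> false
    end)
    |> Enum.map(fn {dimension, {:ok, score}} -> {dimension, score} end)
    |> Enum.sort_by(fn {_dimension, score} -> score end)
  end
end

results = %{coherence: {:ok, 0.72}, grounded: {:ok, 0.85}, balance: {:ok, 0.45}}
AgreementReport.low_agreement(results, 0.8)
# => [balance: 0.45, coherence: 0.72]
```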

compute_for_field(labels, field_name, opts \\ [])

@spec compute_for_field([map()], String.t(), keyword()) ::
  {:ok, float()} | {:error, term()}

Computes agreement for a specific dimension/field.

Examples

iex> labels = [
...>   %{labeler_id: "l1", values: %{"coherence" => 4, "grounded" => 3}},
...>   %{labeler_id: "l2", values: %{"coherence" => 4, "grounded" => 5}}
...> ]
iex> Agreement.compute_for_field(labels, "coherence")
{:ok, 1.0}  # Perfect agreement on coherence

recompute_all(queue_id, opts \\ [])

@spec recompute_all(
  binary(),
  keyword()
) :: {:ok, map()} | {:error, term()}

Batch recomputes agreement for all samples in a queue.

This is useful for full recalculation after schema migrations or data changes.

Options

:batch_size - Number of samples to process per batch (default: 100)
:metric - Force a specific metric for all computations

summary(labels, schema, opts \\ [])

@spec summary([map()], map(), keyword()) :: map()

Returns a comprehensive agreement summary with per-dimension breakdown.

Examples

iex> labels = [...]
iex> schema = %{fields: ["coherence", "grounded"]}
iex> Agreement.summary(labels, schema)
%{
  overall: {:ok, 0.78},
  by_dimension: %{
    coherence: {:ok, 0.72},
    grounded: {:ok, 0.85}
  },
  sample_count: 50,
  labeler_count: 3
}
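As a sketch of how such a summary might be consumed, the snippet below turns a map of the shape shown above into printable report lines. The module name `SummaryFormatter` is hypothetical, and the code assumes only the `:overall` and `:by_dimension` keys and the `{:ok, float}` / `{:error, term}` value shapes that appear in this documentation.

```elixir
defmodule SummaryFormatter do
  @moduledoc false
  # Hypothetical helper, NOT part of Anvil.
  # Builds one line for the overall score, then one per dimension (sorted by name).
  def lines(%{overall: overall, by_dimension: by_dimension}) do
    dimension_lines =
      by_dimension
      |> Enum.sort_by(fn {name, _result} -> to_string(name) end)
      |> Enum.map(fn {name, result} -> "#{name}: #{format(result)}" end)

    ["overall: #{format(overall)}" | dimension_lines]
  end

  # Scores are rendered with two decimals; errors are shown via inspect/1.
  defp format({:ok, score}) when is_float(score),
    do: :erlang.float_to_binary(score, decimals: 2)

  defp format({:error, reason}), do: "error (#{inspect(reason)})"
end

summary = %{
  overall: {:ok, 0.78},
  by_dimension: %{coherence: {:ok, 0.72}, grounded: {:ok, 0.85}},
  sample_count: 50,
  labeler_count: 3
}

SummaryFormatter.lines(summary)
# => ["overall: 0.78", "coherence: 0.72", "grounded: 0.85"]
```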