Anvil.Agreement (Anvil v0.1.1)
Inter-rater agreement metrics for measuring labeler consistency.
Automatically selects the appropriate metric based on the data.
Functions
@spec compute([Anvil.Label.t()], keyword()) :: {:ok, float()} | {:error, term()}
Computes agreement metric, automatically selecting the appropriate algorithm.
Options
:metric - Force a specific metric (:cohen, :fleiss, :krippendorff)
:field - Field name to compute agreement for (default: uses all fields)
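The :cohen option refers to Cohen's kappa, the usual chance-corrected agreement measure for exactly two raters. The following is a minimal sketch of that statistic, not Anvil's implementation: the module name KappaSketch and its input shape (two equal-length lists of categorical ratings, aligned by sample) are assumptions made for illustration.

```elixir
defmodule KappaSketch do
  @moduledoc "Hypothetical helper, not part of Anvil: Cohen's kappa for two raters."

  # `a` and `b` are equal-length lists of categorical ratings, aligned by sample.
  def cohen(a, b) when length(a) == length(b) and a != [] do
    n = length(a)

    # Observed agreement: share of samples where both raters chose the same category.
    p_o = Enum.count(Enum.zip(a, b), fn {x, y} -> x == y end) / n

    # Expected chance agreement from each rater's marginal category frequencies.
    freq_a = Enum.frequencies(a)
    freq_b = Enum.frequencies(b)

    p_e =
      freq_a
      |> Enum.map(fn {cat, count} -> count / n * (Map.get(freq_b, cat, 0) / n) end)
      |> Enum.sum()

    # Kappa is undefined when chance agreement is already 1
    # (both raters used a single, identical category throughout).
    if p_e == 1.0 do
      {:error, :undefined}
    else
      {:ok, (p_o - p_e) / (1 - p_e)}
    end
  end
end
```

For example, KappaSketch.cohen([1, 1, 0, 0], [1, 0, 0, 0]) returns {:ok, 0.5}: observed agreement is 0.75 and chance agreement is 0.5. How Anvil itself treats the undefined case is not stated in this documentation.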
compute_all_dimensions(labels, schema)
Computes agreement for all dimensions in the schema.
Returns a map with per-dimension agreement scores.
Examples
iex> labels = [...]
iex> schema = %{fields: ["coherence", "grounded", "balance"]}
iex> Agreement.compute_all_dimensions(labels, schema)
%{
  coherence: {:ok, 0.72},
  grounded: {:ok, 0.85},
  balance: {:ok, 0.45}
}
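One way to picture the per-dimension result is as a map built by applying a scoring function to each schema field. The sketch below assumes exactly that; PerDimensionSketch and the score_fun argument are hypothetical and say nothing about how Anvil is implemented internally. It keeps the string field names as keys, whereas the documented output shows atom keys.

```elixir
defmodule PerDimensionSketch do
  @moduledoc "Hypothetical helper, not part of Anvil."

  # Applies `score_fun` (any 2-arity function taking labels and a field name)
  # to each field listed in the schema and collects the results in a map.
  def by_dimension(labels, %{fields: fields}, score_fun) do
    Map.new(fields, fn field -> {field, score_fun.(labels, field)} end)
  end
end
```

Any function that takes the labels and a field name and returns an {:ok, score} or {:error, reason} tuple can be passed as score_fun.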
compute_for_field(labels, field)
Computes agreement for a specific dimension/field.
Examples
iex> labels = [
...> %{labeler_id: "l1", values: %{"coherence" => 4, "grounded" => 3}},
...> %{labeler_id: "l2", values: %{"coherence" => 4, "grounded" => 5}}
...> ]
iex> Agreement.compute_for_field(labels, "coherence")
{:ok, 1.0} # Perfect agreement on coherence
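To make the field-level idea concrete, here is a rough sketch that pulls one field out of each label and reports raw pairwise agreement, meaning the share of labeler pairs that gave identical values. This is a simpler statistic than the chance-corrected metrics listed for compute, and FieldAgreementSketch is a hypothetical module, not Anvil code. It assumes all labels refer to the same sample and use the map shape shown in the example.

```elixir
defmodule FieldAgreementSketch do
  @moduledoc "Hypothetical helper, not part of Anvil: raw pairwise agreement on one field."

  # `labels` are maps with :labeler_id and :values keys, as in the documented example.
  def pairwise(labels, field) do
    # Collect the value each labeler gave for `field`; labels lacking it are skipped.
    values = for %{values: %{^field => v}} <- labels, do: v

    # Every unordered pair of ratings for this field.
    pairs =
      for {x, i} <- Enum.with_index(values),
          {y, j} <- Enum.with_index(values),
          i < j,
          do: {x, y}

    case pairs do
      [] -> {:error, :insufficient_labels}
      _ -> {:ok, Enum.count(pairs, fn {x, y} -> x == y end) / length(pairs)}
    end
  end
end
```

With the two labels from the example, FieldAgreementSketch.pairwise(labels, "coherence") returns {:ok, 1.0} and FieldAgreementSketch.pairwise(labels, "grounded") returns {:ok, 0.0}.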
Batch recomputes agreement for all samples in a queue.
This is useful for full recalculation after schema migrations or data changes.
Options
:batch_size - Number of samples to process per batch (default: 100)
:metric - Force a specific metric for all computations
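The effect of :batch_size can be illustrated with a small chunking loop. The sketch below is an assumption about the general pattern, not the library's code: BatchSketch, the sample_ids input, and the recompute_fun callback are all hypothetical names. It processes IDs in chunks of :batch_size (defaulting to the documented 100) and tallies successes and failures.

```elixir
defmodule BatchSketch do
  @moduledoc "Hypothetical helper, not part of Anvil."

  # Processes `sample_ids` in chunks of `:batch_size`, calling `recompute_fun`
  # once per sample and counting {:ok, _} and {:error, _} results.
  def run(sample_ids, recompute_fun, opts \\ []) do
    batch_size = Keyword.get(opts, :batch_size, 100)

    sample_ids
    |> Stream.chunk_every(batch_size)
    |> Enum.reduce(%{ok: 0, error: 0}, fn batch, acc ->
      Enum.reduce(batch, acc, fn id, inner ->
        case recompute_fun.(id) do
          {:ok, _} -> Map.update!(inner, :ok, &(&1 + 1))
          {:error, _} -> Map.update!(inner, :error, &(&1 + 1))
        end
      end)
    end)
  end
end
```

Chunking lazily with Stream.chunk_every/2 keeps only one batch in flight at a time, which is the usual reason to expose a batch size for recalculation jobs over large queues.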
summary(labels, schema)
Returns a comprehensive agreement summary with per-dimension breakdown.
Examples
iex> labels = [...]
iex> schema = %{fields: ["coherence", "grounded"]}
iex> Agreement.summary(labels, schema)
%{
  overall: {:ok, 0.78},
  by_dimension: %{
    coherence: {:ok, 0.72},
    grounded: {:ok, 0.85}
  },
  sample_count: 50,
  labeler_count: 3
}
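The sample_count and labeler_count entries are presumably distinct counts over the input labels. A minimal sketch of that bookkeeping follows; SummaryCountsSketch is hypothetical, and the :sample_id key is an assumption, since only :labeler_id and :values appear in the documented label examples.

```elixir
defmodule SummaryCountsSketch do
  @moduledoc "Hypothetical helper, not part of Anvil."

  # Counts distinct samples and labelers. The :sample_id key is an assumption;
  # only :labeler_id and :values appear in the documented label examples.
  def counts(labels) do
    %{
      sample_count: labels |> Enum.map(& &1.sample_id) |> Enum.uniq() |> length(),
      labeler_count: labels |> Enum.map(& &1.labeler_id) |> Enum.uniq() |> length()
    }
  end
end
```

Reporting these counts next to the scores helps a reader judge how much data an agreement figure rests on.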