ExFairness.Stage (ExFairness v0.5.1)
Pipeline stage for fairness evaluation in CrucibleIR-based experiments.
This stage integrates ExFairness metrics into the Crucible framework pipeline, so that fairness evaluation can run as one step of LLM reliability experiments and model evaluations.
Configuration
The stage uses CrucibleIR.Reliability.Fairness configuration from the experiment context:
%CrucibleIR.Reliability.Fairness{
  enabled: true,             # Enable fairness evaluation
  metrics: [:demographic_parity, :equalized_odds, :equal_opportunity, :predictive_parity],
  group_by: :gender,         # Sensitive attribute field name
  threshold: 0.1,            # Maximum acceptable disparity
  fail_on_violation: false,  # Whether to fail experiment on fairness violation
  options: %{}               # Additional metric-specific options
}
Context Requirements
The stage expects the context to contain:
- experiment.reliability.fairness - Fairness configuration (CrucibleIR.Reliability.Fairness struct)
- outputs - List of model outputs, where each output is a map containing:
  - :prediction - Binary prediction (0 or 1)
  - :label - Ground truth label (0 or 1)
  - :probabilities - (Optional) Prediction probabilities for calibration
  - Sensitive attribute field (e.g., :gender, :race) matching group_by
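The requirements above can be checked before the stage runs. The following is a minimal sketch, assuming a hypothetical helper module named OutputCheck (it is not part of ExFairness) and assuming that every output must carry the group_by key:

```elixir
defmodule OutputCheck do
  # Hypothetical helper for illustration; not part of ExFairness.

  # Returns :ok, or {:error, {index, reason}} for the first output that
  # does not match the documented shape.
  def validate(outputs, group_by) when is_list(outputs) and is_atom(group_by) do
    outputs
    |> Enum.with_index()
    |> Enum.find_value(:ok, fn {output, index} ->
      case check(output, group_by) do
        :ok -> nil
        {:error, reason} -> {:error, {index, reason}}
      end
    end)
  end

  # :prediction and :label must both be 0 or 1.
  defp check(%{prediction: p, label: l} = output, group_by)
       when p in [0, 1] and l in [0, 1] do
    if Map.has_key?(output, group_by), do: :ok, else: {:error, :missing_group}
  end

  defp check(_output, _group_by), do: {:error, :bad_prediction_or_label}
end
```

Calling OutputCheck.validate(context.outputs, :gender) before the stage gives an early, index-specific error instead of a failure inside metric computation.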
Returns
The stage returns {:ok, updated_context} with fairness results added to the context:
context.fairness = %{
  metrics: %{
    demographic_parity: %{disparity: 0.05, passes: true, ...},
    equalized_odds: %{tpr_disparity: 0.03, fpr_disparity: 0.04, passes: true, ...},
    ...
  },
  overall_passes: true,
  violations: []
}
If fail_on_violation is true and fairness violations are detected, the stage returns {:error, reason} instead.
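To make the disparity and threshold fields concrete, here is a minimal sketch of the standard demographic parity difference: the gap between the highest and lowest positive-prediction rate across groups. The ParitySketch module is hypothetical and is not ExFairness's implementation. Whether the library compares with <= or < is also an assumption.

```elixir
defmodule ParitySketch do
  # Illustrative only; not ExFairness's implementation.

  # Fraction of positive predictions (prediction == 1) per group.
  def positive_rates(outputs, group_by) do
    outputs
    |> Enum.group_by(&Map.fetch!(&1, group_by))
    |> Map.new(fn {group, rows} ->
      positives = Enum.count(rows, &(&1.prediction == 1))
      {group, positives / length(rows)}
    end)
  end

  # Largest gap in positive-prediction rate between any two groups.
  # Raises on an empty outputs list, since there are no rates to compare.
  def disparity(outputs, group_by) do
    rates = outputs |> positive_rates(group_by) |> Map.values()
    Enum.max(rates) - Enum.min(rates)
  end

  # Assumption: a disparity equal to the threshold still passes.
  def passes?(outputs, group_by, threshold) do
    disparity(outputs, group_by) <= threshold
  end
end

outputs = [
  %{prediction: 1, label: 1, gender: 0},
  %{prediction: 0, label: 0, gender: 0},
  %{prediction: 1, label: 1, gender: 1},
  %{prediction: 1, label: 0, gender: 1}
]

ParitySketch.disparity(outputs, :gender)
# => 0.5 (rates are 0.5 for group 0 and 1.0 for group 1)
```

With threshold: 0.1, this data would count as a violation under the sketch, because 0.5 exceeds 0.1.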
Example Usage
# In a Crucible experiment configuration
config = %CrucibleIR.Reliability.Fairness{
  enabled: true,
  metrics: [:demographic_parity, :equalized_odds],
  group_by: :gender,
  threshold: 0.1,
  fail_on_violation: false
}
# In pipeline
context = %{
  experiment: %{reliability: %{fairness: config}},
  outputs: [
    %{prediction: 1, label: 1, gender: 0},
    %{prediction: 0, label: 0, gender: 0},
    %{prediction: 1, label: 1, gender: 1},
    %{prediction: 0, label: 0, gender: 1}
  ]
}
{:ok, result_context} = ExFairness.Stage.run(context)
# result_context.fairness contains fairness evaluation results
Integration with Crucible Framework
This stage is designed to work with the Crucible framework's experiment orchestration. It can be added to any pipeline that processes model outputs and requires fairness evaluation.
See the Crucible documentation for more details on pipeline stages and experiment configuration.
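As a rough illustration of how a stage with this contract composes with others, the sketch below threads a context through a list of stage functions and stops at the first {:error, reason}. PipelineSketch and both stand-in stages are hypothetical. Crucible's actual orchestration API is not shown in this page and may differ.

```elixir
defmodule PipelineSketch do
  # Hypothetical runner for illustration; not Crucible's orchestration API.

  # Runs each stage in order, passing the updated context along.
  # Halts and returns the error tuple as soon as a stage fails.
  def run(context, stages) do
    Enum.reduce_while(stages, {:ok, context}, fn stage, {:ok, ctx} ->
      case stage.(ctx) do
        {:ok, next_ctx} -> {:cont, {:ok, next_ctx}}
        {:error, _reason} = error -> {:halt, error}
      end
    end)
  end
end

# Stand-in stages so the sketch runs on its own. In a real pipeline the
# second one would be &ExFairness.Stage.run/1 (arity taken from the
# documented call ExFairness.Stage.run(context)).
generate = fn ctx ->
  {:ok, Map.put(ctx, :outputs, [%{prediction: 1, label: 1, gender: 0}])}
end

fairness_stub = fn ctx ->
  {:ok, Map.put(ctx, :fairness, %{overall_passes: true, violations: []})}
end

{:ok, final} = PipelineSketch.run(%{}, [generate, fairness_stub])
final.fairness.overall_passes
# => true
```

Because the stage returns {:error, reason} when fail_on_violation is true, a runner of this shape would stop the experiment at the fairness step without any extra wiring.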
Functions
describe
Returns a description of the stage for pipeline documentation.
Parameters
- opts - Options (currently unused)
Returns
A string describing the stage's purpose and behavior.
Examples
iex> ExFairness.Stage.describe()
"Fairness evaluation stage: Computes fairness metrics (demographic parity, equalized odds, etc.) on model outputs"
run
Runs fairness evaluation on model outputs in the context.
Parameters
- context - Experiment context containing fairness config and model outputs
- opts - Additional options (currently unused, reserved for future extensions)
Returns
- {:ok, updated_context} - Context with fairness results added
- {:error, reason} - If the configuration is invalid, or if fairness violations are detected when fail_on_violation is true
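A caller usually needs to handle both return shapes. The sketch below is one way to do it, assuming a hypothetical ResultHandling module and the result fields documented above (overall_passes, violations). The shape of each violation entry and of reason is not specified on this page, so both are passed through untouched. The catch-all clause for a context without fairness results is an assumption.

```elixir
defmodule ResultHandling do
  # Hypothetical helper; `result` is whatever the stage's run function returned.

  # All metrics passed.
  def summarize({:ok, %{fairness: %{overall_passes: true}}}), do: :fair

  # The stage succeeded but recorded violations (fail_on_violation: false).
  def summarize({:ok, %{fairness: %{overall_passes: false, violations: violations}}}),
    do: {:unfair, violations}

  # Assumption: a context without fairness results means nothing was evaluated.
  def summarize({:ok, _context}), do: :not_evaluated

  # Invalid configuration, or violations with fail_on_violation: true.
  def summarize({:error, reason}), do: {:failed, reason}
end
```

For example, ResultHandling.summarize(ExFairness.Stage.run(context)) would collapse the stage result into one of four tagged outcomes that the rest of an experiment can branch on.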
Examples
iex> config = %CrucibleIR.Reliability.Fairness{
...>   enabled: true,
...>   metrics: [:demographic_parity],
...>   group_by: :gender,
...>   threshold: 0.1
...> }
iex> context = %{
...>   experiment: %{reliability: %{fairness: config}},
...>   outputs: [
...>     %{prediction: 1, label: 1, gender: 0},
...>     %{prediction: 1, label: 1, gender: 0},
...>     %{prediction: 0, label: 0, gender: 0},
...>     %{prediction: 0, label: 0, gender: 0},
...>     %{prediction: 1, label: 1, gender: 1},
...>     %{prediction: 1, label: 1, gender: 1},
...>     %{prediction: 0, label: 0, gender: 1},
...>     %{prediction: 0, label: 0, gender: 1}
...>   ]
...> }
iex> {:ok, result} = ExFairness.Stage.run(context)
iex> is_map(result.fairness)
true