ExFairness.Utils.Bootstrap (ExFairness v0.5.1)

Bootstrap confidence interval computation for fairness metrics.

Implements stratified bootstrap resampling, which preserves group proportions in each resample, and runs the resamples in parallel for performance.

Algorithm

Bootstrap resampling provides non-parametric confidence intervals without distributional assumptions:

1. Compute the observed metric: M_obs = M(data)
2. For i = 1 to B (bootstrap samples):
   a. Sample n datapoints with replacement: data*_i
   b. Compute M_i = M(data*_i)
3. Sort {M_1, ..., M_B}
4. Take the percentiles of the sorted values, with α = 1 - confidence level:
   CI_lower = percentile(α/2)
   CI_upper = percentile(1 - α/2)
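
The percentile procedure above can be sketched in plain Elixir. This is a minimal illustration, not ExFairness's implementation: it works on an ordinary list rather than Nx tensors, and the module name BootstrapSketch, the function percentile_ci/3, the default seed, and the index-rounding quantile rule are all assumptions made for the example.

```elixir
defmodule BootstrapSketch do
  @moduledoc false

  # Percentile bootstrap CI for `metric_fn` over a plain list of observations.
  # Returns {observed_metric, {lower, upper}}.
  def percentile_ci(data, metric_fn, opts \\ []) do
    b = Keyword.get(opts, :n_samples, 1000)
    level = Keyword.get(opts, :confidence_level, 0.95)
    seed = Keyword.get(opts, :seed, 42)

    # Seed the process-local generator so repeated calls give the same interval.
    :rand.seed(:exsss, {seed, seed + 1, seed + 2})

    n = length(data)
    pool = List.to_tuple(data)

    # Observed metric on the original data.
    observed = metric_fn.(data)

    # B resamples of size n, drawn with replacement; metric on each, then sort.
    stats =
      1..b
      |> Enum.map(fn _ ->
        resample = for _ <- 1..n, do: elem(pool, :rand.uniform(n) - 1)
        metric_fn.(resample)
      end)
      |> Enum.sort()

    # Read off the alpha/2 and 1 - alpha/2 percentiles.
    alpha = 1.0 - level
    {observed, {quantile(stats, alpha / 2), quantile(stats, 1.0 - alpha / 2)}}
  end

  # Quantile of an already sorted list, rounding the fractional position to
  # the nearest index (one of several possible conventions).
  defp quantile(sorted, q) do
    Enum.at(sorted, round(q * (length(sorted) - 1)))
  end
end

# Usage: interval for the positive-prediction rate of a small sample.
data = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
mean = fn xs -> Enum.sum(xs) / length(xs) end
{observed, {lower, upper}} =
  BootstrapSketch.percentile_ci(data, mean, n_samples: 200, seed: 42)
```

Because the generator is reseeded on every call, two calls with the same seed return the same interval, which mirrors the purpose of the :seed option documented for confidence_interval/3.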

Stratified Bootstrap

To preserve group proportions, sample separately from each group:

Sample n_A from group A with replacement
Sample n_B from group B with replacement
Combine samples and compute metric
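
The per-group sampling above can be sketched as a function that returns resampled row indices. This is an illustration only: the module StratifiedSketch and the function resample_indices/1 are hypothetical names, and the sketch uses plain lists instead of the Nx tensors the library operates on.

```elixir
defmodule StratifiedSketch do
  @moduledoc false

  # Given the group label of every row, returns a list of row indices in which
  # each group has been resampled with replacement to its original size, so
  # group proportions in the resample match the original data exactly.
  def resample_indices(groups) do
    groups
    |> Enum.with_index()
    # Map each group label to the list of row indices that belong to it.
    |> Enum.group_by(fn {g, _i} -> g end, fn {_g, i} -> i end)
    |> Enum.flat_map(fn {_g, idxs} ->
      pool = List.to_tuple(idxs)
      size = tuple_size(pool)
      # Draw `size` indices with replacement from this group's own rows.
      for _ <- 1..size, do: elem(pool, :rand.uniform(size) - 1)
    end)
  end
end

# Usage: three rows in group 0 and two rows in group 1.
sensitive = [0, 0, 0, 1, 1]
indices = StratifiedSketch.resample_indices(sensitive)

# Group counts in the resample match the original: %{0 => 3, 1 => 2}
counts = indices |> Enum.map(&Enum.at(sensitive, &1)) |> Enum.frequencies()
```

The same indices would then be used to select rows from every tensor in data, so that predictions, labels, and sensitive attributes stay aligned before the metric is computed.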

References

Efron, B., & Tibshirani, R. J. (1994). "An Introduction to the Bootstrap." CRC Press.
Davison, A. C., & Hinkley, D. V. (1997). "Bootstrap Methods and Their Application." Cambridge University Press.

Examples

iex> predictions = Nx.tensor([1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0])
iex> sensitive = Nx.tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
iex> metric_fn = fn [preds, sens] ->
...>   result = ExFairness.demographic_parity(preds, sens)
...>   result.disparity
...> end
iex> result = ExFairness.Utils.Bootstrap.confidence_interval(
...>   [predictions, sensitive],
...>   metric_fn,
...>   n_samples: 100
...> )
iex> {lower, upper} = result.confidence_interval
iex> is_float(lower) and is_float(upper) and lower <= upper
true

Summary

Types

bootstrap_result()

Functions

confidence_interval(data, metric_fn, opts \\ [])

Computes a bootstrap confidence interval for a fairness metric.

Types

bootstrap_result()

@type bootstrap_result() :: %{
  point_estimate: float(),
  confidence_interval: {float(), float()},
  confidence_level: float(),
  n_samples: integer(),
  method: :percentile | :basic
}

Functions

confidence_interval(data, metric_fn, opts \\ [])

@spec confidence_interval([Nx.Tensor.t()], function(), keyword()) ::
  bootstrap_result()

Computes a bootstrap confidence interval for a fairness metric.

Parameters

data - List of tensors [predictions, labels?, sensitive_attr] (labels is optional)
metric_fn - Function that takes the list of tensors and returns the metric value
opts - Options:
- :n_samples - Number of bootstrap samples (default: 1000)
- :confidence_level - Confidence level (default: 0.95)
- :method - Bootstrap method (:percentile or :basic, default: :percentile)
- :stratified - Preserve group proportions (default: true)
- :parallel - Use parallel computation (default: true)
- :seed - Random seed for reproducibility (default: system time)
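
The two values accepted by :method correspond to two standard bootstrap interval constructions. The formulas below are the textbook definitions (as in Davison & Hinkley); that the library follows them exactly, including its quantile convention, is an assumption, since this page does not spell them out. Here q_p denotes the empirical p-quantile of the sorted bootstrap values.

```latex
\begin{align*}
\alpha &= 1 - \text{confidence level}, \qquad
q_p = \text{empirical } p\text{-quantile of } \{M_1, \dots, M_B\} \\[4pt]
\text{percentile:} \quad & \bigl[\, q_{\alpha/2},\; q_{1-\alpha/2} \,\bigr] \\[4pt]
\text{basic:} \quad & \bigl[\, 2M_{\mathrm{obs}} - q_{1-\alpha/2},\; 2M_{\mathrm{obs}} - q_{\alpha/2} \,\bigr]
\end{align*}
```

The basic interval reflects the percentile interval around the observed metric M_obs, so the two coincide when the bootstrap distribution is symmetric about M_obs and differ when it is skewed.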

Returns

Map containing the point estimate and the confidence interval:

:point_estimate - Observed metric value
:confidence_interval - Tuple {lower, upper}
:confidence_level - Confidence level used
:n_samples - Number of bootstrap samples
:method - Bootstrap method used

Examples

iex> predictions = Nx.tensor([1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0])
iex> sensitive = Nx.tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
iex> metric_fn = fn [preds, sens] ->
...>   result = ExFairness.demographic_parity(preds, sens)
...>   result.disparity
...> end
iex> result = ExFairness.Utils.Bootstrap.confidence_interval(
...>   [predictions, sensitive],
...>   metric_fn,
...>   n_samples: 100, seed: 42
...> )
iex> result.method
:percentile