ExFairness.Utils.Bootstrap (ExFairness v0.5.1)


Bootstrap confidence interval computation for fairness metrics.

Implements a stratified bootstrap, which preserves group proportions in every resample, and can run the resamples in parallel for performance.

Algorithm

Bootstrap resampling provides non-parametric confidence intervals without distributional assumptions:

  1. Compute observed metric: M_obs = M(data)
  2. For i = 1 to B (bootstrap samples):
     a. Sample n datapoints with replacement: data*_i
     b. Compute M_i = M(data*_i)
  3. Sort {M_1, ..., M_B}
  4. With α = 1 - confidence level, take percentiles of the sorted values:
     CI_lower = percentile(α/2)
     CI_upper = percentile(1 - α/2)
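
The steps above can be sketched in plain Elixir. This is a minimal illustration, not the ExFairness implementation: the module name PercentileBootstrapSketch, its interval/3 function, and the nearest-rank quantile rule are assumptions made for the example, and it works on an ordinary list instead of Nx tensors.

```elixir
defmodule PercentileBootstrapSketch do
  # Hypothetical helper module, not part of ExFairness.

  # Returns {lower, upper} for the percentile bootstrap of `metric_fn` over a list.
  def interval(data, metric_fn, opts \\ []) do
    b = Keyword.get(opts, :n_samples, 1000)
    level = Keyword.get(opts, :confidence_level, 0.95)
    seed = Keyword.get(opts, :seed, 42)

    # Seed the process-local generator so the run is reproducible.
    :rand.seed(:exsss, {seed, seed + 1, seed + 2})

    n = length(data)
    # A tuple gives constant-time indexed access while resampling.
    items = List.to_tuple(data)

    # Step 2: draw B resamples of size n with replacement and evaluate the metric.
    stats =
      for _ <- 1..b do
        resample = for _ <- 1..n, do: elem(items, :rand.uniform(n) - 1)
        metric_fn.(resample)
      end

    # Step 3: sort the bootstrap statistics.
    sorted = stats |> Enum.sort() |> List.to_tuple()

    # Step 4: read off the alpha/2 and 1 - alpha/2 empirical quantiles.
    alpha = 1.0 - level
    {quantile(sorted, alpha / 2), quantile(sorted, 1.0 - alpha / 2)}
  end

  # Nearest-rank style quantile on a sorted tuple (one of several conventions).
  defp quantile(sorted, q) do
    elem(sorted, round(q * (tuple_size(sorted) - 1)))
  end
end

# Usage: interval for the mean of a small 0/1 sample.
data = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1]
mean = fn xs -> Enum.sum(xs) / length(xs) end
{lower, upper} = PercentileBootstrapSketch.interval(data, mean, n_samples: 200, seed: 7)
```

The sketch assumes a non-empty list and at least one bootstrap sample. It draws from the whole dataset at once, so it does not preserve group proportions.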

Stratified Bootstrap

To preserve group proportions, sample separately from each group:

  • Sample n_A from group A with replacement
  • Sample n_B from group B with replacement
  • Combine samples and compute metric
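
One way to picture the per-group sampling is the sketch below. It is an assumption-laden illustration, not the library's code: StratifiedResampleSketch and resample/1 are hypothetical names, and each row is modelled as a {prediction, group} tuple instead of a pair of Nx tensors.

```elixir
defmodule StratifiedResampleSketch do
  # Hypothetical helper module, not part of ExFairness.

  # `rows` is a list of {prediction, group} tuples.
  # Returns a resample in which every group keeps its original size.
  def resample(rows) do
    rows
    |> Enum.group_by(fn {_pred, group} -> group end)
    |> Enum.flat_map(fn {_group, members} ->
      # Draw exactly as many rows as the group has, with replacement.
      size = length(members)
      pool = List.to_tuple(members)
      for _ <- 1..size, do: elem(pool, :rand.uniform(size) - 1)
    end)
  end
end

# Usage: 6 rows in group 0 and 4 rows in group 1.
rows = [{1, 0}, {0, 0}, {1, 0}, {0, 0}, {1, 0}, {0, 0}, {1, 1}, {1, 1}, {0, 1}, {1, 1}]
resampled = StratifiedResampleSketch.resample(rows)
counts = Enum.frequencies_by(resampled, fn {_pred, group} -> group end)
```

Because each group is resampled on its own, counts always equals the original group sizes (here 6 and 4), whereas a plain bootstrap lets the group sizes vary from one resample to the next.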

References

  • Efron, B., & Tibshirani, R. J. (1994). "An Introduction to the Bootstrap." CRC Press.
  • Davison, A. C., & Hinkley, D. V. (1997). "Bootstrap Methods and Their Application." Cambridge University Press.

Examples

iex> predictions = Nx.tensor([1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0])
iex> sensitive = Nx.tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
iex> metric_fn = fn [preds, sens] ->
...>   result = ExFairness.demographic_parity(preds, sens)
...>   result.disparity
...> end
iex> result = ExFairness.Utils.Bootstrap.confidence_interval(
...>   [predictions, sensitive],
...>   metric_fn,
...>   n_samples: 100
...> )
iex> {lower, upper} = result.confidence_interval
iex> is_float(lower) and is_float(upper) and lower <= upper
true

Summary

Functions

confidence_interval(data, metric_fn, opts \\ [])

Computes bootstrap confidence interval for a fairness metric.

Types

bootstrap_result()

@type bootstrap_result() :: %{
  point_estimate: float(),
  confidence_interval: {float(), float()},
  confidence_level: float(),
  n_samples: integer(),
  method: :percentile | :basic
}

Functions

confidence_interval(data, metric_fn, opts \\ [])

@spec confidence_interval([Nx.Tensor.t()], function(), keyword()) ::
  bootstrap_result()

Computes bootstrap confidence interval for a fairness metric.

Parameters

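  • data - List of tensors: [predictions, sensitive_attr], or [predictions, labels, sensitive_attr] when the metric needs labels
  • metric_fn - Function that takes a list of tensors in the same order as data and returns the metric value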
  • opts - Options:
    • :n_samples - Number of bootstrap samples (default: 1000)
    • :confidence_level - Confidence level (default: 0.95)
    • :method - Bootstrap method (:percentile or :basic, default: :percentile)
    • :stratified - Preserve group proportions (default: true)
    • :parallel - Use parallel computation (default: true)
    • :seed - Random seed for reproducibility (default: system time)
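
The two :method values correspond to two textbook interval constructions. The sketch below shows the usual definitions (see Davison & Hinkley); whether ExFairness computes them in exactly this way is an assumption, and IntervalMethodsSketch, interval/4, and the nearest-rank quantile rule are hypothetical names and choices made for the example.

```elixir
defmodule IntervalMethodsSketch do
  # Hypothetical helper module, not part of ExFairness.

  # `stats` are bootstrap replicates, `point` is the observed metric value.
  def interval(stats, point, method, level \\ 0.95) do
    sorted = stats |> Enum.sort() |> List.to_tuple()
    alpha = 1.0 - level
    q_lo = quantile(sorted, alpha / 2)
    q_hi = quantile(sorted, 1.0 - alpha / 2)

    case method do
      # Percentile: use the empirical quantiles directly.
      :percentile -> {q_lo, q_hi}
      # Basic: reflect the quantiles around the point estimate.
      :basic -> {2 * point - q_hi, 2 * point - q_lo}
    end
  end

  # Nearest-rank style quantile on a sorted tuple (one of several conventions).
  defp quantile(sorted, q) do
    elem(sorted, round(q * (tuple_size(sorted) - 1)))
  end
end

# Usage with integer replicates 0..100 and an observed value of 40.
stats = Enum.to_list(0..100)
percentile_ci = IntervalMethodsSketch.interval(stats, 40, :percentile, 0.90)
# percentile_ci is {5, 95}
basic_ci = IntervalMethodsSketch.interval(stats, 40, :basic, 0.90)
# basic_ci is {-15, 75}
```

The two intervals coincide when the bootstrap distribution is symmetric around the point estimate and differ when it is skewed or shifted, as in this example.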

Returns

Map containing the point estimate, the confidence interval, and the settings used to compute it:

  • :point_estimate - Observed metric value
  • :confidence_interval - Tuple {lower, upper}
  • :confidence_level - Confidence level used
  • :n_samples - Number of bootstrap samples
  • :method - Bootstrap method used

Examples

iex> predictions = Nx.tensor([1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0])
iex> sensitive = Nx.tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
iex> metric_fn = fn [preds, sens] ->
...>   result = ExFairness.demographic_parity(preds, sens)
...>   result.disparity
...> end
iex> result = ExFairness.Utils.Bootstrap.confidence_interval(
...>   [predictions, sensitive],
...>   metric_fn,
...>   n_samples: 100, seed: 42
...> )
iex> result.method
:percentile