CrucibleXAI.Validation.Infidelity (CrucibleXAI v0.4.0)

Infidelity metric for explanation quality assessment.

Measures squared error between actual model changes and explanation-predicted changes under perturbations. Lower scores indicate more faithful explanations (0 = perfect fidelity).

Mathematical Definition

Infidelity = E[(f(x) - f(x̃) - φᵀ(x - x̃))²]

Where:

  • x = original instance
  • x̃ = perturbed instance
  • f = model prediction function
  • φ = attribution vector (feature importances)

Interpretation

  • 0.00 - 0.02: Excellent fidelity
  • 0.02 - 0.05: Good fidelity
  • 0.05 - 0.10: Acceptable fidelity
  • 0.10 - 0.20: Poor fidelity
  • > 0.20: Very poor fidelity
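The bands above can be mirrored with a small helper. This is only an illustrative sketch; the library's own interpretation strings may differ.

```elixir
# Maps a score onto the fidelity bands listed above (illustrative).
interpret = fn score ->
  cond do
    score <= 0.02 -> "Excellent fidelity"
    score <= 0.05 -> "Good fidelity"
    score <= 0.10 -> "Acceptable fidelity"
    score <= 0.20 -> "Poor fidelity"
    true -> "Very poor fidelity"
  end
end

interpret.(0.03)
# => "Good fidelity"
```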

Usage

attributions = explanation.feature_weights

result = Infidelity.compute(
  instance,
  attributions,
  predict_fn,
  num_perturbations: 100
)

IO.puts("Infidelity: #{result.infidelity_score}")
# => 0.03 (Good)

References

Based on:

  • Yeh et al. (2019) "On the (In)fidelity and Sensitivity of Explanations", NeurIPS

Summary

Functions

Compare infidelity across multiple explanation methods.

Compute infidelity score.

Sensitivity analysis across perturbation magnitudes.

Types

result()

@type result() :: %{
  infidelity_score: float(),
  std_dev: float(),
  individual_errors: [float()],
  normalized_score: float(),
  interpretation: String.t()
}

Functions

compare_methods(instance, explanations, predict_fn, opts \\ [])

@spec compare_methods(list(), list(), function(), keyword()) :: map()

Compare infidelity across multiple explanation methods.

Useful for selecting the most faithful explanation method for a given model and instance.

Parameters

  • instance - Instance to test
  • explanations - List of explanation structs or attribution maps
  • predict_fn - Model prediction function
  • opts - Options passed to compute/4

Returns

Map with:

  • :by_method - Map of method_name => infidelity_result
  • :best_method - Method with lowest infidelity
  • :worst_method - Method with highest infidelity
  • :ranking - List of {method, score} sorted by quality

Examples

lime_attrs = %{0 => 2.1, 1 => 2.9}
shap_attrs = %{0 => 2.0, 1 => 3.0}

result = Infidelity.compare_methods(
  instance,
  [
    {:lime, lime_attrs},
    {:shap, shap_attrs}
  ],
  predict_fn
)
# => %{best_method: :shap, ...}
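How :ranking, :best_method, and :worst_method relate to :by_method can be sketched as follows. The scores here are made up for illustration; real values come from compute/4.

```elixir
# Illustrative per-method scores (lower is better).
by_method = %{lime: 0.041, shap: 0.018}

# Sort by score ascending: most faithful method first.
ranking = Enum.sort_by(Map.to_list(by_method), fn {_method, score} -> score end)

{best_method, _} = hd(ranking)
{worst_method, _} = List.last(ranking)
```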

compute(instance, attributions, predict_fn, opts \\ [])

@spec compute(list(), map(), (any() -> any()), keyword()) :: result()

Compute infidelity score.

Algorithm

  1. Generate N perturbations of the instance
  2. For each perturbation x̃:
     a. Compute actual model change: Δf = f(x) - f(x̃)
     b. Compute predicted change via attributions: Δφ = φᵀ(x - x̃)
     c. Compute squared error: (Δf - Δφ)²
  3. Return mean squared error across all perturbations
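The steps above can be sketched in Elixir, assuming Gaussian perturbations drawn with Erlang's :rand.normal/0. Names and structure are illustrative, not the library's internals.

```elixir
defmodule InfidelitySketch do
  # Illustrative implementation of the algorithm above.
  def compute(instance, attributions, predict_fn, num_perturbations \\ 100, std \\ 0.1) do
    errors =
      for _ <- 1..num_perturbations do
        # 1. Perturb each feature with Gaussian noise
        perturbed = Enum.map(instance, fn x -> x + std * :rand.normal() end)

        # 2a. Actual model change: Δf = f(x) - f(x̃)
        actual = predict_fn.(instance) - predict_fn.(perturbed)

        # 2b. Predicted change via attributions: Δφ = φᵀ(x - x̃)
        predicted =
          instance
          |> Enum.zip(perturbed)
          |> Enum.with_index()
          |> Enum.map(fn {{x, p}, i} -> Map.get(attributions, i, 0.0) * (x - p) end)
          |> Enum.sum()

        # 2c. Squared error
        :math.pow(actual - predicted, 2)
      end

    # 3. Mean squared error across all perturbations
    Enum.sum(errors) / num_perturbations
  end
end

# A linear model with exact attributions yields ~0 infidelity:
score =
  InfidelitySketch.compute([5.0, 10.0], %{0 => 2.0, 1 => 3.0}, fn [x, y] ->
    2.0 * x + 3.0 * y
  end)
```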

Parameters

  • instance - Original instance (list of feature values)
  • attributions - Attribution map (feature_index => importance)
  • predict_fn - Model prediction function
  • opts - Options:
    • :num_perturbations - Number of perturbations (default: 100)
    • :perturbation_std - Std dev for Gaussian noise (default: 0.1)
    • :perturbation_method - Perturbation distribution, :gaussian or :uniform (default: :gaussian)
    • :normalize - Normalize by prediction variance (default: false)

Returns

Map with:

  • :infidelity_score - Mean squared error (lower is better, 0 = perfect)
  • :std_dev - Standard deviation across perturbations
  • :individual_errors - Error for each perturbation
  • :normalized_score - Normalized by variance (if normalize: true)
  • :interpretation - Quality assessment string

Examples

# Perfect attribution (zero infidelity)
attributions = %{0 => 2.0, 1 => 3.0}
predict_fn = fn [x, y] -> 2.0 * x + 3.0 * y end
instance = [5.0, 10.0]

result = Infidelity.compute(instance, attributions, predict_fn)

# => %{infidelity_score: ~0.0, interpretation: "Excellent", ...}

sensitivity_to_perturbation(instance, attributions, predict_fn, opts \\ [])

@spec sensitivity_to_perturbation(list(), map(), (any() -> any()), keyword()) :: map()

Sensitivity analysis across perturbation magnitudes.

Tests how infidelity changes with perturbation size to ensure the metric is robust to the perturbation magnitude choice.

Parameters

  • instance - Instance to test
  • attributions - Attribution map
  • predict_fn - Model prediction function
  • opts - Options:
    • :std_range - List of std devs to test (default: [0.05, 0.1, 0.2, 0.5])
    • :num_perturbations - Perturbations per std dev (default: 50)

Returns

Map with:

  • :infidelity_by_std - Map of std_dev => infidelity_score
  • :is_stable - Whether infidelity is stable across magnitudes
  • :coefficient_of_variation - Measure of stability

Examples

result = Infidelity.sensitivity_to_perturbation(
  instance,
  attributions,
  predict_fn
)
# => %{infidelity_by_std: %{0.05 => 0.03, 0.1 => 0.04, ...}, ...}
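The :coefficient_of_variation and :is_stable fields can be understood with this small sketch: the coefficient of variation is the standard deviation of the per-magnitude scores relative to their mean. The 0.5 stability threshold is an assumption for illustration, not necessarily the library's.

```elixir
# Illustrative per-std infidelity scores; real values come from the sensitivity run.
scores = [0.03, 0.04, 0.05, 0.06]

mean = Enum.sum(scores) / length(scores)
variance = Enum.sum(Enum.map(scores, fn s -> :math.pow(s - mean, 2) end)) / length(scores)
cv = :math.sqrt(variance) / mean

# Low relative spread => infidelity is robust to the perturbation magnitude
is_stable = cv < 0.5
```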