CrucibleXAI.Validation.Infidelity (CrucibleXAI v0.4.0)
Infidelity metric for explanation quality assessment.
Measures the expected squared error between actual model output changes and the changes predicted by the explanation under perturbations. Lower scores indicate more faithful explanations (0 = perfect fidelity).
Mathematical Definition
Infidelity = E[(f(x) - f(x̃) - φᵀ(x - x̃))²]

Where:

- x = original instance
- x̃ = perturbed instance
- f = model prediction function
- φ = attribution vector (feature importances)
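The expectation above can be estimated directly as a mean over sampled perturbations. A minimal sketch of that estimator (not the library's implementation; `InfidelitySketch` is an illustrative name):

```elixir
defmodule InfidelitySketch do
  # Mean of (Δf - Δφ)² over the given perturbations,
  # where Δφ = Σᵢ φᵢ * (xᵢ - x̃ᵢ).
  def score(x, x_tildes, phi, predict_fn) do
    fx = predict_fn.(x)

    errors =
      for x_tilde <- x_tildes do
        delta_f = fx - predict_fn.(x_tilde)

        delta_phi =
          x
          |> Enum.zip(x_tilde)
          |> Enum.with_index()
          |> Enum.map(fn {{xi, xti}, i} -> Map.get(phi, i, 0.0) * (xi - xti) end)
          |> Enum.sum()

        d = delta_f - delta_phi
        d * d
      end

    Enum.sum(errors) / length(errors)
  end
end
```

For a perfectly faithful attribution of a linear model, every per-perturbation error is zero, so the mean is zero.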
Interpretation
- 0.00 - 0.02: Excellent fidelity
- 0.02 - 0.05: Good fidelity
- 0.05 - 0.10: Acceptable fidelity
- 0.10 - 0.20: Poor fidelity
- > 0.20: Very poor fidelity
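The thresholds above map directly to the `:interpretation` string returned by `compute/4`. A hypothetical helper mirroring that mapping (illustrative only, not the library's internals):

```elixir
# Map an infidelity score to the quality bands documented above.
interpret = fn score ->
  cond do
    score < 0.02 -> "Excellent"
    score < 0.05 -> "Good"
    score < 0.10 -> "Acceptable"
    score < 0.20 -> "Poor"
    true -> "Very poor"
  end
end
```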
Usage
attributions = explanation.feature_weights
result = Infidelity.compute(
  instance,
  attributions,
  predict_fn,
  num_perturbations: 100
)
IO.puts("Infidelity: #{result.infidelity_score}")
# => 0.03 (Good)

References
Based on:
- Yeh et al. (2019) "On the (In)fidelity and Sensitivity of Explanations", NeurIPS
Summary
Functions
Compare infidelity across multiple explanation methods.
Compute infidelity score.
Sensitivity analysis across perturbation magnitudes.
Types
Functions
Compare infidelity across multiple explanation methods.
Useful for selecting the most faithful explanation method for a given model and instance.
Parameters
- `instance` - Instance to test
- `explanations` - List of explanation structs or attribution maps
- `predict_fn` - Model prediction function
- `opts` - Options passed to `compute/4`
Returns
Map with:
- `:by_method` - Map of method_name => infidelity_result
- `:best_method` - Method with lowest infidelity
- `:worst_method` - Method with highest infidelity
- `:ranking` - List of {method, score} sorted by quality
Examples
lime_attrs = %{0 => 2.1, 1 => 2.9}
shap_attrs = %{0 => 2.0, 1 => 3.0}
result = Infidelity.compare_methods(
  instance,
  [
    {:lime, lime_attrs},
    {:shap, shap_attrs}
  ],
  predict_fn
)
# => %{best_method: :shap, ...}
Compute infidelity score.
Algorithm
1. Generate N perturbations of the instance
2. For each perturbation x̃:
   a. Compute actual model change: Δf = f(x) - f(x̃)
   b. Compute predicted change via attributions: Δφ = φᵀ(x - x̃)
   c. Compute squared error: (Δf - Δφ)²
3. Return mean squared error across all perturbations
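Step 1 with the default `:gaussian` method amounts to adding zero-mean Gaussian noise with `:perturbation_std` to each feature. One way to sketch that sampling (using a Box-Muller transform; an assumption for illustration, not necessarily how the library draws noise):

```elixir
# Draw one N(0, std²) sample via Box-Muller.
gaussian = fn std ->
  u1 = 1.0 - :rand.uniform()  # shift into (0, 1] so log is finite
  u2 = :rand.uniform()
  std * :math.sqrt(-2.0 * :math.log(u1)) * :math.cos(2.0 * :math.pi() * u2)
end

# Perturb every feature of an instance independently.
perturb = fn instance, std ->
  Enum.map(instance, fn xi -> xi + gaussian.(std) end)
end

perturbations = for _ <- 1..100, do: perturb.([5.0, 10.0], 0.1)
```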
Parameters
- `instance` - Original instance (list of feature values)
- `attributions` - Attribution map (feature_index => importance)
- `predict_fn` - Model prediction function
- `opts` - Options:
  - `:num_perturbations` - Number of perturbations (default: 100)
  - `:perturbation_std` - Std dev for Gaussian noise (default: 0.1)
  - `:perturbation_method` - `:gaussian` or `:uniform` (default: `:gaussian`)
  - `:normalize` - Normalize by prediction variance (default: false)
Returns
Map with:
- `:infidelity_score` - Mean squared error (lower is better, 0 = perfect)
- `:std_dev` - Standard deviation across perturbations
- `:individual_errors` - Error for each perturbation
- `:normalized_score` - Normalized by variance (if `normalize: true`)
- `:interpretation` - Quality assessment string
Examples
# Perfect attribution (zero infidelity)
attributions = %{0 => 2.0, 1 => 3.0}
predict_fn = fn [x, y] -> 2.0 * x + 3.0 * y end
instance = [5.0, 10.0]
result = Infidelity.compute(instance, attributions, predict_fn)
# => %{infidelity_score: ~0.0, interpretation: "Excellent", ...}
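The zero score follows because for a linear model, φᵀ(x - x̃) equals f(x) - f(x̃) exactly for every perturbation, so each squared error is zero. A quick standalone check of that identity (assumes nothing about the library):

```elixir
predict_fn = fn [x, y] -> 2.0 * x + 3.0 * y end
phi = %{0 => 2.0, 1 => 3.0}
x = [5.0, 10.0]
x_tilde = [4.7, 10.4]

# Actual model change vs. attribution-predicted change.
delta_f = predict_fn.(x) - predict_fn.(x_tilde)
delta_phi =
  phi[0] * (Enum.at(x, 0) - Enum.at(x_tilde, 0)) +
    phi[1] * (Enum.at(x, 1) - Enum.at(x_tilde, 1))

abs(delta_f - delta_phi)  # ≈ 0.0, up to floating-point rounding
```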
Sensitivity analysis across perturbation magnitudes.
Tests how infidelity changes with perturbation size to ensure the metric is robust to the perturbation magnitude choice.
Parameters
- `instance` - Instance to test
- `attributions` - Attribution map
- `predict_fn` - Model prediction function
- `opts` - Options:
  - `:std_range` - List of std devs to test (default: [0.05, 0.1, 0.2, 0.5])
  - `:num_perturbations` - Perturbations per std dev (default: 50)
Returns
Map with:
- `:infidelity_by_std` - Map of std_dev => infidelity_score
- `:is_stable` - Whether infidelity is stable across magnitudes
- `:coefficient_of_variation` - Measure of stability
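The coefficient of variation is the standard deviation of the per-magnitude scores divided by their mean. One way it could be derived from `:infidelity_by_std` (an illustrative sketch; the library's exact computation may differ):

```elixir
# Stability of infidelity across perturbation magnitudes:
# cv = std(scores) / mean(scores); small cv means stable.
scores = Map.values(%{0.05 => 0.03, 0.1 => 0.04, 0.2 => 0.05})
n = length(scores)
mean = Enum.sum(scores) / n
variance = Enum.sum(Enum.map(scores, fn s -> (s - mean) * (s - mean) end)) / n
cv = :math.sqrt(variance) / mean
# => ~0.204 for these scores
```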
Examples
result = Infidelity.sensitivity_to_perturbation(
  instance,
  attributions,
  predict_fn
)
# => %{infidelity_by_std: %{0.05 => 0.03, 0.1 => 0.04, ...}, ...}