CrucibleXAI.Validation.Faithfulness (CrucibleXAI v0.4.0)
Faithfulness metrics for explanation validation.
Measures how well explanations reflect actual model behavior by testing whether removing important features causes proportional prediction changes.
Overview
Faithfulness validation uses the feature removal test: if an explanation claims feature X is important, removing X should significantly change the prediction. The correlation between attribution magnitude and prediction change quantifies faithfulness.
Metrics
- Faithfulness Score: Spearman/Pearson correlation between feature importance ranking and prediction change magnitude (range: -1 to 1, higher is better)
- Monotonicity: Whether prediction changes increase monotonically as more features are removed (boolean)
- Violation Severity: Average magnitude of monotonicity violations
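The removal test behind these metrics can be sketched directly. The following is an illustrative, self-contained sketch (not CrucibleXAI's internal implementation); the linear predict_fn, the 0.0 baseline, and the use of cumulative drops from the original prediction are assumptions chosen to make the true importances obvious:

```elixir
# Illustrative sketch of the feature removal test (not library internals).
# A linear model whose true importances are known: weight * value per feature.
predict_fn = fn [x, y, z] -> 2.0 * x + 3.0 * y + 0.5 * z end
instance = [5.0, 10.0, 4.0]
attributions = %{0 => 2.0, 1 => 3.0, 2 => 0.5}

original = predict_fn.(instance)

# Rank features by absolute attribution, most important first.
order =
  attributions
  |> Enum.sort_by(fn {_idx, weight} -> -abs(weight) end)
  |> Enum.map(fn {idx, _weight} -> idx end)

# Remove features cumulatively (set to the 0.0 baseline) and record the
# absolute change from the original prediction at each step.
{drops, _final} =
  Enum.map_reduce(order, instance, fn idx, masked ->
    masked = List.replace_at(masked, idx, 0.0)
    {abs(predict_fn.(masked) - original), masked}
  end)

{order, drops}
# => {[1, 0, 2], [30.0, 40.0, 42.0]}
```

For a faithful explanation the drops grow as more important features are removed, which is exactly what happens here: removing `y` (attribution 3.0) first costs 30.0 of the original prediction of 42.0.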
Usage
# Validate LIME explanation
explanation = CrucibleXAI.explain(instance, predict_fn, num_samples: 2000)
result = Faithfulness.measure_faithfulness(instance, explanation, predict_fn)
IO.puts("Faithfulness: #{result.faithfulness_score}")
# => 0.87 (Good)
References
Based on:
- Hooker et al. (2019) "A Benchmark for Interpretability Methods"
- Yeh et al. (2019) "On the (In)fidelity and Sensitivity of Explanations"
Summary
Functions
Generate comprehensive faithfulness report.
Measure faithfulness via feature removal.
Test monotonicity property.
Types
@type monotonicity_result() :: %{
  is_monotonic: boolean(),
  violations: non_neg_integer(),
  violation_indices: [integer()],
  severity: float()
}
Functions
@spec full_report(list(), CrucibleXAI.Explanation.t(), (any() -> any()), keyword()) :: full_report()
Generate comprehensive faithfulness report.
Combines feature removal test and monotonicity analysis into a single detailed report with human-readable summary.
Parameters
- instance - Instance to test
- explanation - Explanation struct
- predict_fn - Model prediction function
- opts - Same options as measure_faithfulness/4
Returns
Map combining faithfulness metrics, monotonicity details, and summary text.
Examples
report = Faithfulness.full_report(instance, explanation, predict_fn)
IO.puts(report.summary)
@spec measure_faithfulness(list(), CrucibleXAI.Explanation.t(), (any() -> any()), keyword()) :: faithfulness_result()
Measure faithfulness via feature removal.
Algorithm
- Sort features by absolute attribution (descending)
- Remove features incrementally (most important first)
- Measure prediction change at each step
- Compute correlation between attribution rank and prediction change
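The correlation in the final step can be computed as Spearman's rank correlation, i.e. Pearson correlation over rank vectors. A minimal self-contained sketch (tie handling is omitted, and this is not the library's implementation):

```elixir
# Spearman correlation as Pearson over ranks (ties not handled; sketch only).
rank = fn list ->
  list
  |> Enum.with_index()
  |> Enum.sort_by(fn {value, _idx} -> value end)
  |> Enum.with_index(1)
  |> Enum.sort_by(fn {{_value, idx}, _r} -> idx end)
  |> Enum.map(fn {{_value, _idx}, r} -> r * 1.0 end)
end

pearson = fn xs, ys ->
  n = length(xs)
  mx = Enum.sum(xs) / n
  my = Enum.sum(ys) / n
  cov = Enum.zip(xs, ys) |> Enum.map(fn {x, y} -> (x - mx) * (y - my) end) |> Enum.sum()
  sx = :math.sqrt(Enum.map(xs, fn x -> (x - mx) * (x - mx) end) |> Enum.sum())
  sy = :math.sqrt(Enum.map(ys, fn y -> (y - my) * (y - my) end) |> Enum.sum())
  cov / (sx * sy)
end

spearman = fn xs, ys -> pearson.(rank.(xs), rank.(ys)) end

# Attribution magnitudes vs. the prediction change each removal produced:
# monotone agreement yields a perfect score of 1.0.
score = spearman.([3.0, 2.0, 0.5], [30.0, 10.0, 2.0])
# => 1.0
```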
Parameters
- instance - Instance to test (list of feature values)
- explanation - Explanation struct to validate
- predict_fn - Model prediction function
- opts - Options:
  - :baseline_value - Value for removed features (default: 0.0)
  - :num_steps - Number of removal steps (default: all features)
  - :correlation_method - :pearson or :spearman (default: :spearman)
Returns
Map with:
- :faithfulness_score - Correlation coefficient (-1 to 1, higher is better)
- :prediction_drops - Prediction change at each removal step
- :feature_order - Order features were removed in (by importance)
- :monotonicity - Whether drops are monotonic (boolean)
- :interpretation - Human-readable assessment
Examples
# Perfect faithfulness for linear model
predict_fn = fn [x, y] -> 2.0 * x + 3.0 * y end
instance = [5.0, 10.0]
explanation = %Explanation{
  instance: instance,
  feature_weights: %{0 => 2.0, 1 => 3.0},
  method: :lime
}
result = Faithfulness.measure_faithfulness(instance, explanation, predict_fn)
# => %{faithfulness_score: 1.0, monotonicity: true, ...}
@spec monotonicity_test(list(), CrucibleXAI.Explanation.t(), (any() -> any()), keyword()) :: monotonicity_result()
Test monotonicity property.
Removing features in order of importance should cause monotonically increasing prediction changes (for regression) or decreasing confidence (for classification).
Parameters
- instance - Instance to test
- explanation - Explanation struct
- predict_fn - Model prediction function
- opts - Same options as measure_faithfulness/4
Returns
Map with:
- :is_monotonic - Whether drops are monotonic (boolean)
- :violations - Number of monotonicity violations
- :violation_indices - Step indices where violations occurred
- :severity - Average violation magnitude
Examples
result = Faithfulness.monotonicity_test(instance, explanation, predict_fn)
# => %{is_monotonic: true, violations: 0, ...}
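The fields above can be derived from the sequence of cumulative prediction drops. A minimal sketch with an artificial violating sequence; the exact step-indexing and severity conventions inside CrucibleXAI may differ:

```elixir
# Sketch: derive monotonicity stats from cumulative prediction drops.
# The drop at step 3 falls below step 2 -- an artificial violation.
drops = [30.0, 40.0, 38.0, 42.0]

# Pair each consecutive (prev, next) step; a violation is next < prev.
violations =
  drops
  |> Enum.chunk_every(2, 1, :discard)
  |> Enum.with_index(1)
  |> Enum.filter(fn {[prev, next], _step} -> next < prev end)

# Average magnitude of the violations (0.0 when there are none).
severity =
  case violations do
    [] -> 0.0
    vs -> vs |> Enum.map(fn {[prev, next], _step} -> prev - next end) |> Enum.sum() |> Kernel./(length(vs))
  end

result = %{
  is_monotonic: violations == [],
  violations: length(violations),
  violation_indices: Enum.map(violations, fn {_pair, step} -> step end),
  severity: severity
}
# => %{is_monotonic: false, violations: 1, violation_indices: [2], severity: 2.0}
```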