CrucibleXAI.Validation.Faithfulness (CrucibleXAI v0.4.0)


Faithfulness metrics for explanation validation.

Measures how well explanations reflect actual model behavior by testing whether removing important features causes proportional prediction changes.

Overview

Faithfulness validation uses the feature removal test: if an explanation claims feature X is important, removing X should significantly change the prediction. The correlation between attribution magnitude and prediction change quantifies faithfulness.
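Concretely, the test replaces a feature with a baseline value and compares predictions. A minimal sketch of one removal step (the linear `predict_fn` and the baseline of 0.0 are illustrative, not part of the library):

```elixir
# Hypothetical linear model: prediction = 2x + 3y
predict_fn = fn [x, y] -> 2.0 * x + 3.0 * y end

instance = [5.0, 10.0]
original = predict_fn.(instance)               # 40.0

# "Remove" feature 1 (weight 3.0) by replacing it with a baseline of 0.0
perturbed = List.replace_at(instance, 1, 0.0)
drop = abs(original - predict_fn.(perturbed))  # 30.0 — a large drop, so feature 1 matters
```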

Metrics

  • Faithfulness Score: Spearman/Pearson correlation between feature importance ranking and prediction change magnitude (range: -1 to 1, higher is better)
  • Monotonicity: Whether prediction changes increase monotonically as more features are removed (boolean)
  • Violation Severity: Average magnitude of monotonicity violations
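The faithfulness score can be sketched as a rank correlation between attribution magnitudes and observed prediction drops. A hand-rolled Spearman on tie-free data (the library's internal implementation may differ):

```elixir
# Spearman correlation = Pearson correlation of the ranks (assumes no ties)
rank = fn xs ->
  xs
  |> Enum.with_index()
  |> Enum.sort_by(fn {x, _i} -> x end)
  |> Enum.with_index(1)
  |> Enum.sort_by(fn {{_x, i}, _r} -> i end)
  |> Enum.map(fn {{_x, _i}, r} -> r end)
end

pearson = fn xs, ys ->
  n = length(xs)
  mx = Enum.sum(xs) / n
  my = Enum.sum(ys) / n
  cov = Enum.zip(xs, ys) |> Enum.map(fn {x, y} -> (x - mx) * (y - my) end) |> Enum.sum()
  sx = :math.sqrt(Enum.map(xs, fn x -> (x - mx) ** 2 end) |> Enum.sum())
  sy = :math.sqrt(Enum.map(ys, fn y -> (y - my) ** 2 end) |> Enum.sum())
  cov / (sx * sy)
end

attributions = [3.0, 2.0, 0.5]
drops        = [30.0, 10.0, 1.0]
pearson.(rank.(attributions), rank.(drops))    # 1.0 — the rankings agree perfectly
```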

Usage

# Validate LIME explanation
explanation = CrucibleXAI.explain(instance, predict_fn, num_samples: 2000)

result = Faithfulness.measure_faithfulness(
  instance,
  explanation,
  predict_fn
)

IO.puts("Faithfulness: #{result.faithfulness_score}")
# => 0.87 (Good)

References

Based on:

  • Hooker et al. (2019) "A Benchmark for Interpretability Methods"
  • Yeh et al. (2019) "On the (In)fidelity and Sensitivity of Explanations"

Summary

Types

faithfulness_result()

@type faithfulness_result() :: %{
  faithfulness_score: float(),
  prediction_drops: [number()],
  feature_order: [integer()],
  monotonicity: boolean(),
  interpretation: String.t()
}

full_report()

@type full_report() :: %{
  faithfulness_score: float(),
  prediction_drops: [number()],
  feature_order: [integer()],
  monotonicity: boolean(),
  interpretation: String.t(),
  monotonicity_details: monotonicity_result(),
  summary: String.t()
}

monotonicity_result()

@type monotonicity_result() :: %{
  is_monotonic: boolean(),
  violations: non_neg_integer(),
  violation_indices: [integer()],
  severity: float()
}

Functions

full_report(instance, explanation, predict_fn, opts \\ [])

@spec full_report(list(), CrucibleXAI.Explanation.t(), (any() -> any()), keyword()) ::
  full_report()

Generate comprehensive faithfulness report.

Combines feature removal test and monotonicity analysis into a single detailed report with human-readable summary.

Parameters

  • instance - Instance to test
  • explanation - Explanation struct
  • predict_fn - Model prediction function
  • opts - Same options as measure_faithfulness/4

Returns

Map combining faithfulness metrics, monotonicity details, and summary text.

Examples

report = Faithfulness.full_report(instance, explanation, predict_fn)
IO.puts(report.summary)

measure_faithfulness(instance, explanation, predict_fn, opts \\ [])

@spec measure_faithfulness(
  list(),
  CrucibleXAI.Explanation.t(),
  (any() -> any()),
  keyword()
) ::
  faithfulness_result()

Measure faithfulness via feature removal.

Algorithm

  1. Sort features by absolute attribution (descending)
  2. Remove features incrementally (most important first)
  3. Measure prediction change at each step
  4. Compute correlation between attribution rank and prediction change
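The steps above can be sketched as a reduction over the sorted feature indices (assuming a list-based instance, a map of index => attribution, and the default baseline of 0.0; all names here are illustrative):

```elixir
weights = %{0 => 2.0, 1 => 3.0, 2 => 0.5}
instance = [5.0, 10.0, 4.0]
predict_fn = fn [x, y, z] -> 2.0 * x + 3.0 * y + 0.5 * z end
original = predict_fn.(instance)

# 1. Sort feature indices by absolute attribution, descending
order = weights |> Enum.sort_by(fn {_i, w} -> -abs(w) end) |> Enum.map(&elem(&1, 0))

# 2–3. Remove features cumulatively, recording the prediction change at each step
{_, drops} =
  Enum.reduce(order, {instance, []}, fn i, {inst, acc} ->
    inst = List.replace_at(inst, i, 0.0)  # :baseline_value
    {inst, acc ++ [abs(original - predict_fn.(inst))]}
  end)

drops
# => [30.0, 40.0, 42.0] — step 4 correlates these drops with the attribution ranks
```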

Parameters

  • instance - Instance to test (list of feature values)
  • explanation - Explanation struct to validate
  • predict_fn - Model prediction function
  • opts - Options:
    • :baseline_value - Value for removed features (default: 0.0)
    • :num_steps - Number of removal steps (default: all features)
    • :correlation_method - :pearson or :spearman (default: :spearman)

Returns

Map with:

  • :faithfulness_score - Correlation coefficient (-1 to 1, higher better)
  • :prediction_drops - Prediction change at each removal step
  • :feature_order - Order features were removed (by importance)
  • :monotonicity - Whether drops are monotonic (boolean)
  • :interpretation - Human-readable assessment

Examples

# Perfect faithfulness for linear model
predict_fn = fn [x, y] -> 2.0 * x + 3.0 * y end
instance = [5.0, 10.0]
explanation = %Explanation{
  instance: instance,
  feature_weights: %{0 => 2.0, 1 => 3.0},
  method: :lime
}

result = Faithfulness.measure_faithfulness(instance, explanation, predict_fn)
# => %{faithfulness_score: 1.0, monotonicity: true, ...}

monotonicity_test(instance, explanation, predict_fn, opts \\ [])

@spec monotonicity_test(
  list(),
  CrucibleXAI.Explanation.t(),
  (any() -> any()),
  keyword()
) ::
  monotonicity_result()

Test monotonicity property.

Removing features in order of importance should cause monotonically increasing prediction changes (for regression) or decreasing confidence (for classification).
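A violation can be counted wherever a cumulative drop decreases relative to the previous step. A minimal sketch of that check (the library may aggregate severity differently):

```elixir
# Cumulative prediction drops from a feature-removal sweep (illustrative values)
drops = [30.0, 25.0, 40.0, 42.0]

violations =
  drops
  |> Enum.chunk_every(2, 1, :discard)
  |> Enum.with_index(1)
  |> Enum.filter(fn {[prev, next], _i} -> next < prev end)

is_monotonic = violations == []                                  # false
severities = Enum.map(violations, fn {[prev, next], _i} -> prev - next end)
# one violation at step 2 (30.0 falls to 25.0), severity 5.0
```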

Parameters

  • instance - Instance to test
  • explanation - Explanation struct
  • predict_fn - Model prediction function
  • opts - Same options as measure_faithfulness/4

Returns

Map with:

  • :is_monotonic - Whether drops are monotonic (boolean)
  • :violations - Number of monotonicity violations
  • :violation_indices - Step indices where violations occurred
  • :severity - Average violation magnitude

Examples

result = Faithfulness.monotonicity_test(instance, explanation, predict_fn)
# => %{is_monotonic: true, violations: 0, ...}