CrucibleXAI.Validation (CrucibleXAI v0.4.0)
Main API for explanation validation and quality metrics.
Provides comprehensive validation tools to measure faithfulness, infidelity, sensitivity, and axiom compliance of explanations.
Overview
This module implements state-of-the-art validation metrics for XAI, enabling you to:
- Measure Faithfulness: Do explanations reflect actual model behavior?
- Quantify Infidelity: How accurate are the explanations?
- Test Sensitivity: Are explanations robust to perturbations?
- Verify Axioms: Do explanations satisfy theoretical properties?
Quick Start
# Generate explanation
explanation = CrucibleXai.explain(instance, predict_fn, num_samples: 2000)
# Comprehensive validation
validation = CrucibleXAI.Validation.comprehensive_validation(
  explanation,
  instance,
  predict_fn
)
IO.puts(validation.summary)
# Quick quality check
quick = CrucibleXAI.Validation.quick_validation(
  explanation,
  instance,
  predict_fn
)
if quick.passes_quality_gate do
  IO.puts("Explanation is reliable!")
end

Validation Metrics
Faithfulness Score
Correlation between feature importance and prediction change when features are removed. Range: -1 to 1 (higher is better).
- >0.9: Excellent
- 0.7-0.9: Good
- 0.5-0.7: Fair
- <0.5: Poor
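To make the metric concrete, here is a minimal sketch of the underlying idea (not the library's implementation): compute the Pearson correlation between the attributed importances and the prediction drops observed when each feature is ablated. The module name and inputs are illustrative.

```elixir
# Hypothetical sketch: faithfulness as the Pearson correlation between
# feature importances and the per-feature prediction drops under ablation.
defmodule FaithfulnessSketch do
  def score(importances, prediction_drops) do
    pearson(importances, prediction_drops)
  end

  defp pearson(xs, ys) do
    n = length(xs)
    mx = Enum.sum(xs) / n
    my = Enum.sum(ys) / n

    # Covariance and standard deviations (unnormalized; factors cancel).
    cov = Enum.zip(xs, ys) |> Enum.map(fn {x, y} -> (x - mx) * (y - my) end) |> Enum.sum()
    sx = :math.sqrt(Enum.map(xs, fn x -> (x - mx) * (x - mx) end) |> Enum.sum())
    sy = :math.sqrt(Enum.map(ys, fn y -> (y - my) * (y - my) end) |> Enum.sum())

    cov / (sx * sy)
  end
end

# Importances that exactly track the prediction drops correlate perfectly:
FaithfulnessSketch.score([0.5, 0.3, 0.2], [0.5, 0.3, 0.2])
# ≈ 1.0
```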
Infidelity Score
Mean squared error between actual model changes and explanation-predicted changes under perturbations. Range: 0 to ∞ (lower is better).
- <0.02: Excellent
- 0.02-0.05: Good
- 0.05-0.10: Acceptable
- >0.10: Poor
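The definition above can be sketched directly: for each perturbation, compare the change the explanation predicts (the dot product of the perturbation with the feature weights) against the change the model actually exhibits, and average the squared error. This is an illustrative sketch, not the library's code; names and signatures are assumptions.

```elixir
# Hypothetical sketch: infidelity as the MSE between explanation-predicted
# and actual prediction changes under input perturbations.
defmodule InfidelitySketch do
  def score(instance, weights, predict_fn, perturbations) do
    base = predict_fn.(instance)

    errors =
      Enum.map(perturbations, fn delta ->
        # Subtract the perturbation from the input, as in Yeh et al. (2019).
        perturbed = Enum.zip_with(instance, delta, &(&1 - &2))
        predicted_change = Enum.zip_with(delta, weights, &(&1 * &2)) |> Enum.sum()
        actual_change = base - predict_fn.(perturbed)
        (predicted_change - actual_change) ** 2
      end)

    Enum.sum(errors) / length(errors)
  end
end

# A linear model explained by its own coefficients has (near-)zero infidelity:
predict_fn = fn [a, b] -> 2.0 * a + 3.0 * b end
InfidelitySketch.score([1.0, 1.0], [2.0, 3.0], predict_fn, [[0.1, 0.0], [0.0, 0.1]])
# ≈ 0.0 (up to floating-point error)
```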
Stability Score
Robustness to input perturbations. Range: 0 to 1 (higher is better).
- >0.95: Excellent
- 0.85-0.95: Good
- 0.70-0.85: Acceptable
- <0.70: Poor
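One common way to quantify this kind of robustness (a sketch of the idea, not necessarily the library's formula) is the cosine similarity between the explanation weights for an input and the weights obtained for a slightly perturbed copy of that input:

```elixir
# Hypothetical sketch: stability as cosine similarity between the weights
# of the original explanation and those of a perturbed-input explanation.
defmodule StabilitySketch do
  def score(weights, perturbed_weights) do
    dot = Enum.zip_with(weights, perturbed_weights, &(&1 * &2)) |> Enum.sum()
    norm = fn v -> :math.sqrt(Enum.map(v, &(&1 * &1)) |> Enum.sum()) end
    dot / (norm.(weights) * norm.(perturbed_weights))
  end
end

# Identical explanations score 1.0; diverging explanations score lower.
StabilitySketch.score([0.5, 0.3], [0.5, 0.3])
# ≈ 1.0
```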
Usage Examples
Example 1: Basic Validation
explanation = CrucibleXai.explain(instance, predict_fn)
faithfulness = CrucibleXAI.Validation.Faithfulness.measure_faithfulness(
  instance,
  explanation,
  predict_fn
)

IO.puts("Faithfulness: #{faithfulness.faithfulness_score}")

Example 2: Compare Methods
lime_exp = CrucibleXai.explain(instance, predict_fn)
shap_vals = CrucibleXai.explain_shap(instance, background, predict_fn)
result = CrucibleXAI.Validation.Infidelity.compare_methods(
  instance,
  [{:lime, lime_exp.feature_weights}, {:shap, shap_vals}],
  predict_fn
)

IO.puts("Best method: #{result.best_method}")

Example 3: Production Monitoring
defmodule MyApp.XAIMonitor do
  def validate_and_serve(instance, _prediction) do
    explanation = generate_explanation(instance)

    quick_validation = CrucibleXAI.Validation.quick_validation(
      explanation,
      instance,
      &MyModel.predict/1
    )

    if not quick_validation.passes_quality_gate do
      Logger.warning("Low quality explanation detected")
      Metrics.increment("xai.quality_gate_failed")
    end

    explanation
  end
end

References
Based on academic research:
- Yeh et al. (2019) "On the (In)fidelity and Sensitivity of Explanations"
- Hooker et al. (2019) "A Benchmark for Interpretability Methods"
- Sundararajan et al. (2017) "Axiomatic Attribution for Deep Networks"
Summary
Functions
Benchmark multiple explanation methods.
Comprehensive validation of an explanation.
Quick validation with essential metrics only.
Types
@type comprehensive_report() :: %{
  faithfulness: CrucibleXAI.Validation.Faithfulness.faithfulness_result(),
  infidelity: CrucibleXAI.Validation.Infidelity.result(),
  sensitivity: sensitivity_result(),
  axioms: CrucibleXAI.Validation.Axioms.axioms_result(),
  quality_score: float(),
  summary: String.t()
}
@type sensitivity_result() :: CrucibleXAI.Validation.Sensitivity.input_result() | %{skipped: true, reason: String.t()}
Functions
Benchmark multiple explanation methods.
Compares validation metrics across different explanation methods to help select the best method for your use case.
Parameters
- instance - Instance to explain
- predict_fn - Model prediction function
- methods - List of method configurations:
  - {:lime, opts} - LIME with options
  - {:shap, background, opts} - SHAP with background dataset
  - {:gradient, model_fn, opts} - Gradient methods
- opts - Global validation options
Returns
Map with:
- :by_method - Validation results for each method
- :ranking - Methods ranked by quality score
- :best_method - Method with highest quality
- :comparison_summary - Summary table
Examples
result = CrucibleXAI.Validation.benchmark_methods(
  instance,
  predict_fn,
  [
    {:lime, num_samples: 2000},
    {:shap, background_data, num_samples: 1000}
  ]
)
IO.puts(result.comparison_summary)
# Method | Faithfulness | Infidelity | Quality | Time
# --------|--------------|------------|---------|------
# LIME | 0.85 | 0.04 | 0.82 | 45ms
# SHAP | 0.91 | 0.02 | 0.89 | 950ms
@spec comprehensive_validation( CrucibleXAI.Explanation.t(), list(), (any() -> any()), keyword() ) :: comprehensive_report()
Comprehensive validation of an explanation.
Runs all validation metrics and returns a complete quality report. This is the most thorough validation but takes longer to compute.
Parameters
- explanation - Explanation struct to validate
- instance - Instance that was explained
- predict_fn - Model prediction function
- opts - Options:
  - :include_sensitivity - Run sensitivity analysis (default: false, adds ~2s)
  - :baseline - Baseline for axiom tests
  - :method - Explanation method (:lime, :shap, etc.)
  - :num_perturbations - Perturbations for infidelity (default: 100)
Returns
Map with:
- :faithfulness - Faithfulness test results
- :infidelity - Infidelity measurement results
- :sensitivity - Sensitivity analysis (if enabled)
- :axioms - Axiom verification results
- :quality_score - Overall quality score (0-1)
- :summary - Human-readable summary
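As an illustration only, a single 0-1 quality score could be folded together from the individual metrics roughly like this; the weighting and normalization the library actually uses are not documented here and may differ.

```elixir
# Hypothetical sketch: combining the individual metrics into one 0-1 score.
# Assumptions: faithfulness lies in [-1, 1], infidelity >= 0 with 0.1 as the
# "poor" cutoff, stability lies in [0, 1]; all three are weighted equally.
defmodule QualityScoreSketch do
  def combine(faithfulness, infidelity, stability) do
    # Map faithfulness from [-1, 1] to [0, 1].
    f = (faithfulness + 1.0) / 2.0
    # Invert infidelity against the 0.1 threshold and clamp to [0, 1].
    i = max(0.0, 1.0 - infidelity / 0.1) |> min(1.0)
    (f + i + stability) / 3.0
  end
end

QualityScoreSketch.combine(1.0, 0.0, 1.0)
# => 1.0
```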
Examples
validation = CrucibleXAI.Validation.comprehensive_validation(
  explanation,
  instance,
  predict_fn,
  include_sensitivity: true,
  baseline: background_data
)
IO.puts(validation.summary)
# => Overall Quality Score: 0.87 / 1.0
# Faithfulness: 0.92 (Excellent)
# Infidelity: 0.03 (Good)
# ...
@spec quick_validation( CrucibleXAI.Explanation.t(), list(), (any() -> any()), keyword() ) :: quick_report()
Quick validation with essential metrics only.
Runs faithfulness and infidelity tests (fast metrics) for quick quality checks in production environments.
Parameters
- explanation - Explanation struct to validate
- instance - Instance that was explained
- predict_fn - Model prediction function
- opts - Options (same as comprehensive_validation/4)
Returns
Map with:
- :faithfulness_score - Faithfulness correlation
- :infidelity_score - Infidelity error
- :passes_quality_gate - Boolean (true if both metrics pass thresholds)
Quality Gate Thresholds
- Faithfulness: >= 0.7
- Infidelity: <= 0.1
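The gate described above reduces to two comparisons; a minimal sketch (the module name is illustrative, not part of the library):

```elixir
# Hypothetical sketch of the quality gate: an explanation passes when
# faithfulness >= 0.7 and infidelity <= 0.1.
defmodule QualityGateSketch do
  @min_faithfulness 0.7
  @max_infidelity 0.1

  def passes?(faithfulness_score, infidelity_score) do
    faithfulness_score >= @min_faithfulness and infidelity_score <= @max_infidelity
  end
end

QualityGateSketch.passes?(0.85, 0.04)
# => true
```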
Examples
result = CrucibleXAI.Validation.quick_validation(
  explanation,
  instance,
  predict_fn
)
if result.passes_quality_gate do
  # Safe to use explanation
  serve_explanation_to_user(explanation)
else
  # Quality too low
  Logger.warning("Explanation quality below threshold")
  fallback_explanation()
end