Nasty.Statistics.Evaluator (Nasty v0.3.0)
Model evaluation and performance metrics.
Provides standard NLP evaluation metrics for various tasks:
- Classification: Accuracy, precision, recall, F1
- Sequence tagging: Token-level and entity-level metrics
- Parsing: PARSEVAL metrics
Examples
# POS tagging evaluation
gold = [:noun, :verb, :det, :noun]
pred = [:noun, :verb, :adj, :noun]
metrics = Evaluator.classification_metrics(gold, pred)
# => %{accuracy: 0.75, ...}
# Confusion matrix
matrix = Evaluator.confusion_matrix(gold, pred)
Summary
Functions
Calculate accuracy: correct predictions / total predictions.
Calculate classification metrics (accuracy, precision, recall, F1).
Build a confusion matrix.
Entity-level evaluation for NER.
Calculate per-class precision, recall, and F1.
Print a formatted confusion matrix.
Print a formatted classification report.
Functions
Calculate accuracy: correct predictions / total predictions.
Examples
iex> gold = [:a, :b, :c, :a]
iex> pred = [:a, :b, :b, :a]
iex> Evaluator.accuracy(gold, pred)
0.75
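Token-level accuracy is simply the fraction of positions where the two lists agree. A minimal standalone sketch of that computation (an approximation for illustration, not Nasty's actual implementation):

```elixir
# Sketch: accuracy = matching positions / total positions.
# Standalone approximation; Nasty's implementation may differ.
defmodule AccuracySketch do
  def accuracy(gold, pred) when length(gold) == length(pred) do
    correct =
      gold
      |> Enum.zip(pred)
      |> Enum.count(fn {g, p} -> g == p end)

    correct / length(gold)
  end
end

AccuracySketch.accuracy([:a, :b, :c, :a], [:a, :b, :b, :a])
# => 0.75
```

The guard rejects length-mismatched inputs early, which is usually an upstream alignment bug rather than something to score around.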
Calculate classification metrics (accuracy, precision, recall, F1).
Parameters
- gold - List of gold-standard labels
- predicted - List of predicted labels
- opts - Options:
  - :average - Averaging method: :micro, :macro, :weighted (default: :macro)
  - :labels - Specific labels to include (default: all)
Returns
- Map with metrics:
  - :accuracy - Overall accuracy
  - :precision - Precision score
  - :recall - Recall score
  - :f1 - F1 score
  - :support - Number of true instances per class
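With the default :macro averaging, each label's precision, recall, and F1 are computed independently and then averaged without weighting by support. A hypothetical standalone sketch of that behavior (not Nasty's source; in particular, the zero-division handling here is an assumption):

```elixir
# Sketch of macro-averaged classification metrics.
# Hypothetical standalone code; Nasty's implementation may differ.
defmodule MetricsSketch do
  def classification_metrics(gold, pred) do
    pairs = Enum.zip(gold, pred)
    labels = Enum.uniq(gold ++ pred)

    per_class =
      for label <- labels do
        tp = Enum.count(pairs, fn {g, p} -> g == label and p == label end)
        fp = Enum.count(pairs, fn {g, p} -> g != label and p == label end)
        fn_ = Enum.count(pairs, fn {g, p} -> g == label and p != label end)

        precision = safe_div(tp, tp + fp)
        recall = safe_div(tp, tp + fn_)
        f1 = safe_div(2 * precision * recall, precision + recall)
        %{precision: precision, recall: recall, f1: f1}
      end

    n = length(labels)

    %{
      accuracy: Enum.count(pairs, fn {g, p} -> g == p end) / length(pairs),
      precision: Enum.sum(Enum.map(per_class, & &1.precision)) / n,
      recall: Enum.sum(Enum.map(per_class, & &1.recall)) / n,
      f1: Enum.sum(Enum.map(per_class, & &1.f1)) / n
    }
  end

  # Return 0.0 instead of raising when a class has no predictions.
  defp safe_div(_num, denom) when denom == 0, do: 0.0
  defp safe_div(num, denom), do: num / denom
end

MetricsSketch.classification_metrics(
  [:noun, :verb, :det, :noun],
  [:noun, :verb, :adj, :noun]
)
# => %{accuracy: 0.75, precision: 0.5, recall: 0.5, f1: 0.5}
```

Note how macro averaging penalizes the two mistaken classes (:det and :adj) as heavily as the frequent ones, which is why it differs from :micro on imbalanced label sets.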
Build a confusion matrix.
Parameters
- gold - Gold-standard labels
- predicted - Predicted labels
- labels - Optional list of labels to include (default: all unique labels)
Returns
- Map of maps:
%{true_label => %{pred_label => count}}
Examples
iex> gold = [:a, :b, :b, :a]
iex> pred = [:a, :a, :b, :a]
iex> Evaluator.confusion_matrix(gold, pred)
%{a: %{a: 2, b: 0}, b: %{a: 1, b: 1}}
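The nested-map shape above can be built by seeding a zeroed grid over all labels and incrementing one cell per gold/predicted pair. A standalone sketch that mirrors the documented return shape (not Nasty's source):

```elixir
# Sketch: rows are gold labels, columns are predicted labels.
# Hypothetical standalone code mirroring the documented shape.
defmodule ConfusionSketch do
  def confusion_matrix(gold, pred) do
    labels = Enum.uniq(gold ++ pred)
    empty_row = Map.new(labels, fn l -> {l, 0} end)
    base = Map.new(labels, fn l -> {l, empty_row} end)

    gold
    |> Enum.zip(pred)
    |> Enum.reduce(base, fn {g, p}, acc ->
      update_in(acc, [g, p], &(&1 + 1))
    end)
  end
end

ConfusionSketch.confusion_matrix([:a, :b, :b, :a], [:a, :a, :b, :a])
# => %{a: %{a: 2, b: 0}, b: %{a: 1, b: 1}}
```

Seeding every cell with 0 up front keeps absent confusions explicit (e.g. b: 0 above), which makes downstream table printing simpler.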
Entity-level evaluation for NER.
Compares predicted and gold entity spans using strict matching.
Parameters
- gold_entities - List of gold entities: [{type, start, end}, ...]
- pred_entities - List of predicted entities: [{type, start, end}, ...]
Returns
- Map with :precision, :recall, and :f1
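Strict matching means a predicted span counts as a true positive only when its type, start, and end all equal some gold span. A minimal sketch under that assumption, using the {type, start, end} tuple shape documented above (standalone approximation, not Nasty's implementation):

```elixir
# Sketch of strict-match entity scoring: exact {type, start, end}
# agreement required. Hypothetical code, not Nasty's source.
defmodule EntitySketch do
  def entity_metrics(gold_entities, pred_entities) do
    gold_set = MapSet.new(gold_entities)
    tp = Enum.count(pred_entities, &MapSet.member?(gold_set, &1))

    precision = safe_div(tp, length(pred_entities))
    recall = safe_div(tp, length(gold_entities))
    f1 = safe_div(2 * precision * recall, precision + recall)

    %{precision: precision, recall: recall, f1: f1}
  end

  defp safe_div(_num, denom) when denom == 0, do: 0.0
  defp safe_div(num, denom), do: num / denom
end

gold = [{:per, 0, 2}, {:loc, 5, 6}]
pred = [{:per, 0, 2}, {:loc, 4, 6}]
EntitySketch.entity_metrics(gold, pred)
# => %{precision: 0.5, recall: 0.5, f1: 0.5}
```

The second prediction has the right type but a boundary off by one, so under strict matching it scores as both a false positive and a false negative.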
Calculate per-class precision, recall, and F1.
Parameters
- gold - Gold-standard labels
- predicted - Predicted labels
- label - The label/class to evaluate
Returns
- Map with :precision, :recall, :f1, and :support
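For a single class, support is the number of gold instances of that label, and precision/recall follow from the usual true/false positive counts. A hypothetical sketch (names mirror the docs above, not Nasty's source):

```elixir
# Sketch of single-class precision/recall/F1 with support.
# Hypothetical standalone code; Nasty's implementation may differ.
defmodule PerClassSketch do
  def per_class_metrics(gold, pred, label) do
    pairs = Enum.zip(gold, pred)
    tp = Enum.count(pairs, fn {g, p} -> g == label and p == label end)
    fp = Enum.count(pairs, fn {g, p} -> g != label and p == label end)
    fn_ = Enum.count(pairs, fn {g, p} -> g == label and p != label end)
    support = tp + fn_

    precision = if tp + fp == 0, do: 0.0, else: tp / (tp + fp)
    recall = if support == 0, do: 0.0, else: tp / support

    f1 =
      if precision + recall == 0.0,
        do: 0.0,
        else: 2 * precision * recall / (precision + recall)

    %{precision: precision, recall: recall, f1: f1, support: support}
  end
end

PerClassSketch.per_class_metrics([:a, :b, :b, :a], [:a, :a, :b, :a], :b)
# precision 1.0 (the one :b prediction is right),
# recall 0.5 (one of two gold :b found), support 2
```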
@spec print_confusion_matrix(map()) :: :ok
Print a formatted confusion matrix.
Examples
iex> matrix = Evaluator.confusion_matrix(gold, pred)
iex> Evaluator.print_confusion_matrix(matrix)
# Prints a nicely formatted table
@spec print_report(map()) :: :ok
Print a formatted classification report.
Examples
iex> metrics = Evaluator.classification_metrics(gold, pred)
iex> Evaluator.print_report(metrics)
# Prints precision, recall, F1 for each class