Forward-pass scoring for fastText models.
Given a flat list of input-matrix row indices (produced by
`Text.Language.Classifier.Fasttext.Features.extract/2`), this module
computes a hidden vector and projects it to a list of `{label, probability}`
pairs, sorted descending by probability.
Two output projections are supported:
- `:softmax`: `softmax(output_matrix · hidden)`, the standard form.
- `:hs` (hierarchical softmax): root-to-leaf DFS over the Huffman tree built
  at load time. Each internal node carries a learned vector in
  `output_matrix[node - osz]`; the score of a leaf is the sum of
  `log(sigmoid(±dot))` decisions along its path. This is the projection
  `lid.176` uses.
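The `:softmax` branch is an ordinary matrix-vector product followed by a normalized exponential. A minimal sketch, assuming `output_matrix` is a list of row vectors (plain lists of floats) and `hidden` is a list of floats; the module's real storage format may differ, and `SoftmaxSketch` is a hypothetical name:

```elixir
defmodule SoftmaxSketch do
  def dot(a, b), do: Enum.zip_with(a, b, &(&1 * &2)) |> Enum.sum()

  # softmax(output_matrix · hidden): one logit per label, then normalize.
  def project(output_matrix, hidden) do
    logits = Enum.map(output_matrix, &dot(&1, hidden))
    max_logit = Enum.max(logits)
    # Subtracting the max before exponentiating avoids overflow.
    exps = Enum.map(logits, &:math.exp(&1 - max_logit))
    z = Enum.sum(exps)
    Enum.map(exps, &(&1 / z))
  end
end
```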
Mirrors Model::predict and HierarchicalSoftmaxLoss::dfs from the
fastText source (src/model.cc, src/loss.cc).
Numerical conventions
fastText uses `std_log(x) = log(x + 1e-5)` instead of plain `log(x)` for
numerical stability when probabilities approach zero. This module follows
the same convention.
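As a one-line sketch (the module name `StdLog` is hypothetical; `std_log/1` mirrors fastText's helper of the same name):

```elixir
defmodule StdLog do
  # fastText's stabilized log: shifts the argument by 1.0e-5 so that a
  # probability of exactly 0.0 yields a large-but-finite negative value.
  def std_log(x), do: :math.log(x + 1.0e-5)
end
```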
Top-k pruning during DFS matches the C++ heap-based approach: a branch
is skipped once its accumulated score drops below the lowest score
currently in the top-k buffer. For a small model like lid.176 (176
leaves) the speedup is modest, but it preserves bit-equivalence with
the reference implementation's traversal order.
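The pruned traversal can be sketched as follows, assuming a Huffman tree of `{:leaf, label}` / `{:node, vector, left, right}` tuples (a hypothetical shape; the module's real representation may differ) and the `std_log` convention above. A branch is abandoned once its accumulated log-score falls below the current k-th best:

```elixir
defmodule HsDfsSketch do
  def sigmoid(x), do: 1.0 / (1.0 + :math.exp(-x))
  def std_log(x), do: :math.log(x + 1.0e-5)
  def dot(a, b), do: Enum.zip_with(a, b, &(&1 * &2)) |> Enum.sum()

  # Collect up to the top-k leaves, then convert log-scores to probabilities.
  def predict(tree, hidden, k) do
    dfs(tree, hidden, 0.0, k, [])
    |> Enum.sort_by(fn {_label, score} -> -score end)
    |> Enum.take(k)
    |> Enum.map(fn {label, score} -> {label, :math.exp(score)} end)
  end

  defp dfs(node, hidden, score, k, acc) do
    cond do
      # Prune: this branch can no longer beat the current k-th best score.
      length(acc) >= k and score < min_score(acc) ->
        acc

      match?({:leaf, _}, node) ->
        {:leaf, label} = node
        [{label, score} | acc]

      true ->
        {:node, vec, left, right} = node
        p = sigmoid(dot(vec, hidden))
        acc = dfs(left, hidden, score + std_log(1.0 - p), k, acc)
        dfs(right, hidden, score + std_log(p), k, acc)
    end
  end

  defp min_score(acc), do: acc |> Enum.map(&elem(&1, 1)) |> Enum.min()
end
```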
Summary
Functions
Returns the hidden activation vector for a list of feature indices.
Convenience wrapper: tokenize, extract features, and predict in one step.
Predicts the top-k labels with probabilities for a feature index list.
Functions
@spec predict(binary(), Text.Language.Classifier.Fasttext.Model.t(), keyword()) :: [{String.t(), float()}]
Convenience wrapper: tokenize, extract features, and predict in one step.
Arguments
- `text` is a UTF-8 binary.
- `model` is a loaded `Text.Language.Classifier.Fasttext.Model`.
Options
Same as predict_features/3.
Returns
`[{label, probability}, ...]`, descending by probability.
Examples
iex> {:ok, model} = Text.Language.Classifier.Fasttext.ModelLoader.load("priv/lid_176/lid.176.bin")
iex> [{label, _} | _] = Text.Language.Classifier.Fasttext.Inference.predict("Bonjour le monde", model, k: 3)
iex> label
"fr"
@spec predict_features([non_neg_integer()], Text.Language.Classifier.Fasttext.Model.t(), keyword()) :: [{String.t(), float()}]
Predicts the top-k labels with probabilities for a feature index list.
Arguments
- `features` is the flat feature index list.
- `model` is a fully-loaded `Text.Language.Classifier.Fasttext.Model`.
Options
- `:k`: number of top predictions to return. Defaults to `1`.
- `:threshold`: probability cutoff; predictions below this are dropped.
  Defaults to `0.0` (matches the fastText Python wrapper default).
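How the two options compose, sketched over an already-scored candidate list (illustrative values, not from a real model):

```elixir
preds = [{"en", 0.82}, {"fr", 0.11}, {"de", 0.04}]

top =
  preds
  |> Enum.filter(fn {_label, p} -> p >= 0.1 end)  # :threshold
  |> Enum.take(2)                                 # :k
# top == [{"en", 0.82}, {"fr", 0.11}]
```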
Returns
- A list of `{label, probability}` pairs, sorted descending by probability.
  May be shorter than `k` if `:threshold` excludes candidates.
Examples
iex> {:ok, model} = Text.Language.Classifier.Fasttext.ModelLoader.load("priv/lid_176/lid.176.bin")
iex> features = Text.Language.Classifier.Fasttext.Features.extract("hello world", model)
iex> [{label, _prob} | _] = Text.Language.Classifier.Fasttext.Inference.predict_features(features, model, k: 3)
iex> label
"en"