# `Text.Language.Classifier.Fasttext.Inference`
[🔗](https://github.com/kipcole9/text/blob/v0.5.0/lib/language/classifier/fasttext/inference.ex#L1)

Forward-pass scoring for fastText models.

Given a flat list of input-matrix row indices (produced by
`Text.Language.Classifier.Fasttext.Features.extract/2`), this module
computes a hidden vector and projects it to a list of `{label, probability}`
pairs, sorted descending by probability.

Two output projections are supported:

* `:softmax` — `softmax(output_matrix · hidden)`. The standard form; a
  sketch follows this list.

* `:hs` (hierarchical softmax) — root-to-leaf DFS over the Huffman tree
  built at load time. Each internal node carries a learned vector in
  `output_matrix[node - osz]`; the score of a leaf is the sum of the
  `log(sigmoid(±dot))` decisions along its path. This is the projection
  `lid.176` uses (a pruned-DFS sketch appears under *Numerical
  conventions* below).
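
A minimal Nx sketch of the `:softmax` projection, using the standard
max-subtraction trick for numerical stability (the helper name is
illustrative, not this module's private API):

```elixir
# Project the hidden vector and normalize to probabilities.
# `output_matrix` is {osz, dim}, `hidden` is {dim}, result is {osz}.
defp softmax_projection(output_matrix, hidden) do
  logits = Nx.dot(output_matrix, hidden)
  shifted = Nx.subtract(logits, Nx.reduce_max(logits))
  exps = Nx.exp(shifted)
  Nx.divide(exps, Nx.sum(exps))
end
```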

Mirrors `Model::predict` and `HierarchicalSoftmaxLoss::dfs` from the
fastText source (`src/model.cc`, `src/loss.cc`).

### Numerical conventions

fastText uses `std_log(x) = log(x + 1e-5)` instead of plain `log(x)` so
that scores stay finite as probabilities approach zero. This module
follows the same convention:
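
```elixir
# fastText's stabilized log: the 1e-5 floor keeps the result finite
# even when the probability is exactly zero.
defp std_log(x), do: :math.log(x + 1.0e-5)
```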

Top-k pruning during DFS matches the C++ heap-based approach: a branch
is skipped once its accumulated score drops below the lowest score
currently in the top-k buffer. The skip is safe because each DFS step
adds the log of a probability, so a path's score can only fall. For a
small model like `lid.176` (176 leaves) the speedup is modest, but
matching the traversal order keeps the returned scores bit-equivalent
to the reference implementation.
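
A sketch of the pruned DFS, assuming the fastText tree layout (label
leaves `0..osz-1`, internal nodes `osz..2*osz-2`, root at `2*osz - 2`)
and a `tree` map of `node => {left, right}`. `min_score` stands in for
the C++ heap's current k-th best score, which a full implementation
would update as leaves are found; all names here are illustrative, not
this module's private API.

```elixir
# std_log as in the convention above.
defp std_log(x), do: :math.log(x + 1.0e-5)

# Walk the Huffman tree, accumulating log-probabilities into `score`.
defp dfs(node, score, min_score, osz, tree, output_matrix, hidden, acc) do
  cond do
    # Prune: this branch can no longer beat the current top-k floor.
    score < min_score ->
      acc

    # Leaf: `node` is a label id; record its accumulated score.
    node < osz ->
      [{node, score} | acc]

    # Internal node: one sigmoid decision, then recurse into both
    # children with the corresponding log term added.
    true ->
      {left, right} = Map.fetch!(tree, node)
      dot = output_matrix[node - osz] |> Nx.dot(hidden) |> Nx.to_number()
      f = 1.0 / (1.0 + :math.exp(-dot))

      acc = dfs(left, score + std_log(1.0 - f), min_score, osz, tree, output_matrix, hidden, acc)
      dfs(right, score + std_log(f), min_score, osz, tree, output_matrix, hidden, acc)
  end
end
```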

# `compute_hidden`

```elixir
@spec compute_hidden([non_neg_integer()], Nx.Tensor.t()) :: Nx.Tensor.t()
```

Returns the hidden activation vector for a list of feature indices.

### Arguments

* `features` is a list of input-matrix row indices, typically from
  `Text.Language.Classifier.Fasttext.Features.extract/2`.

* `input_matrix` is `model.input_matrix`.

### Returns

* A 1-dimensional `Nx.Tensor` of length `args.dim`. Returns a zero
  vector when the feature list is empty.
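
The computation itself follows fastText's `Model::computeHidden`:
average the selected input rows. A minimal Nx sketch of that behavior
(not necessarily this module's exact implementation):

```elixir
# Empty feature list: a zero vector of the model's dimension.
defp hidden([], input_matrix) do
  {_rows, dim} = Nx.shape(input_matrix)
  Nx.broadcast(0.0, {dim})
end

# Otherwise: gather the indexed rows and take their mean.
defp hidden(features, input_matrix) do
  input_matrix
  |> Nx.take(Nx.tensor(features), axis: 0)
  |> Nx.mean(axes: [0])
end
```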

# `predict`

```elixir
@spec predict(binary(), Text.Language.Classifier.Fasttext.Model.t(), keyword()) :: [
  {String.t(), float()}
]
```

Convenience wrapper: tokenize, extract features, and predict in one step.
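
Conceptually equivalent to the following sketch (the actual
implementation may differ; this assumes tokenization is handled inside
`Features.extract/2`, as its raw-binary argument suggests):

```elixir
def predict(text, model, options \\ []) do
  text
  |> Text.Language.Classifier.Fasttext.Features.extract(model)
  |> predict_features(model, options)
end
```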

### Arguments

* `text` is a UTF-8 binary.

* `model` is a loaded `Text.Language.Classifier.Fasttext.Model`.

### Options

Same as `predict_features/3`.

### Returns

* `[{label, probability}, ...]`, descending by probability.

### Examples

    iex> {:ok, model} = Text.Language.Classifier.Fasttext.ModelLoader.load("priv/lid_176/lid.176.bin")
    iex> [{label, _} | _] = Text.Language.Classifier.Fasttext.Inference.predict("Bonjour le monde", model, k: 3)
    iex> label
    "fr"

# `predict_features`

```elixir
@spec predict_features(
  [non_neg_integer()],
  Text.Language.Classifier.Fasttext.Model.t(),
  keyword()
) :: [
  {String.t(), float()}
]
```

Predicts the top-k labels with probabilities for a feature index list.

### Arguments

* `features` is the flat feature index list.

* `model` is a fully-loaded
  `Text.Language.Classifier.Fasttext.Model`.

### Options

* `:k` — number of top predictions to return. Defaults to `1`.

* `:threshold` — probability cutoff. Predictions below this are dropped.
  Defaults to `0.0` (matches the fastText Python wrapper default).
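
For example, a hypothetical call combining both options (the
probabilities returned depend on the model):

```elixir
# Ask for up to 5 labels but drop anything under 50% probability;
# the result may then hold fewer than 5 entries.
Text.Language.Classifier.Fasttext.Inference.predict_features(features, model, k: 5, threshold: 0.5)
```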

### Returns

* A list of `{label, probability}` pairs, sorted descending by
  probability. May be shorter than `k` if `:threshold` excludes
  candidates.

### Examples

    iex> {:ok, model} = Text.Language.Classifier.Fasttext.ModelLoader.load("priv/lid_176/lid.176.bin")
    iex> features = Text.Language.Classifier.Fasttext.Features.extract("hello world", model)
    iex> [{label, _prob} | _] = Text.Language.Classifier.Fasttext.Inference.predict_features(features, model, k: 3)
    iex> label
    "en"

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*
