# `Text.Language.Classifier.Fasttext.ModelLoader`
[🔗](https://github.com/kipcole9/text/blob/v0.5.0/lib/language/classifier/fasttext/model_loader.ex#L1)

Parses a fastText `.bin` model file into a
`Text.Language.Classifier.Fasttext.Model` struct.

The full byte layout is documented in `docs/lid176_binary_format.md`. In
brief, the file is a sequence of:

* 8-byte magic + version header.

* 56-byte fixed `Args` block.

* Variable-length `Dictionary` (28-byte header + `size` entries + optional
  prune index).

* 1-byte `quant_input` flag (must be `0`; quantized models are out of
  scope for v1).

* Input matrix (`{nwords + bucket, dim}` float32, row-major).

* 1-byte `qout` flag (must be `0`; quantized output matrices are likewise
  unsupported).

* Output matrix (`{nlabels, dim}` float32, row-major).

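As a rough illustration of the first two items, the header and `Args` block
can be split off with binary pattern matching. This is a sketch, not the
library's implementation: `HeaderSketch` and its field names are hypothetical
(the names follow the order in which upstream fastText writes its arguments,
which should be checked against `docs/lid176_binary_format.md`), and accepting
only version `12` is an assumption taken from the `decode_model` doctest
fixture.

```elixir
defmodule HeaderSketch do
  # Values taken from the doctest fixture; the real loader may accept more versions.
  @magic 793_712_314
  @version 12

  # 8-byte header: magic and version, both little-endian 32-bit integers.
  def parse(<<magic::little-32, version::little-32, rest::binary>>) do
    cond do
      magic != @magic -> {:error, {:bad_magic, magic}}
      version != @version -> {:error, {:unsupported_version, version}}
      true -> parse_args(rest)
    end
  end

  def parse(_too_short), do: {:error, :truncated_header}

  # 56-byte Args block: twelve 32-bit integers followed by one 64-bit float.
  # Field names are assumptions, not confirmed by this module's docs.
  defp parse_args(<<
         dim::little-32, ws::little-32, epoch::little-32, min_count::little-32,
         neg::little-32, word_ngrams::little-32, loss::little-32, model::little-32,
         bucket::little-32, minn::little-32, maxn::little-32, lr_update_rate::little-32,
         t::little-float-64, rest::binary
       >>) do
    args = %{
      dim: dim, ws: ws, epoch: epoch, min_count: min_count, neg: neg,
      word_ngrams: word_ngrams, loss: loss, model: model, bucket: bucket,
      minn: minn, maxn: maxn, lr_update_rate: lr_update_rate, t: t
    }

    # `rest` starts at the Dictionary header.
    {:ok, args, rest}
  end

  defp parse_args(_too_short), do: {:error, :truncated_args}
end
```
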
The whole file is read into memory as a single Elixir binary and then
parsed. For `lid.176.bin` (~126 MB) this means a transient peak of roughly
twice the file size during loading: the original binary and the matrix
tensors coexist until the binary becomes eligible for garbage collection.
Models are normally loaded once at boot, so the peak is acceptable.

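To make the size estimate concrete, a dense matrix block can be sketched as a
16-byte header followed by `rows * cols` float32 values. The layout is
inferred from the `decode_model` doctest fixture (two little-endian 64-bit
integers, then the data); `MatrixSketch` is a hypothetical helper, and the
order of its checks is an assumption rather than the loader's documented
behaviour.

```elixir
defmodule MatrixSketch do
  # Reads one matrix block and validates it against the expected shape.
  # `error_tag` is :input_matrix_shape_mismatch or :output_matrix_shape_mismatch.
  def parse(<<rows::little-64, cols::little-64, rest::binary>>, expected, error_tag) do
    # Each float32 cell occupies 4 bytes.
    data_bytes = rows * cols * 4

    cond do
      {rows, cols} != expected ->
        {:error, {error_tag, %{expected: expected, actual: {rows, cols}}}}

      byte_size(rest) < data_bytes ->
        {:error, :truncated_matrix_data}

      true ->
        # `data` is a sub-binary of the input; a tensor built from it is the
        # second copy that contributes to the transient memory peak.
        <<data::binary-size(data_bytes), tail::binary>> = rest
        {:ok, %{shape: {rows, cols}, data: data}, tail}
    end
  end

  def parse(_too_short, _expected, _error_tag), do: {:error, :truncated_matrix_header}
end
```

For the doctest fixture, the input matrix is `{0 + 4, 8}`, so its data spans
`4 * 8 * 4 = 128` bytes after the 16-byte header.
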
# `load_error`

```elixir
@type load_error() ::
  {:bad_magic, integer()}
  | {:unsupported_version, integer()}
  | {:quantized_input_unsupported, true}
  | {:quantized_output_unsupported, true}
  | {:input_matrix_shape_mismatch, %{expected: tuple(), actual: tuple()}}
  | {:output_matrix_shape_mismatch, %{expected: tuple(), actual: tuple()}}
  | :truncated_header
  | :truncated_args
  | :truncated_dictionary_header
  | :truncated_entry
  | :truncated_pruneidx
  | :truncated_quant_flag
  | :truncated_matrix_header
  | :truncated_matrix_data
  | :unterminated_word
  | {:unknown_loss, integer()}
  | {:unknown_model, integer()}
  | File.posix()
```
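Callers that log or surface these reasons may want a single place that turns
them into text. The following is a minimal sketch: `LoadErrorSketch.describe/1`
is a hypothetical helper, not a function provided by the library, and the
message wording is illustrative only.

```elixir
defmodule LoadErrorSketch do
  # Hypothetical helper: renders a `load_error()` reason as a short message.
  def describe({:bad_magic, magic}), do: "not a fastText model (magic #{magic})"
  def describe({:unsupported_version, version}), do: "unsupported format version #{version}"

  def describe({tag, true})
      when tag in [:quantized_input_unsupported, :quantized_output_unsupported],
      do: "quantized models are not supported (#{tag})"

  def describe({tag, %{expected: expected, actual: actual}})
      when tag in [:input_matrix_shape_mismatch, :output_matrix_shape_mismatch],
      do: "#{tag}: expected #{inspect(expected)}, got #{inspect(actual)}"

  def describe({:unknown_loss, id}), do: "unknown loss id #{id}"
  def describe({:unknown_model, id}), do: "unknown model id #{id}"
  def describe(:unterminated_word), do: "dictionary word is missing its terminator"

  def describe(reason) when is_atom(reason) do
    case Atom.to_string(reason) do
      "truncated_" <> part ->
        "file ends early while reading #{part}"

      # Any other atom is assumed to be a `File.posix()` reason such as :enoent.
      _posix ->
        reason |> :file.format_error() |> List.to_string()
    end
  end
end
```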

# `decode_model`

```elixir
@spec decode_model(binary(), Nx.Type.t()) ::
  {:ok, Text.Language.Classifier.Fasttext.Model.t()} | {:error, load_error()}
```

Parses a fastText model from an in-memory binary.

Useful for testing against synthetic fixtures and for users who already
have the file contents in memory.

### Arguments

* `binary` is the complete byte sequence of a fastText `.bin` model.

* `tensor_type` is the `Nx` tensor type used for the input and output
  matrices. Defaults to `{:f, 32}`.

### Returns

* `{:ok, model}` on success.

* `{:error, reason}` on parse or validation failure. See `t:load_error/0`
  for the set of possible reasons.

### Examples

    iex> args = <<
    ...>   8::little-32, 0::little-32, 0::little-32, 0::little-32,
    ...>   0::little-32, 1::little-32, 3::little-32, 3::little-32,
    ...>   4::little-32, 2::little-32, 4::little-32, 0::little-32,
    ...>   1.0e-4::little-float-64
    ...> >>
    iex> dict_header = <<
    ...>   1::little-32, 0::little-32, 1::little-32,
    ...>   0::little-64, 0::little-64
    ...> >>
    iex> entry = "__label__en" <> <<0, 7::little-64, 1::little-8>>
    iex> input_dim = 8
    iex> input_rows = 4
    iex> input_zeros = :binary.copy(<<0::little-float-32>>, input_rows * input_dim)
    iex> input_matrix = <<input_rows::little-64, input_dim::little-64>> <> input_zeros
    iex> output_zeros = :binary.copy(<<0::little-float-32>>, 1 * input_dim)
    iex> output_matrix = <<1::little-64, input_dim::little-64>> <> output_zeros
    iex> binary =
    ...>   <<793_712_314::little-32, 12::little-32>> <>
    ...>     args <> dict_header <> entry <>
    ...>     <<0>> <> input_matrix <>
    ...>     <<0>> <> output_matrix
    iex> {:ok, model} = Text.Language.Classifier.Fasttext.ModelLoader.decode_model(binary)
    iex> model.labels
    ["en"]
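The fixture above builds a dictionary entry by hand; reading one back is the
reverse operation. The sketch below assumes the entry layout used in that
fixture (a NUL-terminated word, a 64-bit count, and a 1-byte type) and that
labels are exposed with the `__label__` prefix removed, as the `["en"]` result
suggests. `EntrySketch` is a hypothetical module, not the library's parser.

```elixir
defmodule EntrySketch do
  # Reads one dictionary entry and returns it with the remaining bytes.
  def parse(binary) when is_binary(binary) do
    # :binary.split/2 splits at the first NUL byte only.
    case :binary.split(binary, <<0>>) do
      [word, <<count::little-64, type::little-8, rest::binary>>] ->
        {:ok, %{word: word, count: count, type: type}, rest}

      [_word, _too_short] ->
        {:error, :truncated_entry}

      [_no_terminator] ->
        {:error, :unterminated_word}
    end
  end

  # Strips the fastText label prefix; other words pass through unchanged.
  def label_name("__label__" <> name), do: name
  def label_name(word), do: word
end
```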

# `load`

```elixir
@spec load(
  Path.t(),
  keyword()
) :: {:ok, Text.Language.Classifier.Fasttext.Model.t()} | {:error, load_error()}
```

Loads and parses a fastText model file.

### Arguments

* `path` is the absolute or relative path to a fastText `.bin` model file.

### Options

* `:tensor_type` is the `Nx` tensor type used for the input and output
  matrices. The default is `{:f, 32}`, matching fastText's on-disk layout.
  Override only if downstream code requires a different precision.

### Returns

* `{:ok, model}` where `model` is a
  `Text.Language.Classifier.Fasttext.Model` struct.

* `{:error, reason}` where `reason` describes the parse or validation
  failure. See `t:load_error/0` for the set of possible reasons.

### Examples

    # Loading the official lid.176 model (after running mix text.download_lid176):
    # iex> path = Path.expand("priv/lid_176/lid.176.bin")
    # iex> {:ok, model} = Text.Language.Classifier.Fasttext.ModelLoader.load(path)
    # iex> model.dictionary.nlabels
    # 176
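Since the file is read fully into memory and then parsed, `load/2` can be
thought of as a file read followed by a decode step, which is also why
`File.posix()` reasons appear in `t:load_error/0`. The sketch below
illustrates that flow under stated assumptions: `LoadSketch` is hypothetical,
and the `decode` argument stands in for `decode_model/2` so the example runs
without the library.

```elixir
defmodule LoadSketch do
  # `decode` is a 2-arity function with the shape of `decode_model/2`:
  # (binary, tensor_type) -> {:ok, model} | {:error, reason}.
  def load(path, decode, opts \\ []) when is_function(decode, 2) do
    tensor_type = Keyword.get(opts, :tensor_type, {:f, 32})

    # A failed read returns {:error, posix} and skips the decode step.
    with {:ok, binary} <- File.read(path) do
      decode.(binary, tensor_type)
    end
  end
end
```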

---

*Consult [api-reference.md](api-reference.md) for complete listing*