Text.Language.Classifier.Fasttext.Args (Text v0.5.0)


Training and model hyperparameters extracted from a fastText model file.

Mirrors the C++ fasttext::Args struct as written by Args::save. See docs/lid176_binary_format.md (Section 2) for the exact byte layout.

Most fields are training-time hyperparameters that do not affect inference but are preserved for completeness. The fields that matter at inference time for a lid.176-style supervised model are dim, bucket, minn, and maxn.

Summary

Functions

decode(arg1)

Decodes the 56-byte Args block that follows the magic + version header.

Types

loss()

@type loss() :: :hs | :ns | :softmax | :ova

model()

@type model() :: :cbow | :sg | :sup

t()

@type t() :: %Text.Language.Classifier.Fasttext.Args{
  bucket: non_neg_integer(),
  dim: non_neg_integer(),
  epoch: non_neg_integer(),
  loss: loss(),
  lr_update_rate: non_neg_integer(),
  maxn: non_neg_integer(),
  min_count: non_neg_integer(),
  minn: non_neg_integer(),
  model: model(),
  neg: non_neg_integer(),
  t: float(),
  word_ngrams: non_neg_integer(),
  ws: non_neg_integer()
}

Functions

decode(arg1)

@spec decode(binary()) :: {:ok, t(), binary()} | {:error, term()}

Decodes the 56-byte Args block that follows the magic + version header.

Arguments

  • binary is the raw byte sequence positioned at the start of the Args block. It must contain at least 56 bytes.

Returns

  • {:ok, args, rest} where args is a t/0 struct and rest is the binary remainder positioned at the start of the dictionary block.

  • {:error, reason} if the binary is truncated or contains an unknown loss/model enum value.

Examples

iex> args_bytes = <<
...>   16::little-32, 5::little-32, 1::little-32, 1000::little-32,
...>   5::little-32, 1::little-32, 3::little-32, 3::little-32,
...>   2_000_000::little-32, 2::little-32, 4::little-32, 100::little-32,
...>   1.0e-4::little-float-64
...> >>
iex> {:ok, args, rest} = Text.Language.Classifier.Fasttext.Args.decode(args_bytes)
iex> {args.dim, args.bucket, args.loss, args.model, rest}
{16, 2_000_000, :softmax, :sup, ""}
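
The 56 bytes in the example above are twelve little-endian 32-bit integers (48 bytes) followed by one little-endian 64-bit float (8 bytes). As a rough illustration of how such a block could be matched in Elixir, here is a minimal sketch. The module name ArgsLayoutSketch, the plain map it returns, and the error atoms :truncated and :unknown_enum are hypothetical and are not the library's actual implementation or error terms. The field order and the enum numbering are assumed from fastText's C++ Args::save and its loss_name and model_name enums; docs/lid176_binary_format.md remains the authoritative description.

```elixir
defmodule ArgsLayoutSketch do
  # Hypothetical module for illustration only; not part of the Text library.

  # Enum numbering assumed from fastText's C++ loss_name and model_name enums.
  @losses %{1 => :hs, 2 => :ns, 3 => :softmax, 4 => :ova}
  @models %{1 => :cbow, 2 => :sg, 3 => :sup}

  # Twelve 32-bit integers plus one 64-bit float; anything after that is
  # returned untouched as `rest`.
  def decode(<<
        dim::little-32,
        ws::little-32,
        epoch::little-32,
        min_count::little-32,
        neg::little-32,
        word_ngrams::little-32,
        loss::little-32,
        model::little-32,
        bucket::little-32,
        minn::little-32,
        maxn::little-32,
        lr_update_rate::little-32,
        t::little-float-64,
        rest::binary
      >>) do
    # Map.fetch/2 returns :error for an integer outside the known enum range.
    with {:ok, loss} <- Map.fetch(@losses, loss),
         {:ok, model} <- Map.fetch(@models, model) do
      args = %{
        dim: dim,
        ws: ws,
        epoch: epoch,
        min_count: min_count,
        neg: neg,
        word_ngrams: word_ngrams,
        loss: loss,
        model: model,
        bucket: bucket,
        minn: minn,
        maxn: maxn,
        lr_update_rate: lr_update_rate,
        t: t
      }

      {:ok, args, rest}
    else
      # Hypothetical error term; the real decode/1 may report this differently.
      :error -> {:error, :unknown_enum}
    end
  end

  # Fewer than 56 bytes: the binary pattern above cannot match.
  def decode(_other), do: {:error, :truncated}
end
```

Matching the whole block in a single function head keeps the truncation check implicit: any input shorter than 56 bytes falls through to the second clause.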