A fully-loaded fastText model.
Holds the parsed args and dictionary plus the input and output matrices as
`Nx` tensors. The input matrix is the largest piece of memory in the
struct: for `lid.176.bin` it is a `{nwords + bucket, dim}` tensor of
float32 values, approximately 128 MB.
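As a rough cross-check of that figure, assuming the common fastText defaults for this model (dim = 16 and a bucket count of 2,000,000; `nwords` adds comparatively few extra rows), the arithmetic works out as follows. A back-of-the-envelope sketch, not exact accounting:

```elixir
# Approximate size of the input matrix for lid.176.bin.
# dim = 16 and bucket = 2_000_000 are assumed defaults, not values
# read from the file; nwords is omitted as small by comparison.
rows = 2_000_000
dim = 16
bytes = rows * dim * 4  # float32 is 4 bytes per element
IO.puts("~#{div(bytes, 1_000_000)} MB")
# => ~128 MB
```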
Models are produced by `Text.Language.Classifier.Fasttext.ModelLoader.load/2`
from a fastText `.bin` file.
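A minimal loading sketch; the path and the second argument are illustrative, and the `{:ok, model}` return shape is an assumption rather than a documented contract:

```elixir
# Hypothetical usage of ModelLoader.load/2; adjust the path and
# options to your setup. The {:ok, _} return shape is assumed.
{:ok, model} =
  Text.Language.Classifier.Fasttext.ModelLoader.load("priv/lid.176.bin", [])

length(model.labels)
# => 176 for lid.176
```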
Fields
* `args` is the parsed `Text.Language.Classifier.Fasttext.Args` struct.
* `dictionary` is the parsed `Text.Language.Classifier.Fasttext.Dictionary` struct.
* `input_matrix` is an `Nx` tensor of shape `{dictionary.nwords + args.bucket, args.dim}` holding the row-major input embeddings (word rows first, then subword n-gram rows).
* `output_matrix` is an `Nx` tensor of shape `{dictionary.nlabels, args.dim}` holding the per-label output vectors.
* `labels` is the list of language label strings (with the `__label__` prefix stripped) in row order matching `output_matrix`. For `lid.176` this is a 176-element list such as `["en", "zh-Hans", ...]`.
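Given a loaded model, the documented shapes can be sanity-checked directly with `Nx.shape/1`. A sketch, assuming `model` came from the loader call above:

```elixir
# Cross-check the tensor shapes against the documented layout.
{in_rows, dim} = Nx.shape(model.input_matrix)
{nlabels, ^dim} = Nx.shape(model.output_matrix)

^in_rows = model.dictionary.nwords + model.args.bucket
^nlabels = length(model.labels)
```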
Types
@type loss_state() :: Text.Language.Classifier.Fasttext.HuffmanTree.t() | nil
Loss-specific decoding state, built once at load time and reused for every prediction.

* For `:hs` (hierarchical softmax) the state is a `Text.Language.Classifier.Fasttext.HuffmanTree` constructed from the label counts (see the sketch after this list).
* For `:softmax` no extra state is needed; the field is `nil`.
* `:ns` and `:ova` are not yet supported by the inference path.
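How that state might be assembled at load time, sketched under the assumption of a `HuffmanTree.build/1` constructor taking label counts; the actual builder in `ModelLoader` may differ:

```elixir
# Hypothetical load-time dispatch on the loss type. HuffmanTree.build/1
# is an assumed constructor, and `args` and `label_counts` are assumed
# to be in scope from parsing the .bin file.
loss_state =
  case args.loss do
    :hs -> Text.Language.Classifier.Fasttext.HuffmanTree.build(label_counts)
    :softmax -> nil
    other -> raise ArgumentError, "unsupported loss for inference: #{inspect(other)}"
  end
```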
@type t() :: %Text.Language.Classifier.Fasttext.Model{
        args: Text.Language.Classifier.Fasttext.Args.t(),
        dictionary: Text.Language.Classifier.Fasttext.Dictionary.t(),
        input_matrix: Nx.Tensor.t(),
        labels: [String.t()],
        loss_state: loss_state(),
        output_matrix: Nx.Tensor.t()
      }
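To show how the pieces fit together, here is a hedged sketch of `:softmax` scoring with these matrices, not the library's actual predict function. `row_ids` is a hypothetical `Nx` tensor of word/n-gram row indices produced by tokenization:

```elixir
# Hedged :softmax scoring sketch. `model` is a loaded %Model{};
# `row_ids` (hypothetical) indexes the input rows for the text's tokens.
hidden = model.input_matrix |> Nx.take(row_ids) |> Nx.mean(axes: [0])
scores = Nx.dot(model.output_matrix, hidden)

exp = Nx.exp(Nx.subtract(scores, Nx.reduce_max(scores)))  # numerically stable softmax
probs = Nx.divide(exp, Nx.sum(exp))

best = probs |> Nx.argmax() |> Nx.to_number()
Enum.at(model.labels, best)
# => e.g. "en"
```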