IREE.Tokenizers.Model.Unigram (iree_tokenizers v0.7.0)


Unigram model specification compatible with IREE.Tokenizers.Tokenizer.init/1.

This model shape is also used internally when SentencePiece Unigram tokenizers are translated into the IREE-backed runtime format.

Summary

Types

options()

Options for Unigram model construction.

Functions

empty()

Returns an empty Unigram model specification.

init(vocab, options \\ [])

Builds a Unigram model specification from an in-memory scored vocabulary.

Types

options()

@type options() :: [byte_fallback: boolean(), unk_id: integer()]

Options for Unigram model construction.

Functions

empty()

@spec empty() :: {:ok, IREE.Tokenizers.Model.t()}

Returns an empty Unigram model specification.

init(vocab, options \\ [])

@spec init([{String.t(), number()}], options()) :: {:ok, IREE.Tokenizers.Model.t()}

Builds a Unigram model specification from an in-memory scored vocabulary.
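Example

A minimal sketch of calling `empty/0` and `init/2`, based only on the specs above. The vocabulary pieces and scores are made-up illustration values. Treating the scores as log-probabilities and `unk_id` as a zero-based index into the vocabulary list follows the usual Unigram convention and is an assumption here, not something this page confirms.

```elixir
alias IREE.Tokenizers.Model.Unigram

# Hypothetical scored vocabulary: a list of {piece, score} tuples,
# matching the [{String.t(), number()}] shape in the init/2 spec.
vocab = [
  {"<unk>", 0.0},
  {"▁hello", -1.5},
  {"▁world", -2.0}
]

# Both options come from the options() type. Assumption: unk_id refers
# to the position of "<unk>" in the list above.
{:ok, model} = Unigram.init(vocab, unk_id: 0, byte_fallback: false)

# An empty specification takes no arguments.
{:ok, empty_model} = Unigram.empty()
```

Both calls return `{:ok, model}`, where `model` is an `IREE.Tokenizers.Model.t()` that can then be supplied to `IREE.Tokenizers.Tokenizer.init/1`. The exact argument shape that function expects is not shown on this page, so the sketch leaves it out.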