Streaming encoder state.
Use this when you want to feed a tokenizer incrementally from multiple binary chunks while preserving the same output you would get from one-shot encoding of the full input.
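The guarantee described above (chunked input yields the same result as one-shot encoding) can be illustrated with a toy stand-in. This is only a sketch under stated assumptions: `ToyStream`, its `feed/2` and `finalize/1` return shapes, and the whitespace "tokenizer" are hypothetical and are not the library's implementation. It shows why a stream must hold back a partial token until later input completes it.

```elixir
defmodule ToyStream do
  # Hypothetical toy: "tokens" are space-separated words, not real token IDs.
  defstruct pending: ""

  def new, do: %__MODULE__{}

  # Returns {completed_words, new_state}. The trailing, possibly partial,
  # word is held back until more input or finalization arrives.
  def feed(%__MODULE__{pending: pending} = state, chunk) when is_binary(chunk) do
    parts = String.split(pending <> chunk, " ")
    {complete, [rest]} = Enum.split(parts, -1)
    {Enum.reject(complete, &(&1 == "")), %{state | pending: rest}}
  end

  # Flushes whatever is still buffered.
  def finalize(%__MODULE__{pending: ""}), do: []
  def finalize(%__MODULE__{pending: rest}), do: [rest]

  # One-shot reference result for comparison.
  def one_shot(text), do: String.split(text, " ", trim: true)
end

# Feed three chunks whose boundaries fall in the middle of words.
{words, state} =
  Enum.reduce(["hel", "lo wor", "ld"], {[], ToyStream.new()}, fn chunk, {acc, st} ->
    {new, st} = ToyStream.feed(st, chunk)
    {acc ++ new, st}
  end)

streamed = words ++ ToyStream.finalize(state)
```

Here `streamed` and `ToyStream.one_shot("hello world")` are the same list, even though no chunk contains a complete word on its own.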
Types
Functions
Feeds a binary chunk into the stream and returns any newly produced token IDs.
Flushes any remaining state and returns the final token IDs.
@spec new(IREE.Tokenizers.Tokenizer.t(), keyword()) :: {:ok, t()} | {:error, {atom(), binary()}}
Creates a new encode stream for the given tokenizer.
Options:
:add_special_tokens - whether post-processing special tokens should be emitted during finalization, defaults to true
:max_chunk_bytes - maximum chunk size expected by feed/2, defaults to 65536
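Because `:max_chunk_bytes` bounds the chunk size expected by `feed/2`, a caller holding a larger binary may need to split it first. The following is a minimal sketch of such a splitter. `ChunkSketch` is a hypothetical helper, not part of the library, and it cuts strictly on byte boundaries, so it may split a multi-byte UTF-8 sequence; whether the stream tolerates that is not stated here.

```elixir
defmodule ChunkSketch do
  # Hypothetical helper: splits a binary into pieces of at most
  # `max_bytes` bytes each, preserving order.
  @default_max_chunk_bytes 65_536

  def split(binary, max_bytes \\ @default_max_chunk_bytes)
      when is_binary(binary) and is_integer(max_bytes) and max_bytes > 0 do
    do_split(binary, max_bytes, [])
  end

  # Nothing left to emit.
  defp do_split(<<>>, _max, acc), do: Enum.reverse(acc)

  # The remainder fits in one chunk.
  defp do_split(binary, max, acc) when byte_size(binary) <= max,
    do: Enum.reverse([binary | acc])

  # Take exactly `max` bytes off the front and continue with the rest.
  defp do_split(binary, max, acc) do
    <<chunk::binary-size(max), rest::binary>> = binary
    do_split(rest, max, [chunk | acc])
  end
end
```

For example, `ChunkSketch.split("hello world", 4)` returns `["hell", "o wo", "rld"]`. Each piece would then be passed to `feed/2` in order.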