ExNlp.Tokenizer.Base (ex_nlp v0.1.0)
Base module for tokenizer implementations.
Defines common types and helper functions for tokenizers.
Summary
Types
span(): A span representing the start and end offsets of a token
token(): A token with text, position, and offset information
Functions
Converts tokens to spans ({start_offset, end_offset} tuples).
Extracts just the text from tokens.
Types
@type span() :: {non_neg_integer(), non_neg_integer()}
A span representing the start and end offsets of a token
@type token() :: ExNlp.Token.t()
A token with text, position, and offset information
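The two helpers described in the summary can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `SketchToken` is a hypothetical stand-in for `ExNlp.Token`, its field names (`text`, `position`, `start_offset`, `end_offset`) are guesses based on the descriptions above, and `to_spans/1` and `to_texts/1` are hypothetical names, since this page does not show the real function names. The example also assumes end offsets are exclusive.

```elixir
defmodule SketchToken do
  # Hypothetical stand-in for ExNlp.Token; the field names are assumptions.
  defstruct [:text, :position, :start_offset, :end_offset]

  @type t :: %__MODULE__{
          text: String.t(),
          position: non_neg_integer(),
          start_offset: non_neg_integer(),
          end_offset: non_neg_integer()
        }
end

defmodule SketchTokenizerHelpers do
  # Hypothetical helpers mirroring the two documented behaviours.

  @type span :: {non_neg_integer(), non_neg_integer()}

  # Converts each token to a {start_offset, end_offset} tuple.
  @spec to_spans([SketchToken.t()]) :: [span()]
  def to_spans(tokens) do
    Enum.map(tokens, fn %SketchToken{start_offset: s, end_offset: e} -> {s, e} end)
  end

  # Keeps only the text of each token.
  @spec to_texts([SketchToken.t()]) :: [String.t()]
  def to_texts(tokens), do: Enum.map(tokens, & &1.text)
end

# Tokens for the input "hello world" (end offsets assumed exclusive).
tokens = [
  struct!(SketchToken, text: "hello", position: 0, start_offset: 0, end_offset: 5),
  struct!(SketchToken, text: "world", position: 1, start_offset: 6, end_offset: 11)
]

IO.inspect(SketchTokenizerHelpers.to_spans(tokens))
IO.inspect(SketchTokenizerHelpers.to_texts(tokens))
```

Under these assumptions, a span is enough to slice the original string back out, for example with `String.slice(input, start, stop - start)` when offsets count graphemes, or `binary_part(input, start, stop - start)` when they count bytes. This page does not say which unit the real offsets use.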