LeXtract.TextChunk (lextract v0.1.2)
Represents a chunk of text from a document, used for processing long documents.
Fields
:text - The chunk text content
:document - Reference to the source document
:token_interval - Token range in the original document
:char_interval - Character range in the original document
:chunk_index - Position in the sequence of chunks
Examples
iex> chunk = %LeXtract.TextChunk{
...> text: "Sample chunk",
...> chunk_index: 0
...> }
iex> chunk.chunk_index
0
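To make the relationship between :text, :chunk_index and a character range more concrete, here is a minimal sketch of how a long string might be cut into indexed chunks. It does not use LeXtract itself: ChunkSketch, its split/2 function and the :char_range tuple are hypothetical stand-ins introduced only for illustration, and the library's own chunking strategy and the shape of LeXtract.CharInterval may well differ.

```elixir
defmodule ChunkSketch do
  # Hypothetical stand-in for %LeXtract.TextChunk{}; not the library's struct.
  defstruct [:text, :chunk_index, :char_range]

  # Splits `text` into chunks of at most `size` graphemes, numbering them from 0.
  def split(text, size) when is_binary(text) and is_integer(size) and size > 0 do
    text
    |> String.graphemes()
    |> Enum.chunk_every(size)
    |> Enum.with_index()
    |> Enum.map(fn {graphemes, index} ->
      start = index * size

      %__MODULE__{
        text: Enum.join(graphemes),
        chunk_index: index,
        # Half-open {start, stop} range counted in graphemes (an assumption;
        # the real CharInterval may be defined differently).
        char_range: {start, start + length(graphemes)}
      }
    end)
  end
end

chunks = ChunkSketch.split("Sample chunk of text", 8)
IO.inspect(Enum.map(chunks, & &1.chunk_index))
```

Each chunk keeps its position in the sequence and the range it covers in the original text, which is the kind of bookkeeping the fields above describe.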
Types
@type t() :: %LeXtract.TextChunk{
  char_interval: LeXtract.CharInterval.t() | nil,
  chunk_index: non_neg_integer() | nil,
  document: LeXtract.Document.t() | nil,
  text: String.t(),
  token_interval: LeXtract.TokenInterval.t() | nil
}
Functions
@spec char_count(t()) :: non_neg_integer()
Returns the character count of the chunk text.
Examples
iex> chunk = %LeXtract.TextChunk{text: "Hello"}
iex> LeXtract.TextChunk.char_count(chunk)
5
@spec text_byte_size(t()) :: non_neg_integer()
Returns the byte size of the chunk text.
Examples
iex> chunk = %LeXtract.TextChunk{text: "Hello"}
iex> LeXtract.TextChunk.text_byte_size(chunk)
5
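Both examples above return 5 because "Hello" is pure ASCII. For text with multi-byte UTF-8 characters the two numbers can differ. The sketch below shows the distinction using only Elixir's standard String.length/1 and byte_size/1. It assumes that char_count/1 counts graphemes like String.length/1 and that text_byte_size/1 matches byte_size/1; the documentation shown here does not confirm how either function is implemented.

```elixir
# "é" is written as the escape \u00E9, which takes two bytes in UTF-8.
text = "h\u00E9llo"

# Assumed to correspond to LeXtract.TextChunk.char_count/1 (not confirmed).
char_count = String.length(text)

# Assumed to correspond to LeXtract.TextChunk.text_byte_size/1 (not confirmed).
byte_count = byte_size(text)

IO.inspect({char_count, byte_count})
# prints {5, 6}
```

The byte count is likely the more useful figure when a size limit is expressed in bytes, while the character count follows what a reader would see.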