# `Chunx.Chunker.Token`
[🔗](https://github.com/preciz/chunx/blob/main/lib/chunx/chunker/token.ex#L1)

Implements a token-based chunking strategy.

Splits text into overlapping chunks based on token count using the given tokenizer.

# `chunk_opts`

```elixir
@type chunk_opts() :: [
  chunk_size: pos_integer(),
  chunk_overlap: pos_integer() | float()
]
```

# `chunk`

```elixir
@spec chunk(binary(), Tokenizers.Tokenizer.t(), chunk_opts()) ::
  {:ok, [Chunk.t()]} | {:error, term()}
```

Splits text into overlapping chunks using the given tokenizer.

## Options
  * `:chunk_size` - Maximum number of tokens per chunk (default: 512)
  * `:chunk_overlap` - Overlap between consecutive chunks, given either as a number of tokens (integer) or as a fraction (float between 0 and 1) (default: 0.25)
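
The interaction between the two options can be sketched as a sliding window over the token list. This is a minimal illustration, not the library's implementation: the module `OverlapSketch` and its functions are hypothetical, whitespace-separated words stand in for real tokenizer output, and it assumes a float overlap is taken as a fraction of `:chunk_size` and truncated to an integer.

```elixir
defmodule OverlapSketch do
  # Hypothetical helper module, not part of Chunx.

  # An integer overlap is already a token count.
  def overlap_tokens(_chunk_size, overlap) when is_integer(overlap), do: overlap

  # ASSUMPTION: a float overlap is a fraction of chunk_size, truncated.
  def overlap_tokens(chunk_size, overlap) when is_float(overlap),
    do: trunc(overlap * chunk_size)

  # Returns windows of at most chunk_size tokens; consecutive windows
  # share overlap_tokens/2 tokens.
  def windows(tokens, chunk_size, overlap) do
    step = chunk_size - overlap_tokens(chunk_size, overlap)

    # The window must advance by at least one token per chunk.
    if step < 1, do: raise(ArgumentError, "overlap must be smaller than chunk_size")

    do_windows(tokens, chunk_size, step, [])
  end

  defp do_windows([], _size, _step, acc), do: Enum.reverse(acc)

  defp do_windows(tokens, size, step, acc) do
    window = Enum.take(tokens, size)

    if length(tokens) <= size do
      # Last window: everything that remains fits in one chunk.
      Enum.reverse([window | acc])
    else
      do_windows(Enum.drop(tokens, step), size, step, [window | acc])
    end
  end
end

tokens = String.split("Some text to split")
result = OverlapSketch.windows(tokens, 3, 1)
```

Under these assumptions each window starts `chunk_size - overlap` tokens after the previous one, so a larger overlap produces more chunks for the same input.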

## Examples

    iex> {:ok, tokenizer} = Tokenizers.Tokenizer.from_pretrained("distilbert/distilbert-base-uncased")
    iex> Chunx.Chunker.Token.chunk("Some text to split", tokenizer, chunk_size: 3, chunk_overlap: 1)
    {
      :ok,
      [
        %Chunx.Chunk{end_byte: 12, start_byte: 0, text: "Some text to", token_count: 3},
        %Chunx.Chunk{end_byte: 18, start_byte: 10, text: "to split", token_count: 2}
      ]
    }
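
The byte offsets in the output above can be checked against the input with `binary_part/3`. The sketch below uses plain maps in place of `%Chunx.Chunk{}` structs so that it runs without the library, and it assumes, as the example output suggests, that `end_byte` is exclusive.

```elixir
text = "Some text to split"

# Plain maps standing in for %Chunx.Chunk{} structs; the values are
# copied from the example output above.
chunks = [
  %{start_byte: 0, end_byte: 12, text: "Some text to"},
  %{start_byte: 10, end_byte: 18, text: "to split"}
]

# Slice the original text with each chunk's offsets
# (ASSUMPTION: end_byte is exclusive).
slices =
  for %{start_byte: s, end_byte: e} <- chunks do
    binary_part(text, s, e - s)
  end

# The bytes shared by two consecutive chunks run from the second
# chunk's start to the first chunk's end.
[first, second] = chunks
overlap = binary_part(text, second.start_byte, first.end_byte - second.start_byte)
```

Here `slices` reproduces each chunk's `text` field, and `overlap` is the single overlapping token requested with `chunk_overlap: 1`.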

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*