# `Arcana.Chunker`
[🔗](https://github.com/georgeguimaraes/arcana/blob/main/lib/arcana/chunker.ex#L1)

Behaviour for text chunking providers used by Arcana.

Arcana accepts any module that implements this behaviour.
One built-in implementation is provided:

- `Arcana.Chunker.Default` - default chunking using the `text_chunker` library

## Configuration

Configure your chunking provider in `config.exs`:

    # Default: text_chunker-based chunking
    config :arcana, chunker: :default

    # Default chunker with custom options
    config :arcana, chunker: {:default, chunk_size: 512, chunk_overlap: 100}

    # Custom function
    config :arcana, chunker: fn text, opts -> [%{text: text, chunk_index: 0, token_count: 10}] end

    # Custom module implementing this behaviour (with or without options)
    config :arcana, chunker: MyApp.SemanticChunker
    config :arcana, chunker: {MyApp.SemanticChunker, model: "..."}

## Implementing a Custom Chunker

Create a module that implements this behaviour:

    defmodule MyApp.SemanticChunker do
      @behaviour Arcana.Chunker

      @impl true
      def chunk(_text, _opts) do
        # Custom chunking logic goes here. This placeholder ignores its
        # arguments and returns a fixed list of chunk maps.
        [
          %{text: "chunk 1", chunk_index: 0, token_count: 50},
          %{text: "chunk 2", chunk_index: 1, token_count: 45}
        ]
      end
    end

Then configure:

    config :arcana, chunker: {MyApp.SemanticChunker, model: "..."}
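For a slightly more concrete picture than the placeholder above, here is a minimal sketch of a chunker that splits text on blank lines. The module name `MyApp.ParagraphChunker` is hypothetical, and the word-count token estimate is an assumption made for illustration, not how Arcana counts tokens. To keep the sketch compilable on its own, it omits `@behaviour Arcana.Chunker` and `@impl true`; in a project that depends on Arcana you would add them as in the skeleton above.

```elixir
defmodule MyApp.ParagraphChunker do
  # Hypothetical example module. In a project that depends on Arcana, add
  # `@behaviour Arcana.Chunker` above and `@impl true` before `chunk/2`.

  def chunk(text, _opts) do
    text
    # Treat one or more blank lines as a paragraph boundary.
    |> String.split(~r/\n\s*\n/, trim: true)
    |> Enum.map(&String.trim/1)
    |> Enum.reject(&(&1 == ""))
    |> Enum.with_index()
    |> Enum.map(fn {paragraph, index} ->
      %{
        text: paragraph,
        chunk_index: index,
        # Rough estimate (assumption): whitespace-separated words, not real tokens.
        token_count: paragraph |> String.split() |> length()
      }
    end)
  end
end
```

Calling `MyApp.ParagraphChunker.chunk("one two\n\nthree", [])` yields two chunk maps with `chunk_index` values `0` and `1`.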

## Chunk Format

Each chunk returned must be a map containing at least the following keys:

  * `:text` - The chunk text content (required)
  * `:chunk_index` - Zero-based index of this chunk (required)
  * `:token_count` - Estimated token count (required)

Additional keys may be included and will be passed through to storage.
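When writing a custom chunker, it can help to check its output against the format described above in a test. The following is a sketch of such a check. The `MyApp.ChunkFormat` module is hypothetical and not part of Arcana, and the type checks (binary text, non-negative integer index and count) are assumptions inferred from the key descriptions rather than documented requirements.

```elixir
defmodule MyApp.ChunkFormat do
  # Hypothetical helper for tests; not part of Arcana.
  @required_keys [:text, :chunk_index, :token_count]

  # Returns true when the map has all required keys with plausible types.
  # Extra keys are allowed, since they are passed through to storage.
  def valid?(chunk) when is_map(chunk) do
    Enum.all?(@required_keys, &Map.has_key?(chunk, &1)) and
      is_binary(chunk.text) and
      is_integer(chunk.chunk_index) and chunk.chunk_index >= 0 and
      is_integer(chunk.token_count) and chunk.token_count >= 0
  end

  def valid?(_other), do: false
end
```

A map such as `%{text: "a", chunk_index: 0, token_count: 1, page: 3}` passes this check, while a map missing `:token_count` does not.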

# `chunk`

```elixir
@callback chunk(text :: String.t(), opts :: keyword()) :: [map()]
```

Splits text into chunks.

Returns a list of chunk maps, each containing at minimum `:text`,
`:chunk_index`, and `:token_count`.

## Options

Options are implementation-specific. Common options include:

  * `:chunk_size` - Maximum chunk size
  * `:chunk_overlap` - Overlap between chunks
  * `:format` - Text format hint (`:plaintext`, `:markdown`, etc.)
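As an illustration of how an implementation might read these options, the sketch below builds overlapping fixed-size windows. `MyApp.WindowChunker` is a hypothetical module, and measuring `:chunk_size` and `:chunk_overlap` in graphemes is an assumption made for this example; the built-in chunker may use different units and defaults. The behaviour annotations are omitted so the block compiles without Arcana.

```elixir
defmodule MyApp.WindowChunker do
  # Hypothetical example; sizes are measured in graphemes (an assumption).

  def chunk(text, opts) do
    size = max(Keyword.get(opts, :chunk_size, 512), 1)
    overlap = max(Keyword.get(opts, :chunk_overlap, 0), 0)
    # Always advance by at least one grapheme so the recursion terminates.
    step = max(size - overlap, 1)

    text
    |> String.graphemes()
    |> windows(size, step)
    |> Enum.with_index()
    |> Enum.map(fn {graphemes, index} ->
      chunk_text = Enum.join(graphemes)

      %{
        text: chunk_text,
        chunk_index: index,
        # Rough estimate (assumption): whitespace-separated words.
        token_count: chunk_text |> String.split() |> length()
      }
    end)
  end

  defp windows([], _size, _step), do: []

  defp windows(graphemes, size, step) do
    window = Enum.take(graphemes, size)

    # Stop once the current window reaches the end of the text.
    if length(graphemes) <= size do
      [window]
    else
      [window | windows(Enum.drop(graphemes, step), size, step)]
    end
  end
end
```

With `chunk_size: 4, chunk_overlap: 2`, the text `"abcdefghij"` produces the windows `"abcd"`, `"cdef"`, `"efgh"`, and `"ghij"`.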

# `chunk`

Chunks text using the configured chunker.

The chunker is a `{module, opts}` tuple, where `module` implements
this behaviour and `opts` is passed to its `chunk/2` callback.
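The idea of dispatching on a `{module, opts}` tuple can be sketched as follows. Both modules here are hypothetical and exist only for illustration; this is not Arcana's implementation, and the real function's argument order may differ.

```elixir
defmodule MyApp.PrefixChunker do
  # Hypothetical chunker used only for this illustration.
  def chunk(text, opts) do
    prefix = Keyword.get(opts, :prefix, "")
    [%{text: prefix <> text, chunk_index: 0, token_count: length(String.split(text))}]
  end
end

defmodule MyApp.ChunkDispatch do
  # Hypothetical helper: unpack the tuple and call the module's chunk/2.
  def run({module, opts}, text) when is_atom(module) and is_list(opts) do
    module.chunk(text, opts)
  end
end
```

For example, `MyApp.ChunkDispatch.run({MyApp.PrefixChunker, prefix: "> "}, "hello world")` calls `MyApp.PrefixChunker.chunk("hello world", prefix: "> ")`.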

# `chunk`

Chunks text using the configured chunker, merging additional options.

Useful when you need to override chunker defaults at call time.
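One plausible way to express call-time overrides is `Keyword.merge/2`, where values from the second list win on duplicate keys. The sketch below assumes that precedence; the modules `MyApp.EchoOptsChunker` and `MyApp.ChunkOverrides` are hypothetical and not part of Arcana.

```elixir
defmodule MyApp.EchoOptsChunker do
  # Hypothetical chunker that records the options it received in an extra key.
  def chunk(text, opts) do
    [%{text: text, chunk_index: 0, token_count: length(String.split(text)), opts: opts}]
  end
end

defmodule MyApp.ChunkOverrides do
  # Hypothetical helper: call-time options override the configured defaults.
  def run({module, default_opts}, text, call_opts) do
    module.chunk(text, Keyword.merge(default_opts, call_opts))
  end
end
```

Given configured defaults of `chunk_size: 512, chunk_overlap: 100` and a call-time `chunk_size: 256`, the chunker receives `chunk_size: 256` together with the unchanged `chunk_overlap: 100`.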

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*
