Rag.Chunker.FormatAware (rag v0.3.4)

Format-aware chunking using TextChunker.

Splits code and markup at language-specific separators (function definitions, class declarations, heading levels, etc.), so that chunk boundaries follow the structure of the document.

Options

  • format - Document format (default: :plaintext)
  • chunk_size - Maximum chunk size in code points (default: 2000)
  • chunk_overlap - Overlap between consecutive chunks (default: 200)
  • size_fn - Custom size function (String.t() -> non_neg_integer()) (default: nil)

Summary

Functions

chunk(chunker, text, opts)

Split text into format-aware chunks using TextChunker.

default_opts()

Returns default options for the format-aware chunker.

Types

t()

@type t() :: %Rag.Chunker.FormatAware{
  chunk_overlap: non_neg_integer(),
  chunk_size: pos_integer(),
  format: atom(),
  size_fn: (String.t() -> non_neg_integer()) | nil
}

Functions

chunk(chunker, text, opts)

@spec chunk(t(), String.t(), keyword()) :: [Rag.Chunker.Chunk.t()]

Split text into format-aware chunks using TextChunker.
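
A minimal usage sketch, based only on the struct fields and the `chunk/3` spec shown on this page. It assumes the `rag` package is available as a dependency, that `:markdown` is among the supported formats (only `:plaintext` is confirmed here as the default), and that an empty keyword list is acceptable for `opts`, whose keys are not documented on this page.

```elixir
alias Rag.Chunker.FormatAware

# All four struct fields are set explicitly, since this page does not
# say whether the struct defines defaults for them.
chunker = %FormatAware{
  format: :markdown,   # assumption: :markdown is a supported format
  chunk_size: 500,
  chunk_overlap: 50,
  size_fn: nil         # nil falls back to the built-in size measure
}

text = """
# Title

Intro paragraph.

## Section

Body text.
"""

# Per the spec, the result is a list of Rag.Chunker.Chunk structs.
chunks = FormatAware.chunk(chunker, text, [])
IO.inspect(length(chunks), label: "chunk count")
```

To measure chunk size in bytes rather than code points, a function such as `&byte_size/1` matches the documented `size_fn` type and could be passed in place of `nil`.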

default_opts()

@spec default_opts() :: keyword()

Returns default options for the format-aware chunker.
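
One way the defaults might be combined with overrides is sketched below. It assumes that the keys returned by `default_opts/0` match the struct fields listed under Options (`format`, `chunk_size`, `chunk_overlap`, `size_fn`); this page does not confirm that.

```elixir
alias Rag.Chunker.FormatAware

# Start from the library defaults and override two values.
opts = Keyword.merge(FormatAware.default_opts(), chunk_size: 1_000, chunk_overlap: 100)

# Kernel.struct/2 builds the struct from the keyword list and ignores
# any keys that are not fields of the struct.
chunker = struct(FormatAware, opts)

chunks = FormatAware.chunk(chunker, "Some plain text to split.", [])
```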