LeXtract.Config (lextract v0.1.2)

Configuration for extraction operations, validated with NimbleOptions.

Examples

iex> config = LeXtract.Config.new(model: "gpt-4", provider: :openai, prompt: "test", max_char_buffer: 2000)
iex> config.model
"gpt-4"

iex> config = LeXtract.Config.default()
iex> config.batch_size
5

Summary

Functions

  • default() - Returns default configuration.

  • from_keyword(opts) - Converts a keyword list to a Config struct with validation.

  • from_keyword!(opts) - Converts a keyword list to a Config struct, raising on error.

  • new(opts \\ []) - Creates and validates configuration from keyword list.

  • to_keyword(config) - Converts a Config struct to a keyword list.

  • validate(opts) - Validates configuration keyword list or struct.

  • validate!(opts) - Validates configuration and raises on error.

Types

options()

@type options() :: [
  prompt: binary(),
  examples: [term()],
  template_file: binary(),
  model: binary(),
  provider: atom(),
  api_key: binary(),
  format: term(),
  fence_output: boolean(),
  use_structured_output: boolean(),
  max_char_buffer: pos_integer(),
  chunk_overlap: non_neg_integer(),
  batch_size: pos_integer(),
  extraction_passes: pos_integer(),
  max_concurrency: pos_integer(),
  temperature: float() | nil,
  max_tokens: pos_integer(),
  timeout: pos_integer(),
  attribute_suffix: binary()
]
  • :prompt (String.t/0) - Extraction prompt/description.

  • :examples (list of term/0) - List of example extractions (maps with :text and :extractions keys). The default value is [].

  • :template_file (String.t/0) - Path to template file (.json or .yaml).

  • :model (String.t/0) - Required. LLM model identifier (e.g., "gpt-4o-mini", "gemini-2.5-flash").

  • :provider (atom/0) - Required. LLM provider (:openai, :gemini, :anthropic, etc.).

  • :api_key (String.t/0) - API key for the LLM provider.

  • :format - Output format for extractions (:json or :yaml). The default value is :yaml.

  • :fence_output (boolean/0) - Expect fenced code blocks in LLM response. The default value is false.

  • :use_structured_output (boolean/0) - Use structured output mode (generate_object). The default value is false.

  • :max_char_buffer (pos_integer/0) - Maximum chunk size in characters. The default value is 1000.

  • :chunk_overlap (non_neg_integer/0) - Character overlap between chunks. The default value is 200.

  • :batch_size (pos_integer/0) - Number of chunks per LLM batch. The default value is 5.

  • :extraction_passes (pos_integer/0) - Number of extraction passes for multi-pass extraction. The default value is 1.

  • :max_concurrency (pos_integer/0) - Maximum concurrent LLM requests. The default value is 8.

  • :temperature - LLM sampling temperature (a float between 0.0 and 1.0). The default value is 0.2.

  • :max_tokens (pos_integer/0) - Maximum tokens in LLM response. The default value is 4096.

  • :timeout (pos_integer/0) - Request timeout in milliseconds. The default value is 60000.

  • :attribute_suffix (String.t/0) - Suffix for attribute keys in structured output. The default value is "_attributes".
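
To illustrate how :max_char_buffer and :chunk_overlap might interact, here is a minimal sketch of a sliding-window chunker in plain Elixir. ChunkSketch and chunk/3 are hypothetical names and are not part of LeXtract. The library's actual chunking may differ, for example by splitting on sentence or token boundaries.

```elixir
defmodule ChunkSketch do
  # Hypothetical helper, NOT part of LeXtract. Splits `text` into windows of
  # at most `max_char_buffer` characters, where each window shares
  # `chunk_overlap` characters with the window before it.
  def chunk(text, max_char_buffer, chunk_overlap)
      when is_binary(text) and is_integer(max_char_buffer) and
             is_integer(chunk_overlap) and max_char_buffer > 0 and
             chunk_overlap >= 0 and chunk_overlap < max_char_buffer do
    # Each new window starts `step` characters after the previous one.
    step = max_char_buffer - chunk_overlap
    do_chunk(String.graphemes(text), max_char_buffer, step, [])
  end

  # Empty input produces no chunks.
  defp do_chunk([], _size, _step, acc), do: Enum.reverse(acc)

  defp do_chunk(graphemes, size, step, acc) do
    {window, rest} = Enum.split(graphemes, size)
    chunk = Enum.join(window)

    if rest == [] do
      # The window reached the end of the text, so this is the last chunk.
      Enum.reverse([chunk | acc])
    else
      # Advance by `step`, keeping the last `size - step` characters as overlap.
      do_chunk(Enum.drop(graphemes, step), size, step, [chunk | acc])
    end
  end
end
```

Under this sketch, the default values of 1000 and 200 would advance each window by 800 characters, so a larger :chunk_overlap means more chunks (and more LLM requests) for the same input.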

t()

@type t() :: %LeXtract.Config{
  api_key: String.t() | nil,
  attribute_suffix: String.t(),
  batch_size: pos_integer(),
  chunk_overlap: non_neg_integer(),
  examples: [map()],
  extraction_passes: pos_integer(),
  fence_output: boolean(),
  format: :json | :yaml,
  max_char_buffer: pos_integer(),
  max_concurrency: pos_integer(),
  max_tokens: pos_integer() | nil,
  model: String.t() | nil,
  prompt: String.t() | nil,
  provider: atom() | nil,
  temperature: float() | nil,
  template_file: String.t() | nil,
  timeout: pos_integer(),
  use_structured_output: boolean()
}

Functions

default()

@spec default() :: t()

Returns default configuration.

Examples

iex> config = LeXtract.Config.default()
iex> config.batch_size
5

from_keyword(opts)

@spec from_keyword(keyword()) :: {:ok, t()} | {:error, Exception.t()}

Converts a keyword list to a Config struct with validation.

This function is useful for maintaining backward compatibility with code that uses keyword lists. It validates the options and returns {:ok, config} on success or {:error, exception} on failure.

Examples

iex> {:ok, config} = LeXtract.Config.from_keyword(model: "gpt-4", provider: :openai, prompt: "test")
iex> config.model
"gpt-4"

iex> {:error, _} = LeXtract.Config.from_keyword(model: "gpt-4")

from_keyword!(opts)

@spec from_keyword!(keyword()) :: t()

Converts a keyword list to a Config struct, raising on error.

Examples

iex> config = LeXtract.Config.from_keyword!(model: "gpt-4", provider: :openai, prompt: "test")
iex> config.model
"gpt-4"

new(opts \\ [])

@spec new(keyword()) :: t()

Creates and validates configuration from keyword list.

Validates options using NimbleOptions and raises NimbleOptions.ValidationError if invalid.

Examples

iex> config = LeXtract.Config.new(model: "gpt-4", provider: :openai, prompt: "test", max_char_buffer: 2000)
iex> config.model
"gpt-4"

iex> LeXtract.Config.new(model: "gpt-4", provider: :openai, prompt: "test", max_char_buffer: -1)
** (NimbleOptions.ValidationError) invalid value for :max_char_buffer option: expected positive integer, got: -1

iex> LeXtract.Config.new(model: "gpt-4", provider: :openai, prompt: "test", temperature: 1.5)
** (NimbleOptions.ValidationError) invalid value for :temperature option: must be a float between 0.0 and 1.0, got: 1.5

iex> LeXtract.Config.new(model: "gpt-4", provider: :openai, prompt: "test", format: :xml)
** (NimbleOptions.ValidationError) invalid value for :format option: expected one of [:json, :yaml], got: :xml

to_keyword(config)

@spec to_keyword(t()) :: keyword()

Converts a Config struct to a keyword list.

This is useful for backward compatibility when functions expect keyword lists.

Examples

iex> config = LeXtract.Config.new(model: "gpt-4", provider: :openai, prompt: "test")
iex> kw = LeXtract.Config.to_keyword(config)
iex> Keyword.get(kw, :model)
"gpt-4"

validate(opts)

@spec validate(keyword() | t()) :: {:ok, t()} | {:error, Exception.t()}

Validates configuration keyword list or struct.

Returns {:ok, validated_config} on success or {:error, validation_error} on failure.

Examples

iex> {:ok, config} = LeXtract.Config.validate(max_char_buffer: 1000, model: "gpt-4", provider: :openai, prompt: "test")
iex> config.max_char_buffer
1000

iex> {:error, error} = LeXtract.Config.validate(max_char_buffer: -1, model: "gpt-4", provider: :openai, prompt: "test")
iex> String.contains?(Exception.message(error), "expected positive integer")
true

iex> {:error, error} = LeXtract.Config.validate(temperature: 1.5, model: "gpt-4", provider: :openai, prompt: "test")
iex> String.contains?(Exception.message(error), "must be a float between 0.0 and 1.0")
true

validate!(opts)

@spec validate!(keyword() | t()) :: t()

Validates configuration and raises on error.

Returns the validated config struct or raises LeXtract.Error.Invalid.Config.

Examples

iex> LeXtract.Config.validate!(max_char_buffer: 1000, model: "gpt-4", provider: :openai, prompt: "test")
%LeXtract.Config{max_char_buffer: 1000, model: "gpt-4", provider: :openai, prompt: "test"}

iex> LeXtract.Config.validate!(max_char_buffer: -1, model: "gpt-4", provider: :openai)
** (LeXtract.Error.Invalid.Config) Configuration validation failed: invalid value for :max_char_buffer option: expected positive integer, got: -1
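
The relationship between a tuple-returning function such as validate/1 and its raising counterpart validate!/1 follows a common Elixir convention. The following is a minimal, self-contained sketch of that convention. BangSketch, InvalidConfig, and the single check on :max_char_buffer are hypothetical stand-ins and do not reflect how LeXtract implements validation internally.

```elixir
defmodule BangSketch do
  # Hypothetical stand-in for a library-specific exception.
  defmodule InvalidConfig do
    defexception [:message]
  end

  # Tuple-returning variant: callers pattern match on {:ok, _} or {:error, _}.
  # Only :max_char_buffer is checked here, purely for illustration.
  def validate(opts) when is_list(opts) do
    case Keyword.get(opts, :max_char_buffer, 1000) do
      n when is_integer(n) and n > 0 ->
        {:ok, opts |> Map.new() |> Map.put(:max_char_buffer, n)}

      other ->
        {:error,
         %InvalidConfig{
           message:
             "invalid value for :max_char_buffer option: " <>
               "expected positive integer, got: #{inspect(other)}"
         }}
    end
  end

  # Raising variant: unwraps the success tuple or raises the returned exception.
  def validate!(opts) do
    case validate(opts) do
      {:ok, config} -> config
      {:error, error} -> raise error
    end
  end
end
```

The tuple form suits code paths where invalid input is expected and should be handled, such as user-supplied options. The raising form suits startup code or scripts where an invalid configuration should stop execution immediately.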