Kreuzberg.Helpers (kreuzberg v4.0.8)
Shared helper functions for Kreuzberg extraction modules.
This module provides common utility functions used across multiple modules (Kreuzberg, BatchAPI, CacheAPI) to reduce code duplication and ensure consistent behavior.
Features
- Key normalization for maps (atom/string conversion)
- Value normalization for nested structures
- Configuration validation
- Result struct conversion from native responses
- Statistics key normalization for cache operations
Summary
Functions
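- call_native(nil_func, config_func, config): Call native extraction function with configuration validation pattern.
- into_result(map): Convert a native extraction response map to ExtractionResult struct.
- normalize_key(key): Convert atom, string, or any value to string key.
- normalize_map_keys(map): Recursively normalize map keys to strings.
- normalize_stats_keys(map): Normalize keys in statistics maps for cache operations.
- normalize_value(value): Recursively normalize nested values in maps and lists.
- run_validators(validators, data \\ nil): Run a list of validators with an optional data parameter.
- validate_config(config): Validate extraction configuration from various formats.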
Functions
@spec call_native( (-> {:ok, any()} | {:error, String.t()}), (map() -> {:ok, any()} | {:error, String.t()}), nil | Kreuzberg.ExtractionConfig.t() | map() | keyword() ) :: {:ok, any()} | {:error, String.t()}
Call native extraction function with configuration validation pattern.
Handles the common pattern of:
- Checking if config is nil (bypass validation and call simple native function)
- Validating config if provided
- Converting config to map
- Calling native function with config options
Parameters
- nil_func - 0-arity function to call when config is nil
- config_func - 1-arity function to call with config_map
- config - Configuration to validate (nil, ExtractionConfig, map, or keyword list)
Returns
- {:ok, result} - Native function succeeded
- {:error, reason} - Configuration invalid or native function failed
Examples
iex> Kreuzberg.Helpers.call_native(
...> fn -> Native.extract(input, mime_type) end,
...> fn cfg_map -> Native.extract_with_options(input, mime_type, cfg_map) end,
...> nil
...> )
{:ok, result_map}
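The four steps listed above could be sketched roughly as follows. This is a minimal illustration, not the library's source: the module name CallNativeSketch and the private helpers validate/1 and to_map/1 are hypothetical stand-ins for whatever validation and conversion Kreuzberg performs internally.

```elixir
defmodule CallNativeSketch do
  # Hypothetical validation: maps (including structs) and keyword lists pass.
  defp validate(config) when is_map(config), do: {:ok, config}

  defp validate(config) when is_list(config) do
    if Keyword.keyword?(config), do: {:ok, config}, else: {:error, "invalid configuration"}
  end

  defp validate(_other), do: {:error, "invalid configuration"}

  # Hypothetical conversion of a validated config into a plain map.
  defp to_map(%_{} = struct), do: Map.from_struct(struct)
  defp to_map(config) when is_map(config), do: config
  defp to_map(config) when is_list(config), do: Map.new(config)

  # nil config: bypass validation and call the 0-arity function.
  def call_native(nil_func, _config_func, nil), do: nil_func.()

  # Otherwise validate, convert to a map, then call the 1-arity function.
  # A failed validation falls out of `with` as {:error, reason}.
  def call_native(_nil_func, config_func, config) do
    with {:ok, valid} <- validate(config) do
      config_func.(to_map(valid))
    end
  end
end
```

Passing both functions as arguments keeps the native calls out of the helper, so the same branching can serve any pair of native entry points.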
@spec into_result(map()) :: {:ok, Kreuzberg.ExtractionResult.t()} | {:error, String.t()}
Convert a native extraction response map to ExtractionResult struct.
Takes a map from the native layer and creates a properly typed ExtractionResult struct, normalizing all keys to strings in the process.
Parameters
- map - Raw map from native extraction response
Returns
- {:ok, ExtractionResult.t()} - Successfully converted result struct
- {:error, reason} - Conversion failed (missing required fields)
Examples
iex> native_response = %{
...> "content" => "extracted text",
...> "mime_type" => "application/pdf",
...> metadata: %{pages: 5}
...> }
iex> {:ok, result} = Kreuzberg.Helpers.into_result(native_response)
iex> result.content
"extracted text"
iex> invalid_response = %{"content" => "text"}
iex> {:error, reason} = Kreuzberg.Helpers.into_result(invalid_response)
iex> String.contains?(reason, "mime_type")
true
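A conversion of this kind might look like the sketch below. It assumes that "content" and "mime_type" are the required fields (suggested by the examples, not confirmed elsewhere) and uses a hypothetical Result struct in place of Kreuzberg.ExtractionResult, whose real fields may differ. Only top-level keys are normalized here, to keep the sketch short.

```elixir
defmodule IntoResultSketch do
  defmodule Result do
    # Hypothetical stand-in for Kreuzberg.ExtractionResult.
    defstruct [:content, :mime_type, metadata: %{}]
  end

  # Assumed list of required keys; the library's actual list may be longer.
  @required ["content", "mime_type"]

  def into_result(map) when is_map(map) do
    # Stringify top-level keys so atom and string keys are treated alike.
    normalized = Map.new(map, fn {key, value} -> {to_string(key), value} end)

    case Enum.reject(@required, &Map.has_key?(normalized, &1)) do
      [] ->
        {:ok,
         %Result{
           content: normalized["content"],
           mime_type: normalized["mime_type"],
           metadata: Map.get(normalized, "metadata", %{})
         }}

      missing ->
        {:error, "missing required fields: " <> Enum.join(missing, ", ")}
    end
  end
end
```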
Convert atom, string, or any value to string key.
Handles conversion of different key types to string format, used for normalizing map keys from native responses.
Parameters
- key - Key as atom, binary, or any other type
Returns
- String representation of the key
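One way to express this conversion is with a function clause per key type. The module name NormalizeKeySketch is hypothetical, and the fallback clause is an assumption: it relies on to_string/1, so it only covers terms that implement the String.Chars protocol (integers and floats, for example) and raises for tuples or maps.

```elixir
defmodule NormalizeKeySketch do
  # Strings are already in the target format.
  def normalize_key(key) when is_binary(key), do: key

  # Atoms are converted by name.
  def normalize_key(key) when is_atom(key), do: Atom.to_string(key)

  # Assumption: anything else goes through the String.Chars protocol.
  def normalize_key(key), do: to_string(key)
end
```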
Examples
iex> Kreuzberg.Helpers.normalize_key(:content)
"content"
iex> Kreuzberg.Helpers.normalize_key("content")
"content"
iex> Kreuzberg.Helpers.normalize_key(123)
"123"
Recursively normalize map keys to strings.
Converts all map keys to strings, handling nested maps and lists. Non-map values are passed through unchanged.
Parameters
- map - Map or any value to normalize
Returns
- Map with string keys (for maps)
- Value unchanged (for non-maps)
Examples
iex> Kreuzberg.Helpers.normalize_map_keys(%{a: 1, "b" => 2})
%{"a" => 1, "b" => 2}
iex> Kreuzberg.Helpers.normalize_map_keys(%{"nested" => %{key: "value"}})
%{"nested" => %{"key" => "value"}}
iex> Kreuzberg.Helpers.normalize_map_keys([1, 2, 3])
[1, 2, 3]
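The recursion can be sketched as two mutually recursive functions: one that rewrites a map's keys and one that decides how to descend into a value. This is an illustrative sketch under the assumption that structs are left untouched; NormalizeMapSketch is a hypothetical module, not the library's code.

```elixir
defmodule NormalizeMapSketch do
  # Plain maps: stringify each key and normalize each value.
  def normalize_map_keys(map) when is_map(map) and not is_struct(map) do
    Map.new(map, fn {key, value} -> {to_string(key), normalize_value(value)} end)
  end

  # Everything else is returned unchanged.
  def normalize_map_keys(other), do: other

  # Nested maps go back through key normalization.
  def normalize_value(value) when is_map(value), do: normalize_map_keys(value)

  # Lists are normalized element by element.
  def normalize_value(value) when is_list(value), do: Enum.map(value, &normalize_value/1)

  # Scalars pass through.
  def normalize_value(value), do: value
end
```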
Normalize keys in statistics maps for cache operations.
Converts all keys in statistics maps to strings, used specifically for normalizing cache statistics from the native layer.
This is an alias for normalize_map_keys/1 to reduce duplication.
Parameters
- map - Map or any value (only maps are normalized)
Returns
- Map with string keys (for maps)
- Value unchanged (for non-maps)
Examples
iex> Kreuzberg.Helpers.normalize_stats_keys(%{total_files: 42, "total_size_mb" => 128.5})
%{"total_files" => 42, "total_size_mb" => 128.5}
iex> Kreuzberg.Helpers.normalize_stats_keys([1, 2, 3])
[1, 2, 3]
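An alias like this is commonly written with defdelegate. The sketch below shows that approach under the assumption that both functions live in the same module; whether the library uses defdelegate or a plain wrapper function is not stated here. StatsKeysSketch is hypothetical, and its normalizer handles only top-level keys for brevity.

```elixir
defmodule StatsKeysSketch do
  # Simplified stand-in for the shared normalizer (top-level keys only).
  def normalize_map_keys(map) when is_map(map) do
    Map.new(map, fn {key, value} -> {to_string(key), value} end)
  end

  def normalize_map_keys(other), do: other

  # The alias: a cache-specific name that forwards to the shared function.
  defdelegate normalize_stats_keys(map), to: __MODULE__, as: :normalize_map_keys
end
```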
Recursively normalize nested values in maps and lists.
Handles normalization of:
- Nested maps: applies key normalization recursively
- Lists: normalizes each element
- Other values: passed through unchanged
Parameters
- value - Value to normalize (any type)
Returns
- Normalized value with all nested keys converted to strings
Examples
iex> Kreuzberg.Helpers.normalize_value(%{key: "value"})
%{"key" => "value"}
iex> Kreuzberg.Helpers.normalize_value([%{a: 1}, %{b: 2}])
[%{"a" => 1}, %{"b" => 2}]
iex> Kreuzberg.Helpers.normalize_value("string")
"string"
Run a list of validators with an optional data parameter.
Generic reducer for running validators that may or may not accept a data parameter. Used to reduce duplication in run_validators and run_final_validators patterns.
Parameters
- validators - List of validator modules to run
- data - Optional data to pass to validators (defaults to nil)
Returns
- :ok - All validators passed
- {:error, reason} - First validator that failed
Examples
iex> Kreuzberg.Helpers.run_validators([MyValidator])
:ok
iex> Kreuzberg.Helpers.run_validators([MyValidator], extraction_result)
:ok
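A reducer of this kind is typically built on Enum.reduce_while/3, which stops at the first failure. The sketch below assumes every validator module exposes a validate/1 callback that receives the data (or nil); that callback name and the two example validators are hypothetical, chosen only for illustration.

```elixir
defmodule ValidatorSketch do
  defmodule AlwaysOk do
    # Hypothetical validator that accepts anything.
    def validate(_data), do: :ok
  end

  defmodule RequiresContent do
    # Hypothetical validator that needs a non-empty :content string.
    def validate(%{content: content}) when is_binary(content) and content != "", do: :ok
    def validate(_data), do: {:error, "content is missing or empty"}
  end

  # Run validators in order, halting on the first {:error, reason}.
  def run_validators(validators, data \\ nil) do
    Enum.reduce_while(validators, :ok, fn validator, :ok ->
      case validator.validate(data) do
        :ok -> {:cont, :ok}
        {:error, _reason} = error -> {:halt, error}
      end
    end)
  end
end
```

Because the reduction halts early, validators listed after a failing one are never called.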
@spec validate_config(nil | Kreuzberg.ExtractionConfig.t() | map() | keyword()) :: {:ok, nil | Kreuzberg.ExtractionConfig.t() | map() | keyword()} | {:error, String.t()}
Validate extraction configuration from various formats.
Accepts nil, ExtractionConfig structs, maps, or keyword lists. Validates structs and passes through other formats for later processing.
Parameters
- config - Configuration as:
  - nil - No configuration (valid, returns {:ok, nil})
  - ExtractionConfig.t() - Struct format (validated)
  - map() - Key-value configuration (passed through)
  - keyword() - Keyword list configuration (passed through)
Returns
- {:ok, config} - Valid configuration
- {:error, reason} - Configuration validation failed
Examples
iex> Kreuzberg.Helpers.validate_config(nil)
{:ok, nil}
iex> Kreuzberg.Helpers.validate_config(%Kreuzberg.ExtractionConfig{})
{:ok, %Kreuzberg.ExtractionConfig{...}}
iex> Kreuzberg.Helpers.validate_config(%{"extract_images" => true})
{:ok, %{"extract_images" => true}}
iex> Kreuzberg.Helpers.validate_config(extract_images: true)
{:ok, [extract_images: true]}
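The dispatch by format can be sketched with one function clause per input type. This is a minimal illustration, not the library's implementation: the Config struct stands in for Kreuzberg.ExtractionConfig, its single field and the boolean check are hypothetical, and the final catch-all clause is an assumption, since the spec does not cover other input types.

```elixir
defmodule ValidateConfigSketch do
  defmodule Config do
    # Hypothetical stand-in for Kreuzberg.ExtractionConfig.
    defstruct extract_images: false
  end

  # No configuration is valid.
  def validate_config(nil), do: {:ok, nil}

  # Structs are validated; here with one illustrative rule.
  def validate_config(%Config{extract_images: flag} = config) do
    if is_boolean(flag) do
      {:ok, config}
    else
      {:error, "extract_images must be a boolean"}
    end
  end

  # Plain maps and keyword lists pass through for later processing.
  def validate_config(config) when is_map(config) or is_list(config), do: {:ok, config}

  # Assumption: anything else is rejected.
  def validate_config(_other), do: {:error, "unsupported configuration format"}
end
```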