Legacy API functions using deprecated patterns.
This module contains deprecated functions that use the old configuration approach. These functions will be removed in v2.0.0.
Users should migrate to the new Kreuzberg module which uses the modern
ExtractionConfig structure and nested configuration maps.
Migration Guide
The major change in v2.0.0 is moving from simple boolean flags and flat
configuration to a structured ExtractionConfig with nested configuration maps.
Old Pattern (Deprecated)
{:ok, result} = Kreuzberg.LegacyAPI.extract_with_ocr(input, "application/pdf", true)
{:ok, result} = Kreuzberg.LegacyAPI.extract_with_chunking(input, "text/plain", 1024, 100)
New Pattern (Recommended)
config = %Kreuzberg.ExtractionConfig{
ocr: %{"enabled" => true, "backend" => "tesseract"},
chunking: %{"max_chars" => 1024, "max_overlap" => 100}
}
{:ok, result} = Kreuzberg.extract(input, "application/pdf", config)
Summary
Functions
Extract file using deprecated simple parameter list.
Extract content with deprecated chunking parameters.
Extract content with deprecated boolean OCR parameter.
Extract with keyword list configuration (deprecated format).
Validate extraction request using deprecated format.
Functions
@spec extract_file_legacy(String.t() | Path.t(), String.t() | nil, keyword()) :: {:ok, Kreuzberg.ExtractionResult.t()} | {:error, String.t()}
Extract file using deprecated simple parameter list.
This function is deprecated. Use Kreuzberg.extract_file/3 which accepts
modern ExtractionConfig structures.
Parameters
path - File path (String or Path.t())
mime_type - MIME type (optional, string or nil)
opts - Keyword list of deprecated options:
  :ocr - Boolean to enable OCR
  :chunk_size - Maximum chunk size
  :use_cache - Enable caching
Returns
{:ok, ExtractionResult.t()} - Successfully extracted content
{:error, reason} - Extraction failed
Deprecated
This function will be removed in v2.0.0. Use the modern API:
config = %ExtractionConfig{
ocr: %{"enabled" => true},
chunking: %{"max_chars" => 1024},
use_cache: true
}
Kreuzberg.extract_file(path, mime_type, config)
Examples
# Deprecated way (old):
{:ok, result} = Kreuzberg.LegacyAPI.extract_file_legacy(
"document.pdf",
"application/pdf",
ocr: true,
chunk_size: 1024,
use_cache: true
)
# Recommended way (new):
config = %ExtractionConfig{
ocr: %{"enabled" => true},
chunking: %{"max_chars" => 1024},
use_cache: true
}
{:ok, result} = Kreuzberg.extract_file("document.pdf", "application/pdf", config)
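The option-by-option mapping in the example above can be sketched as a small translation helper. This is only an illustration: LegacyOptsSketch and to_config_fields/1 are hypothetical names that are not part of Kreuzberg, and the helper covers just the three legacy options listed under Parameters.

```elixir
defmodule LegacyOptsSketch do
  @moduledoc false
  # Hypothetical helper, not part of Kreuzberg. It translates the deprecated
  # keyword options into the nested-map shape used by the new-style examples.

  @spec to_config_fields(keyword()) :: map()
  def to_config_fields(opts) when is_list(opts) do
    Enum.reduce(opts, %{}, fn
      # ocr: true  ->  ocr: %{"enabled" => true}
      {:ocr, enabled}, acc when is_boolean(enabled) ->
        Map.put(acc, :ocr, %{"enabled" => enabled})

      # chunk_size: 1024  ->  chunking: %{"max_chars" => 1024}
      {:chunk_size, size}, acc when is_integer(size) ->
        Map.put(acc, :chunking, %{"max_chars" => size})

      # use_cache keeps its name and stays a top-level flag
      {:use_cache, flag}, acc when is_boolean(flag) ->
        Map.put(acc, :use_cache, flag)

      # Anything else is rejected loudly instead of being dropped silently
      {key, value}, _acc ->
        raise ArgumentError,
              "unsupported legacy option or value: #{inspect({key, value})}"
    end)
  end
end
```

Assuming ExtractionConfig defines the ocr, chunking, and use_cache fields as the examples suggest, the resulting map could then be turned into a config with struct(Kreuzberg.ExtractionConfig, fields).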
@spec extract_with_chunking(binary(), String.t(), integer(), integer()) :: {:ok, Kreuzberg.ExtractionResult.t()} | {:error, String.t()}
Extract content with deprecated chunking parameters.
This function is deprecated. Use Kreuzberg.extract/3 with the new
ExtractionConfig structure containing a chunking nested configuration.
Parameters
input - Binary document data
mime_type - MIME type of the document
chunk_size - Maximum chunk size (deprecated parameter style)
overlap - Overlap between chunks (deprecated parameter style)
Returns
{:ok, ExtractionResult.t()} - Successfully extracted and chunked content
{:error, reason} - Extraction failed
Deprecated
This function will be removed in v2.0.0. Use:
config = %ExtractionConfig{
chunking: %{"max_chars" => 1024, "max_overlap" => 100}
}
Kreuzberg.extract(input, mime_type, config)
Examples
# Deprecated way (old):
{:ok, result} = Kreuzberg.LegacyAPI.extract_with_chunking(
pdf_binary,
"application/pdf",
1024,
100
)
# Recommended way (new):
config = %ExtractionConfig{
chunking: %{"max_chars" => 1024, "max_overlap" => 100}
}
{:ok, result} = Kreuzberg.extract(pdf_binary, "application/pdf", config)
@spec extract_with_ocr(binary(), String.t(), boolean()) :: {:ok, Kreuzberg.ExtractionResult.t()} | {:error, String.t()}
Extract content with deprecated boolean OCR parameter.
This function is deprecated. Use Kreuzberg.extract/3 with the new
ExtractionConfig structure containing an ocr nested configuration map.
Parameters
input - Binary document data
mime_type - MIME type of the document
enable_ocr - Boolean flag to enable OCR (deprecated parameter style)
Returns
{:ok, ExtractionResult.t()} - Successfully extracted content
{:error, reason} - Extraction failed
Deprecated
This function will be removed in v2.0.0. Use:
config = %ExtractionConfig{ocr: %{"enabled" => true}}
Kreuzberg.extract(input, mime_type, config)
Examples
# Deprecated way (old):
{:ok, result} = Kreuzberg.LegacyAPI.extract_with_ocr(pdf_binary, "application/pdf", true)
# Recommended way (new):
config = %ExtractionConfig{ocr: %{"enabled" => true}}
{:ok, result} = Kreuzberg.extract(pdf_binary, "application/pdf", config)
@spec extract_with_options(binary(), String.t(), keyword()) :: {:ok, Kreuzberg.ExtractionResult.t()} | {:error, String.t()}
Extract with keyword list configuration (deprecated format).
This function is deprecated. Use Kreuzberg.extract/3 with the modern
ExtractionConfig structure.
Deprecated
The keyword list configuration format is deprecated in favor of the
structured ExtractionConfig with nested configuration maps.
Examples
# Deprecated way (old):
{:ok, result} = Kreuzberg.LegacyAPI.extract_with_options(
input,
"application/pdf",
use_cache: true,
force_ocr: false,
output_format: "markdown"
)
# Recommended way (new):
config = %ExtractionConfig{
use_cache: true,
force_ocr: false,
output_format: "markdown"
}
{:ok, result} = Kreuzberg.extract(input, "application/pdf", config)
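When the legacy keyword keys already match the struct's field names, as in the example above, the conversion can be done mechanically with Kernel.struct!/2. The sketch below uses a stand-in struct so that it runs on its own; SketchConfig, its default values, and KeywordMigrationSketch are hypothetical and much smaller than the real Kreuzberg.ExtractionConfig.

```elixir
defmodule SketchConfig do
  # Stand-in struct with only the three fields used in the example above.
  # The defaults here are assumptions made for illustration.
  defstruct use_cache: true, force_ocr: false, output_format: nil
end

defmodule KeywordMigrationSketch do
  # Builds a struct from a legacy keyword list. struct!/2 raises KeyError
  # when the list contains a key the struct does not define, so a misspelled
  # or removed option fails early instead of being ignored.
  @spec from_keyword(keyword()) :: %SketchConfig{}
  def from_keyword(opts) when is_list(opts), do: struct!(SketchConfig, opts)
end
```

Kernel.struct/2 would accept the same arguments but silently discards unknown keys, which makes struct!/2 the safer choice while migrating call sites.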
Validate extraction request using deprecated format.
This function is deprecated. Modern validation is built into the extraction functions.
Deprecated
Explicit validation functions are no longer necessary as validation is automatically performed by the extraction functions.
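Since validation now happens inside the extraction call, callers can drop the separate validation step and match on the returned tuple. The following is a minimal sketch under the assumption that an invalid request surfaces as {:error, reason} like any other extraction failure. ValidationSketch is a hypothetical module, and extract_fun stands in for a real call such as fn -> Kreuzberg.extract(input, mime_type, config) end.

```elixir
defmodule ValidationSketch do
  # Hypothetical wrapper, not part of Kreuzberg. It runs the supplied
  # zero-arity extraction function and handles both outcomes in one place,
  # with no validation call beforehand.
  @spec run((-> term())) :: {:ok, term()} | {:error, String.t()}
  def run(extract_fun) when is_function(extract_fun, 0) do
    case extract_fun.() do
      {:ok, result} ->
        {:ok, result}

      {:error, reason} ->
        # Assumed: validation problems arrive here together with other failures
        {:error, "extraction rejected: #{inspect(reason)}"}
    end
  end
end
```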