Kreuzberg.LegacyAPI (kreuzberg v4.4.5)

Legacy API functions using deprecated patterns.

This module contains deprecated functions that used the old configuration approach. These functions will be removed in v2.0.0.

Migrate to the Kreuzberg module, which uses the modern ExtractionConfig structure and nested configuration maps.

Migration Guide

The major change in v2.0.0 is moving from simple boolean flags and flat configuration to a structured ExtractionConfig with nested configuration maps.

Old Pattern (Deprecated)

{:ok, result} = Kreuzberg.LegacyAPI.extract_with_ocr(input, "application/pdf", true)
{:ok, result} = Kreuzberg.LegacyAPI.extract_with_chunking(input, "text/plain", 1024, 100)

New Pattern (Recommended)

config = %Kreuzberg.ExtractionConfig{
  ocr: %{"enabled" => true, "backend" => "tesseract"},
  chunking: %{"max_chars" => 1024, "max_overlap" => 100}
}
{:ok, result} = Kreuzberg.extract(input, "application/pdf", config)

See: https://docs.kreuzberg.io/v1-to-v2-migration

Summary

Functions

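extract_file_legacy(path, mime_type \\ nil, opts \\ [])

Extract file using deprecated simple parameter list.

extract_with_chunking(input, mime_type, chunk_size, overlap)

Extract content with deprecated chunking parameters.

extract_with_ocr(input, mime_type, enable_ocr)

Extract content with deprecated boolean OCR parameter.

extract_with_options(input, mime_type, opts \\ [])

Extract with keyword list configuration (deprecated format).

validate_extraction_request(input, mime_type, opts \\ [])

Validate extraction request using deprecated format.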

Functions

extract_file_legacy(path, mime_type \\ nil, opts \\ [])

This function is deprecated. Use Kreuzberg.extract_file/3 with an ExtractionConfig struct. Will be removed in v2.0.0.
@spec extract_file_legacy(String.t() | Path.t(), String.t() | nil, keyword()) ::
  {:ok, Kreuzberg.ExtractionResult.t()} | {:error, String.t()}

Extract file using deprecated simple parameter list.

This function is deprecated. Use Kreuzberg.extract_file/3 which accepts modern ExtractionConfig structures.

Parameters

  • path - File path (String or Path.t())
  • mime_type - MIME type (optional, string or nil)
  • opts - Keyword list of deprecated options:
    • :ocr - Boolean to enable OCR
    • :chunk_size - Maximum chunk size
    • :use_cache - Enable caching

Returns

  • {:ok, ExtractionResult.t()} - Successfully extracted content
  • {:error, reason} - Extraction failed

Deprecated

This function will be removed in v2.0.0. Use the modern API:

config = %Kreuzberg.ExtractionConfig{
  ocr: %{"enabled" => true},
  chunking: %{"max_chars" => 1024},
  use_cache: true
}
Kreuzberg.extract_file(path, mime_type, config)

Examples

# Deprecated way (old):
{:ok, result} = Kreuzberg.LegacyAPI.extract_file_legacy(
  "document.pdf",
  "application/pdf",
  ocr: true,
  chunk_size: 1024,
  use_cache: true
)

# Recommended way (new):
config = %Kreuzberg.ExtractionConfig{
  ocr: %{"enabled" => true},
  chunking: %{"max_chars" => 1024},
  use_cache: true
}
{:ok, result} = Kreuzberg.extract_file("document.pdf", "application/pdf", config)
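
When many call sites still pass the legacy keyword options, a small translation helper can ease the migration. The following is a minimal sketch, not part of Kreuzberg: the module name `LegacyOptsSketch` and the function `to_config_fields/1` are hypothetical, and the mapping covers only the three options listed above (`:ocr`, `:chunk_size`, `:use_cache`), using the nested keys shown in the recommended example.

```elixir
defmodule LegacyOptsSketch do
  @moduledoc false

  # Hypothetical helper: translates the legacy keyword options
  # (:ocr, :chunk_size, :use_cache) into the nested shape used by the
  # new API. Returns a plain map; options that were not given are omitted.
  def to_config_fields(opts) when is_list(opts) do
    Enum.reduce(opts, %{}, fn
      {:ocr, enabled}, acc when is_boolean(enabled) ->
        Map.put(acc, :ocr, %{"enabled" => enabled})

      {:chunk_size, size}, acc when is_integer(size) and size > 0 ->
        Map.put(acc, :chunking, %{"max_chars" => size})

      {:use_cache, flag}, acc when is_boolean(flag) ->
        Map.put(acc, :use_cache, flag)

      # Anything else is rejected so that typos surface during migration.
      {key, value}, _acc ->
        raise ArgumentError,
              "unsupported legacy option #{inspect(key)}: #{inspect(value)}"
    end)
  end
end
```

Assuming Kreuzberg.ExtractionConfig defines the `ocr`, `chunking`, and `use_cache` fields as the examples above suggest, the resulting map could then be passed to `struct(Kreuzberg.ExtractionConfig, fields)` before calling Kreuzberg.extract_file/3.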

extract_with_chunking(input, mime_type, chunk_size, overlap)

This function is deprecated. Use Kreuzberg.extract/3 with the ExtractionConfig.chunking map instead. Will be removed in v2.0.0.
@spec extract_with_chunking(binary(), String.t(), integer(), integer()) ::
  {:ok, Kreuzberg.ExtractionResult.t()} | {:error, String.t()}

Extract content with deprecated chunking parameters.

This function is deprecated. Use Kreuzberg.extract/3 with the new ExtractionConfig structure containing a chunking nested configuration.

Parameters

  • input - Binary document data
  • mime_type - MIME type of the document
  • chunk_size - Maximum chunk size (deprecated parameter style)
  • overlap - Overlap between chunks (deprecated parameter style)

Returns

  • {:ok, ExtractionResult.t()} - Successfully extracted and chunked content
  • {:error, reason} - Extraction failed

Deprecated

This function will be removed in v2.0.0. Use:

config = %Kreuzberg.ExtractionConfig{
  chunking: %{"max_chars" => 1024, "max_overlap" => 100}
}
Kreuzberg.extract(input, mime_type, config)

Examples

# Deprecated way (old):
{:ok, result} = Kreuzberg.LegacyAPI.extract_with_chunking(
  pdf_binary,
  "application/pdf",
  1024,
  100
)

# Recommended way (new):
config = %Kreuzberg.ExtractionConfig{
  chunking: %{"max_chars" => 1024, "max_overlap" => 100}
}
{:ok, result} = Kreuzberg.extract(pdf_binary, "application/pdf", config)
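
To make the relationship between the chunk size and the overlap concrete, here is an illustrative sketch of fixed-window chunking with overlap. It is not Kreuzberg's chunker, which may split on different boundaries; the module `ChunkingSketch` and its `chunk/3` function are hypothetical and only show how a maximum size and an overlap could interact.

```elixir
defmodule ChunkingSketch do
  @moduledoc false

  # Illustration only: splits text into windows of at most `max_chars`
  # graphemes, where each window repeats the last `max_overlap` graphemes
  # of the previous one. The overlap must be smaller than the window,
  # otherwise the window would never advance.
  def chunk(text, max_chars, max_overlap)
      when is_binary(text) and is_integer(max_chars) and is_integer(max_overlap) and
             max_chars > 0 and max_overlap >= 0 and max_overlap < max_chars do
    step = max_chars - max_overlap
    do_chunk(String.graphemes(text), max_chars, step, [])
  end

  defp do_chunk([], _max_chars, _step, acc), do: Enum.reverse(acc)

  defp do_chunk(graphemes, max_chars, step, acc) do
    chunk = graphemes |> Enum.take(max_chars) |> Enum.join()

    if length(graphemes) <= max_chars do
      # The remaining text fits in one window, so this is the last chunk.
      Enum.reverse([chunk | acc])
    else
      do_chunk(Enum.drop(graphemes, step), max_chars, step, [chunk | acc])
    end
  end
end
```

With a window of 4 and an overlap of 1, `ChunkingSketch.chunk("abcdefghij", 4, 1)` returns `["abcd", "defg", "ghij"]`: each chunk starts with the final character of the one before it.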

extract_with_ocr(input, mime_type, enable_ocr)

This function is deprecated. Use Kreuzberg.extract/3 with the ExtractionConfig.ocr map instead. Will be removed in v2.0.0.
@spec extract_with_ocr(binary(), String.t(), boolean()) ::
  {:ok, Kreuzberg.ExtractionResult.t()} | {:error, String.t()}

Extract content with deprecated boolean OCR parameter.

This function is deprecated. Use Kreuzberg.extract/3 with the new ExtractionConfig structure containing an ocr nested configuration map.

Parameters

  • input - Binary document data
  • mime_type - MIME type of the document
  • enable_ocr - Boolean flag to enable OCR (deprecated parameter style)

Returns

  • {:ok, ExtractionResult.t()} - Successfully extracted content
  • {:error, reason} - Extraction failed

Deprecated

This function will be removed in v2.0.0. Use:

config = %Kreuzberg.ExtractionConfig{ocr: %{"enabled" => true}}
Kreuzberg.extract(input, mime_type, config)

Examples

# Deprecated way (old):
{:ok, result} = Kreuzberg.LegacyAPI.extract_with_ocr(pdf_binary, "application/pdf", true)

# Recommended way (new):
config = %Kreuzberg.ExtractionConfig{ocr: %{"enabled" => true}}
{:ok, result} = Kreuzberg.extract(pdf_binary, "application/pdf", config)

extract_with_options(input, mime_type, opts \\ [])

This function is deprecated. Use Kreuzberg.extract/3 with an ExtractionConfig struct. Will be removed in v2.0.0.
@spec extract_with_options(binary(), String.t(), keyword()) ::
  {:ok, Kreuzberg.ExtractionResult.t()} | {:error, String.t()}

Extract with keyword list configuration (deprecated format).

This function is deprecated. Use Kreuzberg.extract/3 with the modern ExtractionConfig structure.

Deprecated

The keyword list configuration format is deprecated in favor of the structured ExtractionConfig with nested configuration maps.

Examples

# Deprecated way (old):
{:ok, result} = Kreuzberg.LegacyAPI.extract_with_options(
  input,
  "application/pdf",
  use_cache: true,
  force_ocr: false,
  output_format: "markdown"
)

# Recommended way (new):
config = %Kreuzberg.ExtractionConfig{
  use_cache: true,
  force_ocr: false,
  output_format: "markdown"
}
{:ok, result} = Kreuzberg.extract(input, "application/pdf", config)
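
Because the keyword keys in this example match the struct field names one-to-one, an existing keyword list can be converted with Elixir's built-in `struct!/2`, which raises on keys the struct does not define. The sketch below uses a stand-in struct, `SketchConfig`, so that it runs without Kreuzberg installed; both `SketchConfig` and `OptionsMigrationSketch` are hypothetical, and the stand-in declares only the three fields used above.

```elixir
defmodule SketchConfig do
  @moduledoc false

  # Stand-in for Kreuzberg.ExtractionConfig, limited to the three fields
  # used in the example above. The real struct is not reproduced here.
  defstruct [:use_cache, :force_ocr, :output_format]
end

defmodule OptionsMigrationSketch do
  @moduledoc false

  # Builds the struct from a legacy keyword list. struct!/2 raises
  # KeyError when the list contains a key the struct does not define,
  # which catches options that need a manual, nested translation.
  def from_keyword(opts) when is_list(opts) do
    struct!(SketchConfig, opts)
  end
end
```

Calling `OptionsMigrationSketch.from_keyword(use_cache: true, force_ocr: false, output_format: "markdown")` yields a struct with those three fields set, while an option such as `chunk_size: 1024` raises `KeyError` because it has no flat field in the stand-in.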

validate_extraction_request(input, mime_type, opts \\ [])

This function is deprecated. Validation is now automatic in extraction functions. Will be removed in v2.0.0.
@spec validate_extraction_request(binary(), String.t(), keyword()) ::
  :ok | {:error, String.t()}

Validate extraction request using deprecated format.

This function is deprecated. Modern validation is built into the extraction functions.

Deprecated

Explicit validation functions are no longer necessary as validation is automatically performed by the extraction functions.
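
In practice this means a call site that used to validate first and extract second can call the extraction function directly and match on its result. The sketch below assumes the extraction function returns `{:ok, result}` or `{:error, reason}`; the error shape of the new API is not documented on this page, so that part is an assumption. `ExtractionCallSite` and `run/3` are hypothetical names, and the extraction call is injected as a function so the example runs without Kreuzberg.

```elixir
defmodule ExtractionCallSite do
  @moduledoc false

  # `extract_fun` stands in for a call such as
  #   fn input, mime -> Kreuzberg.extract(input, mime, config) end
  # No separate validation step runs first: an invalid request is
  # expected to come back as an error tuple from the extraction itself.
  def run(extract_fun, input, mime_type) when is_function(extract_fun, 2) do
    case extract_fun.(input, mime_type) do
      {:ok, result} ->
        {:ok, result}

      {:error, reason} ->
        # Tag the failure so callers can tell it apart from other errors.
        {:error, {:extraction_failed, reason}}
    end
  end
end
```

Injecting the function also makes the call site easy to test with stubs that return either tuple.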