Kreuzberg.ExtractionResult (kreuzberg v4.4.2)

Structure representing the result of a document extraction operation.

Matches the Rust ExtractionResult struct.

Fields

  • :content - The main extracted text content
  • :mime_type - The MIME type of the processed document
  • :metadata - Metadata struct with document information
  • :tables - List of extracted tables
  • :detected_languages - Optional list of detected language codes
  • :chunks - Optional list of text chunks with embeddings
  • :images - Optional list of extracted images
  • :pages - Optional list of per-page content
  • :elements - Optional list of semantic elements
  • :ocr_elements - Optional list of OCR elements with positioning and confidence
  • :djot_content - Optional rich Djot content structure
  • :document - Optional hierarchical document structure
  • :extracted_keywords - Optional list of extracted keywords with scores
  • :quality_score - Optional quality score for the extraction (0.0 to 1.0)
  • :processing_warnings - Optional list of warnings generated during processing
  • :annotations - Optional list of PDF annotations (text, highlight, link, etc.)
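
Because many of these fields are optional, code that consumes a result usually has to tolerate nil. The following is a minimal sketch of that pattern, not part of the library: the ResultSummary module and its describe/1 function are hypothetical names introduced for illustration, and the snippet assumes it is compiled inside a project that depends on :kreuzberg.

```elixir
defmodule ResultSummary do
  @moduledoc false
  # Hypothetical helper, not part of Kreuzberg.
  alias Kreuzberg.ExtractionResult

  # Builds a one-line summary of an extraction result.
  # Optional list fields may be nil, so each one is defaulted to []
  # before it is counted.
  def describe(%ExtractionResult{} = result) do
    tables = length(result.tables || [])
    pages = length(result.pages || [])
    warnings = length(result.processing_warnings || [])

    "#{result.mime_type}: #{String.length(result.content)} chars, " <>
      "#{tables} tables, #{pages} pages, #{warnings} warnings"
  end
end
```

Matching on %ExtractionResult{} in the function head rejects plain maps early, which keeps the nil handling in one place instead of scattering it across callers.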

Types

t()

@type t() :: %Kreuzberg.ExtractionResult{
  annotations: [Kreuzberg.PdfAnnotation.t()] | nil,
  chunks: [Kreuzberg.Chunk.t()] | nil,
  content: String.t(),
  detected_languages: [String.t()] | nil,
  djot_content: Kreuzberg.DjotContent.t() | nil,
  document: Kreuzberg.DocumentStructure.t() | nil,
  elements: [Kreuzberg.Element.t()] | nil,
  extracted_keywords: [Kreuzberg.Keyword.t()] | nil,
  images: [Kreuzberg.Image.t()] | nil,
  metadata: Kreuzberg.Metadata.t(),
  mime_type: String.t(),
  ocr_elements: [Kreuzberg.OcrElement.t()] | nil,
  pages: [Kreuzberg.Page.t()] | nil,
  processing_warnings: [Kreuzberg.ProcessingWarning.t()],
  quality_score: float() | nil,
  tables: [Kreuzberg.Table.t()]
}

Functions

new(content, mime_type, metadata \\ %Kreuzberg.Metadata{}, tables \\ [], opts \\ [])

@spec new(
  String.t(),
  String.t(),
  Kreuzberg.Metadata.t() | map(),
  [Kreuzberg.Table.t() | map()],
  keyword()
) :: t()

Creates a new ExtractionResult from extracted data.

Parameters

  • content - The extracted text content
  • mime_type - The MIME type of the document
  • metadata - Document metadata struct or map (defaults to %Kreuzberg.Metadata{})
  • tables - List of extracted table structs or maps (defaults to [])
  • opts - Optional keyword list supplying additional fields (defaults to [])
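
As an illustration of the signature above, the sketch below builds results with and without the optional arguments. It assumes a project that depends on :kreuzberg. The :quality_score key passed in opts is an assumption: this page does not list which keys opts accepts, so the example only guesses that they mirror the struct field names.

```elixir
# Run inside a project that depends on :kreuzberg (for example with `iex -S mix`).
alias Kreuzberg.ExtractionResult

# Only content and mime_type are required; metadata, tables and opts
# fall back to the defaults shown in the function head.
minimal = ExtractionResult.new("Hello, world", "text/plain")

IO.inspect(minimal.content, label: "content")
IO.inspect(minimal.mime_type, label: "mime_type")

# All five arguments. The spec allows a plain map for metadata and a
# list of structs or maps for tables.
# ASSUMPTION: opts keys mirror struct field names such as :quality_score.
full = ExtractionResult.new("Hello, world", "text/plain", %{}, [], quality_score: 0.9)

IO.inspect(full.quality_score, label: "quality_score")
```

Inspecting the returned struct, rather than assuming how opts are applied, is the safer way to confirm which keys the function actually honours.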

to_map(result)

@spec to_map(t()) :: map()

Converts an ExtractionResult struct to a map.
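
A short usage sketch, assuming a project that depends on :kreuzberg. This page does not say whether the resulting map uses atom or string keys, so the example inspects the keys instead of relying on either form.

```elixir
# Run inside a project that depends on :kreuzberg (for example with `iex -S mix`).
alias Kreuzberg.ExtractionResult

result = ExtractionResult.new("Quarterly report", "application/pdf")
map = ExtractionResult.to_map(result)

# The key format is not documented here, so look before depending on it.
map
|> Map.keys()
|> IO.inspect(label: "keys")
```

A plain map is convenient when handing the result to code that does not know the struct, such as a serializer or a generic logging helper.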