Structure representing the result of a document extraction operation.
Matches the Rust ExtractionResult struct.
Fields
:content - The main extracted text content
:mime_type - The MIME type of the processed document
:metadata - Metadata struct with document information
:tables - List of extracted tables
:detected_languages - List of detected language codes
:chunks - Optional list of text chunks with embeddings
:images - Optional list of extracted images
:pages - Optional list of per-page content
:elements - Optional list of semantic elements
:ocr_elements - Optional list of OCR elements with positioning and confidence
:djot_content - Optional rich Djot content structure
:document - Optional hierarchical document structure
:extracted_keywords - Optional list of extracted keywords with scores
:quality_score - Optional quality score for the extraction (0.0 to 1.0)
:processing_warnings - Optional list of warnings generated during processing
:annotations - Optional list of PDF annotations (text, highlight, link, etc.)
Types
@type t() :: %Kreuzberg.ExtractionResult{
  annotations: [Kreuzberg.PdfAnnotation.t()] | nil,
  chunks: [Kreuzberg.Chunk.t()] | nil,
  content: String.t(),
  detected_languages: [String.t()] | nil,
  djot_content: Kreuzberg.DjotContent.t() | nil,
  document: Kreuzberg.DocumentStructure.t() | nil,
  elements: [Kreuzberg.Element.t()] | nil,
  extracted_keywords: [Kreuzberg.Keyword.t()] | nil,
  images: [Kreuzberg.Image.t()] | nil,
  metadata: Kreuzberg.Metadata.t(),
  mime_type: String.t(),
  ocr_elements: [Kreuzberg.OcrElement.t()] | nil,
  pages: [Kreuzberg.Page.t()] | nil,
  processing_warnings: [Kreuzberg.ProcessingWarning.t()],
  quality_score: float() | nil,
  tables: [Kreuzberg.Table.t()]
}
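Because many fields in the type are `... | nil`, code that reads a result usually has to handle missing values. The sketch below shows one way to do that. `ResultSummary` and `summarize/1` are hypothetical names and not part of Kreuzberg. The helper matches on map keys only, so it accepts the struct as well as the plain stand-in map used here.

```elixir
defmodule ResultSummary do
  # Hypothetical helper, not part of Kreuzberg. It matches on keys rather
  # than on the struct name, so structs and plain maps are both accepted.
  def summarize(%{content: content, mime_type: mime_type} = result) do
    %{
      mime_type: mime_type,
      characters: String.length(content),
      # Optional list fields may be nil; treat nil as an empty list.
      tables: length(Map.get(result, :tables) || []),
      chunks: length(Map.get(result, :chunks) || []),
      warnings: length(Map.get(result, :processing_warnings) || []),
      # quality_score is a float or nil, passed through unchanged.
      quality: Map.get(result, :quality_score)
    }
  end
end

# Stand-in data shaped like the documented fields (a plain map, not the
# real struct, so the sketch runs without the library).
sample = %{
  content: "Hello, world",
  mime_type: "application/pdf",
  tables: [],
  chunks: nil,
  processing_warnings: [],
  quality_score: nil
}

IO.inspect(ResultSummary.summarize(sample))
```

Matching on keys instead of `%Kreuzberg.ExtractionResult{}` keeps the helper usable in tests that build fixtures by hand. The cost is that a wrong struct type is no longer rejected at the function head.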
Functions
@spec new(
  String.t(),
  String.t(),
  Kreuzberg.Metadata.t() | map(),
  [Kreuzberg.Table.t() | map()],
  keyword()
) :: t()
Creates a new ExtractionResult from extracted data.
Parameters
content - The extracted text content
mime_type - The MIME type of the document
metadata - Document metadata struct or map
tables - List of extracted table structs or maps
opts - Optional keyword list with additional fields
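A call matching the spec above might look like the following sketch. It assumes the `:kreuzberg` dependency is compiled into the project, and it assumes that the keys accepted in `opts` mirror the struct's field names (for example `:quality_score`). The text above does not spell that out. The guard makes the snippet a no-op when the module is not loaded.

```elixir
# Assumes the :kreuzberg dependency is available; the guard skips the call
# when the module cannot be loaded.
if Code.ensure_loaded?(Kreuzberg.ExtractionResult) do
  result =
    Kreuzberg.ExtractionResult.new(
      # content
      "Quarterly report text",
      # mime_type
      "application/pdf",
      # metadata, given as a plain map (the spec allows a struct or a map)
      %{},
      # tables: none extracted
      [],
      # Assumption: option keys mirror the struct's field names.
      quality_score: 0.9
    )

  IO.puts(result.mime_type)
end
```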
Converts an ExtractionResult struct to a map.