Structure representing document metadata extracted from files.
Matches the Rust Metadata struct. Note that format and additional
use #[serde(flatten)] in Rust, so their fields appear at the root level
of the serialized JSON.
Fields
:title - Document title
:subject - Document subject or description
:authors - List of author names
:keywords - List of keywords
:language - Primary language (ISO 639-1 code)
:created_at - Creation date (ISO 8601)
:modified_at - Last modification date (ISO 8601)
:created_by - Application that created the document
:modified_by - Application that last modified the document
:pages - Page structure information
:format - Format-specific metadata (flattened from Rust)
:image_preprocessing - Image preprocessing metadata
:json_schema - JSON schema if applicable
:error - Error metadata if extraction partially failed
:category - Document category classification
:tags - List of document tags
:document_version - Version of the document
:abstract_text - Abstract or summary of the document
:output_format - Output format used for extraction
:additional - Additional metadata fields (flattened from Rust)
Types
@type t() :: %Kreuzberg.Metadata{
        abstract_text: String.t() | nil,
        additional: map(),
        authors: [String.t()] | nil,
        category: String.t() | nil,
        created_at: String.t() | nil,
        created_by: String.t() | nil,
        document_version: String.t() | nil,
        error: Kreuzberg.ErrorMetadata.t() | nil,
        extraction_duration_ms: non_neg_integer() | nil,
        format: map() | nil,
        image_preprocessing: Kreuzberg.ImagePreprocessingMetadata.t() | nil,
        json_schema: map() | nil,
        keywords: [String.t()] | nil,
        language: String.t() | nil,
        modified_at: String.t() | nil,
        modified_by: String.t() | nil,
        output_format: String.t() | nil,
        pages: Kreuzberg.PageStructure.t() | nil,
        subject: String.t() | nil,
        tags: [String.t()] | nil,
        title: String.t() | nil
      }
Functions
from_map/1
Creates a Metadata struct from a map.
Handles Rust's #[serde(flatten)] by classifying keys into format fields,
known metadata fields, and additional (catch-all) fields.
Examples
iex> Kreuzberg.Metadata.from_map(%{"title" => "Report", "authors" => ["Alice"]})
%Kreuzberg.Metadata{title: "Report", authors: ["Alice"]}
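The key classification described above can be sketched roughly as follows. This is a hypothetical stand-in, not the library's implementation: the module name MetadataSketch, the trimmed field set, and both key lists are assumptions made for illustration (only "format_type" appears elsewhere in this page; "page_count" is invented).

```elixir
defmodule MetadataSketch do
  # Hypothetical stand-in for Kreuzberg.Metadata, trimmed to a few fields.
  defstruct title: nil, authors: nil, format: nil, additional: %{}

  # Assumed key sets, for illustration only.
  @known_keys ~w(title authors)
  @format_keys ~w(format_type page_count)

  def from_map(map) when is_map(map) do
    # First peel off the known metadata keys, then the format keys;
    # whatever is left over becomes the catch-all `additional` map.
    {known, rest} = Map.split(map, @known_keys)
    {format, additional} = Map.split(rest, @format_keys)

    %__MODULE__{
      title: known["title"],
      authors: known["authors"],
      format: if(map_size(format) == 0, do: nil, else: format),
      additional: additional
    }
  end
end

meta =
  MetadataSketch.from_map(%{
    "title" => "Report",
    "format_type" => "pdf",
    "custom" => 1
  })

# "format_type" lands in meta.format and "custom" in meta.additional.
IO.inspect(meta)
```

Splitting in two passes keeps the three categories disjoint, so a key can never end up in both format and additional.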
to_map/1
Converts a Metadata struct to a map.
Re-flattens format and additional back into the root map to match
the Rust serialization format.
Examples
iex> meta = %Kreuzberg.Metadata{title: "Report", format: %{"format_type" => "pdf"}}
iex> map = Kreuzberg.Metadata.to_map(meta)
iex> map["title"]
"Report"
iex> map["format_type"]
"pdf"
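The re-flattening step might look something like the sketch below. It is a hypothetical stand-in, not the library's code: the module name FlattenSketch and the trimmed field set are assumptions, as are the choices to drop nil fields and to let additional win over format on key collisions.

```elixir
defmodule FlattenSketch do
  # Hypothetical stand-in for Kreuzberg.Metadata, trimmed to a few fields.
  defstruct title: nil, subject: nil, format: nil, additional: %{}

  def to_map(%__MODULE__{} = meta) do
    # Root-level fields: everything except the two flattened containers,
    # with nil values dropped (an assumption) and atom keys stringified.
    base =
      meta
      |> Map.from_struct()
      |> Map.drop([:format, :additional])
      |> Enum.reject(fn {_key, value} -> is_nil(value) end)
      |> Map.new(fn {key, value} -> {Atom.to_string(key), value} end)

    # Merge the nested maps back into the root, mirroring serde's flatten.
    base
    |> Map.merge(meta.format || %{})
    |> Map.merge(meta.additional || %{})
  end
end

meta =
  struct(FlattenSketch,
    title: "Report",
    format: %{"format_type" => "pdf"},
    additional: %{"custom" => 1}
  )

# The nested maps no longer appear as "format"/"additional" keys;
# their entries sit next to "title" at the root.
IO.inspect(FlattenSketch.to_map(meta))
```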