FormatParser.Data
(format_parser v2.14.0)
A Data struct and functions for parsing data file formats.
The Data struct contains the fields format, nature, and intrinsics.
Supported Formats
| Format | Extension | Description |
|---|---|---|
| :pqt | .parquet | Apache Parquet columnar format |
| :sqlite3 | .db, .sqlite | SQLite 3 database |
| :duckdb | .duckdb | DuckDB database |
| :arrow | .arrow | Apache Arrow IPC file format |
| :feather | .feather | Feather V1 format |
Examples
iex> {:ok, file} = File.read("data.parquet")
iex> result = FormatParser.Data.parse(file)
%FormatParser.Data{format: :pqt, nature: :data, intrinsics: %{}}
Functions
Parses binary data to detect data file formats.
This function attempts to identify data formats by examining magic bytes at the beginning of the binary content.
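As an illustration of magic-byte detection, the sketch below pattern-matches the published file signatures for four of the formats listed above. The `MagicSketch` module and its `detect/1` function are hypothetical names introduced for this example; they are not part of format_parser, and the library's actual checks may differ. DuckDB is left out because its signature is not assumed here.

```elixir
defmodule MagicSketch do
  # Hypothetical module for illustration only, not the library's implementation.
  # Each clause matches a published signature at the start of the binary.

  # Parquet files begin with the ASCII bytes "PAR1".
  def detect(<<"PAR1", _rest::binary>>), do: :pqt

  # SQLite 3 files begin with "SQLite format 3" followed by a NUL byte.
  def detect(<<"SQLite format 3", 0, _rest::binary>>), do: :sqlite3

  # Arrow IPC files begin with "ARROW1".
  def detect(<<"ARROW1", _rest::binary>>), do: :arrow

  # Feather V1 files begin with "FEA1".
  def detect(<<"FEA1", _rest::binary>>), do: :feather

  # Anything else, including non-binary input, is reported as unknown.
  def detect(_other), do: :unknown
end

# The bytes 80, 65, 82, 49 spell "PAR1", so this call returns :pqt.
MagicSketch.detect(<<80, 65, 82, 49, 0>>)
```

Matching on a binary prefix in the function head keeps each signature on one line and never reads past the first few bytes of the input.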
Arguments
input - Can be one of:
  - {:error, binary} - A tuple containing binary file content (used in parser chain)
  - binary - Raw binary file content
  - any - Any other value is returned as-is (pass-through for parser chain)
Returns
- %FormatParser.Data{} - When a supported data format is detected
- {:error, binary} - When the format is not recognized (for parser chain)
- The input unchanged - When input is neither a binary nor an error tuple
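The three return cases are what allow parsers to be piped one after another: a miss hands the original binary on as {:error, binary}, and a hit passes through every later parser untouched. The sketch below shows that contract with two stand-in parsers. `ChainSketch`, its nested `Image` and `Data` modules, and `detect/1` are hypothetical names written for this example; they only mimic the behaviour described above and do not call format_parser itself.

```elixir
defmodule ChainSketch do
  # Hypothetical stand-ins that follow the contract described above.

  defmodule Image do
    defstruct format: nil

    # An error tuple from an earlier parser is unwrapped and retried.
    def parse({:error, bin}) when is_binary(bin), do: parse(bin)
    # PNG files begin with the byte 0x89 followed by "PNG".
    def parse(<<0x89, "PNG", _rest::binary>>), do: %__MODULE__{format: :png}
    # Unrecognized binaries are handed on to the next parser.
    def parse(bin) when is_binary(bin), do: {:error, bin}
    # Anything else (for example an earlier match) passes through unchanged.
    def parse(other), do: other
  end

  defmodule Data do
    defstruct format: nil

    def parse({:error, bin}) when is_binary(bin), do: parse(bin)
    def parse(<<"PAR1", _rest::binary>>), do: %__MODULE__{format: :pqt}
    def parse(bin) when is_binary(bin), do: {:error, bin}
    def parse(other), do: other
  end

  # Runs the parsers in order; the first match wins.
  def detect(bin) when is_binary(bin) do
    bin
    |> Image.parse()
    |> Data.parse()
  end
end

# Image.parse/1 misses and returns {:error, binary}; Data.parse/1 then matches.
ChainSketch.detect(<<"PAR1", 0>>)
```

Because a matched struct is neither a binary nor an error tuple, it falls into the pass-through clause of every later parser, so the order of the pipeline only matters when two parsers could claim the same bytes.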
Examples
iex> {:ok, file} = File.read("priv/test.parquet")
iex> FormatParser.Data.parse(file)
%FormatParser.Data{format: :pqt, nature: :data, intrinsics: %{}}
iex> FormatParser.Data.parse({:error, <<80, 65, 82, 49, 0>>})
%FormatParser.Data{format: :pqt, nature: :data, intrinsics: %{}}
iex> FormatParser.Data.parse(%FormatParser.Image{})
%FormatParser.Image{}