A spreadsheet-compatible parser using UTF-16 Little Endian encoding.
This module uses tab (\t) as the field separator and double-quote (")
as the escape character. It handles UTF-16 LE encoding with BOM, which is
the format commonly used by spreadsheet applications like Microsoft Excel.
This is a drop-in replacement for NimbleCSV.Spreadsheet.
Quick Start
alias RustyCSV.Spreadsheet
# Parse UTF-16 LE data (with BOM)
Spreadsheet.parse_string(utf16_data, skip_headers: false)
#=> [["name", "age"], ["john", "27"]]
# Dump to UTF-16 LE format (includes BOM)
Spreadsheet.dump_to_iodata([["name", "age"], ["john", "27"]])
|> IO.iodata_to_binary()
Configuration
This module was defined with:
RustyCSV.define(RustyCSV.Spreadsheet,
separator: "\t",
escape: "\"",
encoding: {:utf16, :little},
trim_bom: true,
dump_bom: true
)
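To make the encoding and BOM options concrete, here is a minimal sketch of what UTF-16 LE input with a BOM looks like at the byte level. It uses only Erlang's standard `:unicode` module; the module name `Utf16Sketch` and its functions are hypothetical helpers for illustration and are not part of RustyCSV.

```elixir
defmodule Utf16Sketch do
  # Hypothetical helper, not part of RustyCSV: builds and reads
  # UTF-16 LE binaries using only the standard :unicode module.
  @encoding {:utf16, :little}

  # Prefix the UTF-16 LE bytes with the BOM (0xFF 0xFE).
  def encode(text) do
    :unicode.encoding_to_bom(@encoding) <>
      :unicode.characters_to_binary(text, :utf8, @encoding)
  end

  # Drop a leading UTF-16 LE BOM if present, then convert to UTF-8.
  def decode(<<0xFF, 0xFE, body::binary>>), do: to_utf8(body)
  def decode(body), do: to_utf8(body)

  defp to_utf8(body), do: :unicode.characters_to_binary(body, @encoding, :utf8)
end

utf16_data = Utf16Sketch.encode("name\tage\njohn\t27\n")
```

A binary built this way can serve as the `utf16_data` value in the Quick Start when experimenting without a real spreadsheet export.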
Functions
@spec dump_to_iodata(Enumerable.t(), RustyCSV.dump_options()) :: iodata()
Converts an enumerable of rows to iodata in CSV format.
Returns a single flat binary (valid iodata/0). Unlike NimbleCSV,
which returns an iodata list, RustyCSV writes all CSV bytes into one
contiguous binary in the NIF for better performance and lower memory use.
Options
:strategy - Encoding strategy. By default, uses a single-threaded SIMD-accelerated encoder. Pass strategy: :parallel for multi-threaded encoding via rayon, which is faster for quoting-heavy data.
Examples
# Default encoder (best for most data)
RustyCSV.Spreadsheet.dump_to_iodata(rows)
# Parallel encoder (best for quoting-heavy data)
RustyCSV.Spreadsheet.dump_to_iodata(rows, strategy: :parallel)
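Because the return value is a flat binary and not a nested list, code migrated from NimbleCSV may want to check how it treats the result. The sketch below uses a hand-built binary as a stand-in for the dump output (the exact bytes RustyCSV emits, such as quoting and line endings, are an assumption here) and shows that the standard `IO` functions accept it unchanged.

```elixir
# Hand-built stand-in for a dump_to_iodata/2 result: BOM followed by
# UTF-16 LE bytes. The real output may differ in quoting or line endings.
dumped =
  <<0xFF, 0xFE>> <>
    :unicode.characters_to_binary("name\tage\n", :utf8, {:utf16, :little})

# A binary is already valid iodata, so these calls accept it as-is.
same = IO.iodata_to_binary(dumped)
size = IO.iodata_length(dumped)

# It can also be nested inside a larger iodata list.
nested = IO.iodata_to_binary([dumped, []])

# Code that assumed a list should branch on the actual shape.
kind = if is_binary(dumped), do: :flat_binary, else: :iolist
```

Functions such as `File.write!/2` and `IO.binwrite/2` take iodata, so they should work with either shape.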
@spec dump_to_stream(Enumerable.t()) :: Enumerable.t()
Lazily converts an enumerable of rows to a stream of iodata.
@spec options() :: keyword()
Returns the options used to define this CSV module.
@spec parse_enumerable(Enumerable.t(), RustyCSV.parse_options()) :: RustyCSV.rows()
Eagerly parses an enumerable of CSV data into a list of rows.
@spec parse_stream(Enumerable.t(), RustyCSV.parse_options()) :: Enumerable.t()
Lazily parses a stream of CSV data into a stream of rows.
Options
:skip_headers - When true, skips the first row. Defaults to true.
:headers - Controls header handling. Defaults to false.
    false - Return rows as lists (default behavior).
    true - Use first row as string keys, return maps. :skip_headers is ignored.
    [atom | string, ...] - Use explicit keys, return maps. First row skipped by default; pass skip_headers: false if no header row.
:chunk_size - Bytes per IO read. Defaults to 65536.
:batch_size - Rows per batch. Defaults to 1000.
:max_buffer_size - Maximum streaming buffer size in bytes. Defaults to 268_435_456 (256 MB). Raises if exceeded during parsing.
@spec parse_string(binary(), RustyCSV.parse_options()) :: RustyCSV.rows() | [map()]
Parses a CSV string into a list of rows.
Options
:skip_headers - When true, skips the first row. Defaults to true.
:strategy - The parsing strategy. Defaults to :simd.
:headers - Controls header handling. Defaults to false.
    false - Return rows as lists (default behavior).
    true - Use first row as string keys, return maps. :skip_headers is ignored.
    [atom | string, ...] - Use explicit keys, return maps. First row skipped by default; pass skip_headers: false if no header row.
Input is expected in {:utf16, :little} encoding and will be converted to UTF-8 for parsing.
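The three :headers modes described in the options are easiest to compare by the shape of the result. The following is a plain-Elixir sketch of that shape only, applied to rows that are already parsed; `HeadersSketch` and its functions are hypothetical names, and how RustyCSV itself handles short rows or duplicate keys is not covered here (this sketch simply truncates via `Enum.zip/2`).

```elixir
defmodule HeadersSketch do
  # Hypothetical illustration of result shapes; not RustyCSV's implementation.

  # headers: true -> the first row supplies string keys for the maps.
  def with_header_row([]), do: []
  def with_header_row([header | rows]), do: Enum.map(rows, &zip_row(header, &1))

  # headers: [keys] -> explicit keys; the first row is dropped unless
  # skip_headers is false (for data that has no header row).
  def with_keys(rows, keys, skip_headers \\ true) do
    data_rows = if skip_headers, do: Enum.drop(rows, 1), else: rows
    Enum.map(data_rows, &zip_row(keys, &1))
  end

  defp zip_row(keys, row), do: keys |> Enum.zip(row) |> Map.new()
end

rows = [["name", "age"], ["john", "27"]]

by_header = HeadersSketch.with_header_row(rows)
by_atoms = HeadersSketch.with_keys(rows, [:name, :age])
no_header = HeadersSketch.with_keys([["john", "27"]], [:name, :age], false)
```

With headers: false (the default), the rows stay as plain lists, as in the Quick Start output.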
@spec to_line_stream(Enumerable.t()) :: Enumerable.t()
Converts a stream of arbitrary binary chunks into a line-oriented stream.
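The general idea of turning arbitrary chunks into lines can be sketched with `Stream.transform/3`. This is an illustration of the technique under simplifying assumptions, not RustyCSV's implementation: it splits on a single `"\n"` byte, so it assumes UTF-8 (or otherwise single-byte newline) input and ignores newlines inside quoted fields. `LineStreamSketch` is a hypothetical name.

```elixir
defmodule LineStreamSketch do
  # Hypothetical sketch: re-chunks binaries so each emitted element ends
  # at a "\n" byte. Does not handle UTF-16 or quoted newlines.
  def lines(chunks) do
    chunks
    |> Stream.concat([:eof])
    |> Stream.transform("", fn
      # End of input: flush whatever is left in the buffer.
      :eof, "" ->
        {[], ""}

      :eof, buffer ->
        {[buffer], ""}

      # Append the chunk, emit every complete line, keep the remainder.
      chunk, buffer ->
        parts = :binary.split(buffer <> chunk, "\n", [:global])
        {complete, [rest]} = Enum.split(parts, -1)
        {Enum.map(complete, &(&1 <> "\n")), rest}
    end)
  end
end

result = LineStreamSketch.lines(["a\nb", "c\n", "d"]) |> Enum.to_list()
```

Here a line that straddles two chunks ("b" and "c\n") is emitted as one element, and a trailing fragment without a newline is flushed at the end.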