RustyCSV.RFC4180 (RustyCSV v0.3.9)

A CSV parser/dumper following RFC 4180 conventions.

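This module uses a comma (,) as the field separator and a double quote (") as the escape character. When parsing, it recognizes both CRLF and LF line endings; when dumping, it writes LF, as set by line_separator in the Configuration section.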

This is a drop-in replacement for NimbleCSV.RFC4180.

Quick Start

alias RustyCSV.RFC4180, as: CSV

# Parse CSV (skips headers by default)
CSV.parse_string("name,age\njohn,27\n")
#=> [["john", "27"]]

# Include headers
CSV.parse_string("name,age\njohn,27\n", skip_headers: false)
#=> [["name", "age"], ["john", "27"]]

# Use parallel parsing for large files
CSV.parse_string(large_csv, strategy: :parallel)

# Stream large files with bounded memory
"huge.csv"
|> File.stream!()
|> CSV.parse_stream()
|> Enum.each(&process/1)

Dumping

CSV.dump_to_iodata([["name", "age"], ["john", "27"]])
|> IO.iodata_to_binary()
#=> "name,age\njohn,27\n"

Configuration

This module was defined with:

RustyCSV.define(RustyCSV.RFC4180,
  separator: ",",
  escape: "\"",
  line_separator: "\n",
  newlines: ["\r\n", "\n"],
  strategy: :simd
)

To customize these options, define your own parser with RustyCSV.define/2.
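
As a minimal sketch of a custom definition, the following defines a tab-separated parser using only the option names shown in the definition above. The module name MyApp.TSV is hypothetical, and it is an assumption here that a tab separator is handled the same way as a comma.

```elixir
# MyApp.TSV is a hypothetical module name chosen for illustration.
RustyCSV.define(MyApp.TSV,
  separator: "\t",
  escape: "\"",
  line_separator: "\n",
  newlines: ["\r\n", "\n"],
  strategy: :simd
)

# The generated module is assumed to expose the same functions as
# RustyCSV.RFC4180, so it is called the same way.
rows = MyApp.TSV.parse_string("name\tage\njohn\t27\n")
IO.inspect(rows)
```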

Functions

dump_to_iodata(enumerable, opts \\ [])

@spec dump_to_iodata(Enumerable.t(), RustyCSV.dump_options()) :: iodata()

Converts an enumerable of rows to iodata in CSV format.

Returns a single flat binary (valid iodata/0). Unlike NimbleCSV, which returns an iodata list, RustyCSV writes all CSV bytes into one contiguous binary in the NIF for better performance and lower memory use.

Options

  • :strategy - Encoding strategy. By default, uses a single-threaded SIMD-accelerated encoder. Pass strategy: :parallel for multi-threaded encoding via rayon, which is faster for quoting-heavy data.

Examples

# Default encoder (best for most data)
RustyCSV.RFC4180.dump_to_iodata(rows)

# Parallel encoder (best for quoting-heavy data)
RustyCSV.RFC4180.dump_to_iodata(rows, strategy: :parallel)
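
Because the return value is valid iodata, it can be handed directly to functions that accept iodata. The sketch below writes it to disk with File.write!/2; the file name people.csv is a placeholder.

```elixir
alias RustyCSV.RFC4180, as: CSV

rows = [["name", "age"], ["john", "27"]]

# File.write!/2 accepts iodata, so no IO.iodata_to_binary/1 call is needed.
# "people.csv" is a placeholder path.
File.write!("people.csv", CSV.dump_to_iodata(rows))
```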

dump_to_stream(enumerable)

@spec dump_to_stream(Enumerable.t()) :: Enumerable.t()

Lazily converts an enumerable of rows to a stream of iodata.
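
A minimal sketch of lazy dumping, assuming each element of the resulting stream is iodata for one or more rows. It pipes the stream into a file so that the full output is never held in memory at once; the file name out.csv and the generated rows are placeholders.

```elixir
alias RustyCSV.RFC4180, as: CSV

# A lazy source of rows; in practice this might come from a database cursor.
rows = Stream.map(1..3, fn i -> ["row#{i}", Integer.to_string(i)] end)

rows
|> CSV.dump_to_stream()
# File.stream!/1 is collectable, so each emitted chunk is written as it arrives.
|> Stream.into(File.stream!("out.csv"))
|> Stream.run()
```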

options()

@spec options() :: keyword()

Returns the options used to define this CSV module.
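
One possible use is inspecting a parser's configuration at runtime. This sketch assumes the returned keyword list carries the same keys that were passed to RustyCSV.define/2, such as :separator; the exact shape of the values is not confirmed here.

```elixir
alias RustyCSV.RFC4180, as: CSV

opts = CSV.options()

# Keyword.get/2 returns nil when a key is absent, so this does not raise
# even if the returned list is shaped differently than assumed.
separator = Keyword.get(opts, :separator)
IO.inspect(separator, label: "separator")
```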

parse_enumerable(enumerable, opts \\ [])

@spec parse_enumerable(Enumerable.t(), RustyCSV.parse_options()) :: RustyCSV.rows()

Eagerly parses an enumerable of CSV data into a list of rows.
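
A short sketch of eager parsing from an in-memory list of lines. It assumes that, as with parse_string/2, the header row is skipped by default.

```elixir
alias RustyCSV.RFC4180, as: CSV

# Any enumerable of binaries works as input; a plain list is used here.
lines = ["name,age\n", "john,27\n", "mary,31\n"]

rows = CSV.parse_enumerable(lines)
IO.inspect(rows)
```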

parse_stream(stream, opts \\ [])

@spec parse_stream(Enumerable.t(), RustyCSV.parse_options()) :: Enumerable.t()

Lazily parses a stream of CSV data into a stream of rows.

Options

  • :skip_headers - When true, skips the first row. Defaults to true.
  • :headers - Controls header handling. Defaults to false.
    • false - Return rows as lists (default behavior)
    • true - Use first row as string keys, return maps. :skip_headers is ignored.
    • [atom | string, ...] - Use explicit keys, return maps. The first row is skipped by default; pass skip_headers: false if the input has no header row.

  • :chunk_size - Bytes per IO read. Defaults to 65536.
  • :batch_size - Rows per batch. Defaults to 1000.
  • :max_buffer_size - Maximum streaming buffer size in bytes. Defaults to 268_435_456 (256 MB). Raises if exceeded during parsing.
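
The options above can be combined as in the following sketch, which parses a small in-memory enumerable into maps keyed by the header row. The sample data and the batch size of 500 are illustrative; a File.stream!/1 source can be used in place of the list.

```elixir
alias RustyCSV.RFC4180, as: CSV

# An in-memory stand-in for a file stream.
lines = ["name,age\n", "john,27\n", "mary,31\n"]

rows =
  lines
  # headers: true turns each row into a map with string keys.
  |> CSV.parse_stream(headers: true, batch_size: 500)
  # Nothing is parsed until the stream is consumed.
  |> Enum.to_list()

IO.inspect(rows)
```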

parse_string(string, opts \\ [])

@spec parse_string(binary(), RustyCSV.parse_options()) :: RustyCSV.rows() | [map()]

Parses a CSV string into a list of rows.

Options

  • :skip_headers - When true, skips the first row. Defaults to true.
  • :strategy - The parsing strategy. Defaults to :simd.
  • :headers - Controls header handling. Defaults to false.
    • false - Return rows as lists (default behavior)
    • true - Use first row as string keys, return maps. :skip_headers is ignored.
    • [atom | string, ...] - Use explicit keys, return maps. The first row is skipped by default; pass skip_headers: false if the input has no header row.
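
The two map-returning forms of :headers can be sketched as follows. The outputs shown are those implied by the option descriptions above, not captured from a run.

```elixir
alias RustyCSV.RFC4180, as: CSV

csv = "name,age\njohn,27\n"

# headers: true uses the first row as string keys.
by_string = CSV.parse_string(csv, headers: true)
#=> [%{"age" => "27", "name" => "john"}]

# An explicit list supplies the keys; the header row is skipped by default.
by_atom = CSV.parse_string(csv, headers: [:name, :age])
#=> [%{age: "27", name: "john"}]
```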

to_line_stream(stream)

@spec to_line_stream(Enumerable.t()) :: Enumerable.t()

Converts a stream of arbitrary binary chunks into a line-oriented stream.
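
A minimal sketch of where this helps: input arriving in chunks that do not align with line boundaries, as a socket or fixed-size read might produce. The chunk boundaries below are chosen arbitrarily for illustration.

```elixir
alias RustyCSV.RFC4180, as: CSV

# Chunks split in the middle of fields and lines.
chunks = ["name,a", "ge\njohn", ",27\n"]

rows =
  chunks
  # Regroup the bytes so each element is a complete line.
  |> CSV.to_line_stream()
  |> CSV.parse_stream()
  |> Enum.to_list()

IO.inspect(rows)
```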