CSV (CSV v3.0.2)
RFC 4180 compliant CSV parsing and encoding for Elixir. Other separators can be specified, so it could also be named TSV, but it isn't.
Summary
Functions
decode/2: Decode a stream of comma-separated lines into a stream of {:ok, row} and {:error, message} tuples. Decoding errors are inlined into the stream.
decode!/2: Decode a stream of comma-separated lines into a stream of rows. Decoding errors are raised immediately.
encode/2: Encode a table stream into a stream of RFC 4180 compliant CSV lines for writing to a file or other IO.
Types
Functions
@spec decode(Enumerable.t(), [decode_options()]) :: Enumerable.t()
Decode a stream of comma-separated lines into a stream of {:ok, row} and {:error, message} tuples. Decoding errors are inlined into the stream.
Options
These are the options:
:separator – The separator token to use, defaults to ?, (a comma). Must be a codepoint (syntax: ? + (your separator)).
:escape_character – The escape character token to use, defaults to ?" (a double quote). Must be a codepoint (syntax: ? + (your escape character)).
:field_transform – A function with arity 1 that will get called with each field and can apply transformations. Defaults to the identity function. This function will get called for every field and therefore should return quickly.
:headers – When set to true, will take the first row of the CSV and use it as header values. When set to a list, will use the given list as header values. When set to false (default), will use no header values. When set to anything but false, the resulting rows in the matrix will be maps instead of lists.
:validate_row_length – When set to true, will take the first row of the CSV or its headers and validate that following rows are of the same length. Defaults to false.
:unescape_formulas – When set to true, will remove formula escaping inserted to prevent CSV Injection.
Examples
Convert a file stream into a stream of rows, in the order of the given stream:
iex> "../test/fixtures/docs/valid.csv"
...> |> Path.expand(__DIR__)
...> |> File.stream!()
...> |> CSV.decode()
...> |> Enum.take(2)
[ok: ["a","b","c"], ok: ["d","e","f"]]
Read from a file with a Byte Order Mark (BOM):
iex> "../test/fixtures/utf8-with-bom.csv"
...> |> Path.expand(__DIR__)
...> |> File.stream!([:trim_bom])
...> |> CSV.decode()
...> |> Enum.take(2)
[ok: ["a", "b"], ok: ["d", "e"]]
Errors will show up as error tuples:
iex> "../test/fixtures/docs/escape-errors.csv"
...> |> Path.expand(__DIR__)
...> |> File.stream!()
...> |> CSV.decode()
...> |> Enum.take(2)
[
  ok: ["a","b","c"],
  error: "Escape sequence started on line 2:\n\n\"d,e,f\n\ndid not terminate before the stream halted. Parsing will continue on line 3.\n"
]
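Because errors are inlined rather than raised, callers usually need to separate good rows from failures once the stream has been consumed. The following is a minimal sketch of one way to do that, not the library's prescribed approach. It assumes the package is fetched with Mix.install/1 (Elixir 1.12 or later), and the in-memory input lines are hypothetical:

```elixir
# Assumption: the :csv package is fetched from Hex when the script runs.
Mix.install([{:csv, "~> 3.0"}])

# Hypothetical input: the second line opens an escape sequence that is
# never closed, so it cannot be decoded into a row.
lines = ["a,b,c\n", "\"d,e,f\n"]

# Split the decoded stream into {:ok, row} tuples and everything else.
{oks, errors} =
  lines
  |> Stream.map(& &1)
  |> CSV.decode()
  |> Enum.split_with(&match?({:ok, _}, &1))

# Unwrap the tuples so callers get plain rows and plain error messages.
rows = Enum.map(oks, fn {:ok, row} -> row end)
messages = Enum.map(errors, fn {:error, message} -> message end)

IO.inspect(rows, label: "rows")
IO.inspect(messages, label: "errors")
```

Enum.split_with/2 consumes the whole stream, so for very large inputs it may be preferable to handle each tuple as it arrives, for example with Stream.each/2.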
Map an existing stream of lines separated by a token to a stream of rows with a header row:
iex> ["a;b\n","c;d\n", "e;f\n"]
...> |> Stream.map(&(&1))
...> |> CSV.decode(separator: ?;, headers: true)
...> |> Enum.take(2)
[
  ok: %{"a" => "c", "b" => "d"},
  ok: %{"a" => "e", "b" => "f"}
]
Map a stream with custom escape characters:
iex> ["@a@,@b@\n","@c@,@d@\n"]
...> |> Stream.map(&(&1))
...> |> CSV.decode(escape_character: ?@)
...> |> Enum.take(2)
[ok: ["a", "b"], ok: ["c", "d"]]
Map a stream with custom separator characters:
iex> ["a;b\n","c;d\n"]
...> |> Stream.map(&(&1))
...> |> CSV.decode(separator: ?;)
...> |> Enum.take(2)
[ok: ["a", "b"], ok: ["c", "d"]]
Trim each field:
iex> [" a , b \n"," c , d \n"]
...> |> Stream.map(&(&1))
...> |> CSV.decode(field_transform: &String.trim/1)
...> |> Enum.take(2)
[ok: ["a", "b"], ok: ["c", "d"]]
Map an existing stream of lines separated by a token to a stream of rows with a given header row:
iex> ["a;b\n","c;d\n", "e;f\n"]
...> |> Stream.map(&(&1))
...> |> CSV.decode(separator: ?;, headers: [:x, :y])
...> |> Enum.take(2)
[
  ok: %{:x => "a", :y => "b"},
  ok: %{:x => "c", :y => "d"}
]
@spec decode!(Enumerable.t(), [decode_options()]) :: Enumerable.t()
Decode a stream of comma-separated lines into a stream of rows. Decoding errors are raised immediately.
Options
These are the options:
:separator – The separator token to use, defaults to ?, (a comma). Must be a codepoint (syntax: ? + (your separator)).
:escape_character – The escape character token to use, defaults to ?" (a double quote). Must be a codepoint (syntax: ? + (your escape character)).
:field_transform – A function with arity 1 that will get called with each field and can apply transformations. Defaults to the identity function. This function will get called for every field and therefore should return quickly.
:headers – When set to true, will take the first row of the CSV and use it as header values. When set to a list, will use the given list as header values. When set to false (default), will use no header values. When set to anything but false, the resulting rows in the matrix will be maps instead of lists.
:validate_row_length – When set to true, will take the first row of the CSV or its headers and validate that following rows are of the same length. Will raise an error if validation fails. Defaults to false.
:unescape_formulas – When set to true, will remove formula escaping inserted to prevent CSV Injection.
Examples
Convert a file stream into a stream of rows, in the order of the given stream:
iex> "../test/fixtures/docs/valid.csv"
...> |> Path.expand(__DIR__)
...> |> File.stream!()
...> |> CSV.decode!()
...> |> Enum.take(2)
[["a","b","c"], ["d","e","f"]]
Read from a file with a Byte Order Mark (BOM):
iex> "../test/fixtures/utf8-with-bom.csv"
...> |> Path.expand(__DIR__)
...> |> File.stream!([:trim_bom])
...> |> CSV.decode!()
...> |> Enum.take(2)
[["a", "b"], ["d", "e"]]
Map an existing stream of lines separated by a token to a stream of rows with a header row:
iex> ["a;b\n","c;d\n", "e;f"]
...> |> Stream.map(&(&1))
...> |> CSV.decode!(separator: ?;, headers: true)
...> |> Enum.take(2)
[
  %{"a" => "c", "b" => "d"},
  %{"a" => "e", "b" => "f"}
]
Map a stream with custom escape characters:
iex> ["@a@,@b@\n","@c@,@d@\n"]
...> |> Stream.map(&(&1))
...> |> CSV.decode!(escape_character: ?@)
...> |> Enum.take(2)
[["a", "b"], ["c", "d"]]
Map an existing stream of lines separated by a token to a stream of rows with a given header row:
iex> ["a;b\n","c;d\n", "e;f"]
...> |> Stream.map(&(&1))
...> |> CSV.decode!(separator: ?;, headers: [:x, :y])
...> |> Enum.take(2)
[
  %{:x => "a", :y => "b"},
  %{:x => "c", :y => "d"}
]
Trim each field:
iex> [" a , b \n"," c , d \n"]
...> |> Stream.map(&(&1))
...> |> CSV.decode!(field_transform: &String.trim/1)
...> |> Enum.take(2)
[["a", "b"], ["c", "d"]]
Replace invalid codepoints:
iex> "../test/fixtures/broken-encoding.csv"
...> |> Path.expand(__DIR__)
...> |> File.stream!()
...> |> CSV.decode!(field_transform: fn field ->
...>   if String.valid?(field) do
...>     field
...>   else
...>     field
...>     |> String.codepoints()
...>     |> Enum.map(fn codepoint -> if String.valid?(codepoint), do: codepoint, else: "?" end)
...>     |> Enum.join()
...>   end
...> end)
...> |> Enum.take(2)
[["a", "b", "c", "?_?"], ["ಠ_ಠ"]]
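Since decode! raises instead of inlining errors, a caller that wants to recover has to wrap the point where the stream is consumed. The following is a hedged sketch under a few assumptions: the package is fetched with Mix.install/1, the input lines are hypothetical, and the exception is rescued generically because the concrete exception modules are not named in this section:

```elixir
# Assumption: the :csv package is fetched from Hex when the script runs.
Mix.install([{:csv, "~> 3.0"}])

# Hypothetical input: the second row has three fields, the first only two.
lines = ["a,b\n", "c,d,e\n"]

result =
  try do
    rows =
      lines
      |> Stream.map(& &1)
      |> CSV.decode!(validate_row_length: true)
      # The stream is only evaluated here, so this call must sit inside the try.
      |> Enum.to_list()

    {:ok, rows}
  rescue
    # Generic rescue: turn whatever exception was raised into an error tuple.
    error -> {:error, Exception.message(error)}
  end

IO.inspect(result)
```

Wrapping only the call to CSV.decode!/2 would not be enough in this sketch, because building the stream does not yet read any rows.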
@spec encode(Enumerable.t(), [encode_options()]) :: Enumerable.t()
Encode a table stream into a stream of RFC 4180 compliant CSV lines for writing to a file or other IO.
Options
These are the options:
:separator – The separator token to use, defaults to ?, (a comma). Must be a codepoint (syntax: ? + (your separator)).
:escape_character – The escape character token to use, defaults to ?" (a double quote). Must be a codepoint (syntax: ? + (your escape character)).
:delimiter – The delimiter token to use, defaults to "\r\n". Must be a string.
:force_escaping – When set to true, will escape fields even if they do not contain characters that require escaping.
:escape_formulas – When set to true, will escape formulas to prevent CSV Injection.
Examples
Convert a stream of rows with fields into a stream of lines:
iex> [~w(a b), ~w(c d)]
...> |> CSV.encode()
...> |> Enum.take(2)
["a,b\r\n", "c,d\r\n"]
Convert a stream of rows with fields with escape sequences into a stream of lines:
iex> [["a\nb", "\tc"], ["de", "\tf\""]]
...> |> CSV.encode(separator: ?\t, delimiter: "\n")
...> |> Enum.take(2)
["\"a\nb\"\t\"\tc\"\n", "de\t\"\tf\"\"\"\n"]
Convert a stream of rows with fields into a stream of lines forcing escaping with a custom character:
iex> [~w(a b), ~w(c d)]
...> |> CSV.encode(force_escaping: true, escape_character: ?@)
...> |> Enum.take(2)
["@a@,@b@\r\n", "@c@,@d@\r\n"]
Convert a stream of rows with fields with formulas into a stream of lines:
iex> [~w(@a =b), ~w(-c +d)]
...> |> CSV.encode(escape_formulas: true)
...> |> Enum.take(2)
["\"'@a\",\"'=b\"\r\n", "\"'-c\",\"'+d\"\r\n"]
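The examples above collect the encoded lines into a list. Since the stated purpose is writing to a file or other IO, here is a minimal sketch of piping the encoded lines into a file. It assumes the package is fetched with Mix.install/1, and the output path in the system temp directory is hypothetical:

```elixir
# Assumption: the :csv package is fetched from Hex when the script runs.
Mix.install([{:csv, "~> 3.0"}])

# Hypothetical output path in the system temp directory.
path = Path.join(System.tmp_dir!(), "csv_encode_example.csv")

# Encode the rows lazily and collect the resulting lines into the file.
[~w(a b), ~w(c d)]
|> CSV.encode()
|> Enum.into(File.stream!(path))

# Read the file back to check what was written, then clean up.
contents = File.read!(path)
IO.inspect(contents)
File.rm!(path)
```

Collecting into File.stream!/1 writes each line as it is produced, so the full encoded output never has to be held in memory at once.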