CSV.Decoding.Decoder (CSV v3.2.1)

The Decoder module feeds lines of delimited values from a stream to the parser and converts the rows coming back from the CSV parser module into a consumable stream.

Types

@type decode_options() :: CSV.decode_options()

Functions

decode(stream, options \\ [])

@spec decode(Enumerable.t(), [decode_options()]) :: Enumerable.t()

Decode a stream of comma-separated lines into a stream of rows that are either lists of fields or maps of headers to fields. The Decoder expects line or variable size byte stream input.

Options

These are the options:

  • :separator – The separator token to use, defaults to ?,. Must be a codepoint (syntax: ? + (your separator)).
  • :escape_character – The escape character token to use, defaults to ?". Must be a codepoint (syntax: ? + (your escape character)).
  • :escape_max_lines – The number of lines an escape sequence is allowed to span, defaults to 10.
  • :field_transform – A function of arity 1 that is called with every field and can transform it. Defaults to the identity function. Because it runs for every field, it should return quickly.
  • :headers – When set to true, will take the first row of the CSV and use it as header values. When set to a list, will use the given list as header values. When set to false (default), will use no header values. When set to anything but false, the resulting rows will be maps instead of lists.
  • :validate_row_length – When set to true, will take the first row of the CSV or its headers and validate that the following rows are of the same length. Defaults to false.
  • :unescape_formulas – When set to true, will remove formula escaping inserted to prevent CSV Injection.

Examples

Convert a stream of lines into a stream of rows:

iex> ["a,b\n","c,d\n"]
...> |> Stream.map(&(&1))
...> |> CSV.Decoding.Decoder.decode
...> |> Enum.take(2)
[ok: ["a", "b"], ok: ["c", "d"]]

Convert a stream with custom escape characters into a stream of rows:

iex> ["@a@,@b@\n","@c@,@d@\n"]
...> |> Stream.map(&(&1))
...> |> CSV.Decoding.Decoder.decode(escape_character: ?@)
...> |> Enum.take(2)
[ok: ["a", "b"], ok: ["c", "d"]]

Convert a line stream with escaped formulas into a stream of rows, removing the formula escaping:

iex> ["'@a,'=b\n","'-c,'+d\n"]
...> |> Stream.map(&(&1))
...> |> CSV.Decoding.Decoder.decode(unescape_formulas: true)
...> |> Enum.take(2)
[ok: ["@a", "=b"], ok: ["-c", "+d"]]

Trim each field:

iex> [" a , b   \n"," c   ,   d \n"]
...> |> Stream.map(&(&1))
...> |> CSV.Decoding.Decoder.decode(field_transform: &String.trim/1)
...> |> Enum.take(2)
[ok: ["a", "b"], ok: ["c", "d"]]

Read from a file with a Byte Order Mark (BOM):

iex> "../../../test/fixtures/utf8-with-bom.csv"
...> |> Path.expand(__DIR__)
...> |> File.stream!([:trim_bom])
...> |> CSV.Decoding.Decoder.decode()
...> |> Enum.take(2)
[ok: ["a", "b"], ok: ["d", "e"]]

Replace invalid codepoints:

iex> "../../../test/fixtures/broken-encoding.csv"
...> |> Path.expand(__DIR__)
...> |> File.stream!()
...> |> CSV.Decoding.Decoder.decode(field_transform: fn field ->
...>   if String.valid?(field) do
...>     field
...>   else
...>     field
...>     |> String.codepoints()
...>     |> Enum.map(fn codepoint -> if String.valid?(codepoint), do: codepoint, else: "?" end)
...>     |> Enum.join()
...>   end
...> end)
...> |> Enum.take(2)
[ok: ["a", "b", "c", "?_?"], ok: ["ಠ_ಠ"]]
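The inline transform above can also be pulled out into a named function, which makes it easier to test in isolation. The following is a minimal sketch: the module name FieldSanitizer and the function replace_invalid/1 are hypothetical and not part of the CSV library. It relies only on String.valid?/1 and String.codepoints/1 from Elixir's standard library.

```elixir
defmodule FieldSanitizer do
  # Hypothetical helper, not part of the CSV library.
  # Returns valid UTF-8 fields unchanged; otherwise replaces each
  # invalid byte with "?" and keeps the valid codepoints as they are.
  def replace_invalid(field) do
    if String.valid?(field) do
      field
    else
      field
      |> String.codepoints()
      |> Enum.map_join(fn codepoint ->
        if String.valid?(codepoint), do: codepoint, else: "?"
      end)
    end
  end
end

# 0xFF is never valid in UTF-8, so both outer bytes are replaced.
FieldSanitizer.replace_invalid(<<0xFF, ?_, 0xFF>>)
# => "?_?"
```

Under these assumptions it could be passed to the decoder as field_transform: &FieldSanitizer.replace_invalid/1.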

Map an existing stream of lines separated by a token to a stream of rows with a header row:

iex> ["a;b\n","c;d\n", "e;f\n"]
...> |> Stream.map(&(&1))
...> |> CSV.Decoding.Decoder.decode(separator: ?;, headers: true)
...> |> Enum.take(2)
[
  ok: %{"a" => "c", "b" => "d"},
  ok: %{"a" => "e", "b" => "f"}
]

Map an existing stream of lines separated by a token to a stream of rows with a header row with duplications:

iex> ["a;b;b\n","c;d;e\n", "f;g;h\n"]
...> |> Stream.map(&(&1))
...> |> CSV.Decoding.Decoder.decode(separator: ?;, headers: true)
...> |> Enum.take(2)
[
  ok: %{"a" => "c", "b" => ["d", "e"]},
  ok: %{"a" => "f", "b" => ["g", "h"]}
]
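As the output above shows, a duplicated header yields a list of values while a unique header yields a single string, so code consuming these maps has to handle both shapes. One possible way to smooth this over is to wrap every value in a list. This is only a sketch: HeaderRows and normalize/1 are hypothetical names, not part of the CSV library.

```elixir
defmodule HeaderRows do
  # Hypothetical helper, not part of the CSV library.
  # Wraps every value in a list so that unique and duplicated
  # headers have the same shape. List.wrap/1 leaves lists untouched.
  def normalize(row) when is_map(row) do
    Map.new(row, fn {header, value} -> {header, List.wrap(value)} end)
  end
end

# Stand-in for one decoded row from the example above.
HeaderRows.normalize(%{"a" => "c", "b" => ["d", "e"]})
# => %{"a" => ["c"], "b" => ["d", "e"]}
```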

Map an existing stream of lines separated by a token to a stream of rows with a given header row:

iex> ["a;b\n","c;d\n", "e;f\n"]
...> |> Stream.map(&(&1))
...> |> CSV.Decoding.Decoder.decode(separator: ?;, headers: [:x, :y])
...> |> Enum.take(2)
[
  ok: %{:x => "a", :y => "b"},
  ok: %{:x => "c", :y => "d"}
]

Decode a CSV string:

iex> ["id,name\r\n1,Jane\r\n2,George\r\n3,John"]
...> |> CSV.Decoding.Decoder.decode(headers: true)
...> |> Enum.map(&(&1))
[
  ok: %{"id" => "1", "name" => "Jane"},
  ok: %{"id" => "2", "name" => "George"},
  ok: %{"id" => "3", "name" => "John"}
]
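
Every element in the results above is a tagged tuple: [ok: row] is keyword-list notation for [{:ok, row}]. The sketch below separates decoded rows from failures. It assumes that failed rows are emitted as {:error, message} tuples, which the examples on this page do not show, and it uses a hypothetical module name, DecodeResults, that is not part of the CSV library. The input is a hand-written stand-in for the decoder's output so the snippet runs without the library installed.

```elixir
defmodule DecodeResults do
  # Hypothetical helper, not part of the CSV library.
  # Splits a stream of tagged tuples into {rows, errors}, keeping order.
  # Assumes failures arrive as {:error, reason} tuples.
  def split(results) do
    {rows, errors} =
      Enum.reduce(results, {[], []}, fn
        {:ok, row}, {rows, errors} -> {[row | rows], errors}
        {:error, reason}, {rows, errors} -> {rows, [reason | errors]}
      end)

    {Enum.reverse(rows), Enum.reverse(errors)}
  end
end

# Stand-in for the output of CSV.Decoding.Decoder.decode/2.
results = [ok: ["a", "b"], error: "example failure", ok: ["c", "d"]]

{rows, errors} = DecodeResults.split(results)
# rows   => [["a", "b"], ["c", "d"]]
# errors => ["example failure"]
```

Note that this collects the whole stream into memory; for large inputs it may be preferable to handle each tuple as it arrives, for example with Stream.each/2.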