Changelog

3.2.1 (2023-11-26)

3.2.0 (2023-09-24)

  • Strict mode: Messages of thrown exceptions are now redacted by default so that data does not unintentionally leak into logs. This behaviour change is not considered a break in backwards compatibility, since source data presented in exception messages is not considered part of the CSV public API.
  • Strict mode: Exception messages can be unredacted using the unredact_exceptions option
  • Normal mode: Error messages can be redacted using the redact_errors option
  • Option to (un)redact exception messages, contributed in #122 by @taylor-redden-papa
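
The two options above could be used roughly as follows. This is a minimal sketch, assuming the csv package (3.2.0 or later) is available as a dependency; the module and function names are hypothetical, and only the option names come from the entries above.

```elixir
defmodule RedactionExample do
  # Hypothetical wrapper module, used only for illustration.
  # Assumes the csv package (>= 3.2.0) is a dependency of the project.

  # Normal mode: error tuples are returned, with source data redacted
  # from the error messages.
  def decode_redacted(stream) do
    CSV.decode(stream, redact_errors: true)
  end

  # Strict mode: exceptions are raised, and the default redaction of
  # exception messages is switched off again.
  def decode_unredacted!(stream) do
    CSV.decode!(stream, unredact_exceptions: true)
  end
end
```

Unredacted messages can contain source data, so the second variant is best kept to environments where that data may safely appear in logs.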

3.0.5 (2022-12-03)

  • Exclude dialyzer files from library package, contributed in #121 by @milmazz

3.0.4 (2022-11-19)

  • Add missing escape_max_lines to decode options typespec. Closes #120

3.0.3 (2022-11-04)

  • Ensure that reparsing of lines with stray escape characters does not produce duplicate error output. Closes #119
  • Deduplication of type specs in #118 contributed by @joseph-lozano
  • Documentation fixes and improvements contributed by @jamesvl in #115

3.0.2 (2022-11-03)

  • Ensure that escaped fields as the last field on the last line without a newline are included in the results - fixes #117 raised by @superhawk610

3.0.1 (2022-10-25)

  • Ensure that stray escape quotes and unterminated escape sequences on a last line without a newline produce errors

3.0.0 (2022-10-25)

  • The parallel parser/lexer has been replaced with a binary matching parser with better performance.
  • A new :field_transform option takes a function that is applied to every field as it is decoded
  • Escape characters can now be specified using the :escape_character option. Closes #59
  • The library will now reparse lines that follow e.g. an unterminated escape sequence. This ensures that all possible valid rows will be returned in normal mode
  • Encoding checks have been removed because they can either be done using :field_transform or outside the library
  • Better docs

Upgrading from 2.x

  • Parallelism has been removed, alongside its options :num_workers and :worker_work_ratio. You can safely remove them.
  • StrayQuoteError is now StrayEscapeCharacterError. If you catch this error in your code, you need to rename it.
  • The :strip_fields option needs to be replaced with the :field_transform option:
    File.stream!("data.csv") |> CSV.decode(field_transform: &String.trim/1)
  • :validate_row_length now defaults to false. This option produces an error for rows of differing lengths. Set it to true to get the same behaviour as in 2.x
  • :escape_formulas is now :unescape_formulas for decode and decode!. It is still :escape_formulas for encode. Change :escape_formulas to :unescape_formulas in decode calls to get the same behaviour as in 2.x
  • :escape_max_lines now defaults to 10 instead of 1000. To get the same behaviour as in 2.x, use:
    File.stream!("data.csv") |> CSV.decode(escape_max_lines: 1000)
  • :replace has been removed. CSV will now return fields with incorrect encoding as-is. You can use the new :field_transform option to provide a function that transforms fields while they are being parsed. This allows you to, for example, replace incorrect encoding:
    defp replace_bad_encoding(field) do
      if String.valid?(field) do
        field
      else
        field
        |> String.codepoints()
        |> Enum.map(fn codepoint -> if String.valid?(codepoint), do: codepoint, else: "?" end)
        |> Enum.join()
      end
    end
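
The upgrade notes above can be collected in one place. The following is a sketch only: the module and function names are hypothetical, and it assumes you want 2.x-like defaults together with both field stripping and encoding replacement. The option names and values are the ones listed above.

```elixir
defmodule LegacyDecodeOptions do
  # Hypothetical helper, not part of the CSV library.

  # Decode options that the upgrade notes describe as restoring 2.x behaviour.
  def options do
    [
      validate_row_length: true,
      escape_max_lines: 1000,
      field_transform: &transform_field/1
    ]
  end

  # Stands in for the removed :replace and :strip_fields options:
  # first replace invalid bytes, then trim surrounding whitespace.
  def transform_field(field) do
    field
    |> replace_bad_encoding()
    |> String.trim()
  end

  # Same idea as the snippet above, written with Enum.map_join/2.
  def replace_bad_encoding(field) do
    if String.valid?(field) do
      field
    else
      field
      |> String.codepoints()
      |> Enum.map_join(fn codepoint ->
        if String.valid?(codepoint), do: codepoint, else: "?"
      end)
    end
  end
end
```

With the csv package available, the options would be passed along the lines of `File.stream!("data.csv") |> CSV.decode(LegacyDecodeOptions.options())`. If your 2.x decode calls used :escape_formulas, add `unescape_formulas: true` to the list as described above.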

2.5.0 (2022-09-17)

  • Optional parameter escape_formulas to prevent CSV injection. Fixes #103 reported by @maennchen. Contributed by @maennchen in PR #104.
  • Optional parameter force_quotes to force quotes when encoding, contributed by @stuart
  • Bugfix to pass non-UTF-8 lines through in normal mode so other lines can be processed. Fixes #107. Contributed by @al2o3cr.
  • Allow encoding keyword lists that specify headers as values, contributed by @michaelchu
  • Better docs thanks to @kianmeng

2.4.1 (2020-09-12)

2.4.0 (2020-09-12)

  • Fix StrayQuoteError not getting passed the correct arguments in strict mode. Fixes #96.
  • When headers are present multiple times and the :headers option is set to true, parse the values into a list. Contributed by @MrAlexLau in PR #97.
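
To illustrate the shape described in the last entry, here is a small sketch in plain Elixir. It is not the library's implementation: the module name is hypothetical, and it assumes that a header occurring only once keeps a plain value rather than a one-element list.

```elixir
defmodule DuplicateHeaders do
  # Hypothetical illustration, not the CSV library's own code.
  # Builds a map from a header row and a data row, collecting the values
  # of repeated headers into a list in order of appearance.
  def to_map(headers, row) do
    headers
    |> Enum.zip(row)
    |> Enum.reduce(%{}, fn {key, value}, acc ->
      Map.update(acc, key, value, fn
        existing when is_list(existing) -> existing ++ [value]
        existing -> [existing, value]
      end)
    end)
  end
end

# "a" occurs twice in the headers, so its values end up in a list.
DuplicateHeaders.to_map(["a", "a", "b"], ["1", "2", "3"])
```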

2.3.1 (2019-03-30)

2.3.0 (2019-03-17)

2.2.0 (2019-03-03)

  • Make syntax compatible with latest Elixir releases
  • Add :validate_row_length option, defaulting to true, to allow disabling validation of row length.

2.0.0 (2017-05-29)

  • Make decode return row and error tuples instead of raising errors directly
  • Make old behaviour of raising errors directly available via decode!
  • Improve error messages for escape sequences
  • Rewrite parts of the pipeline to be more modular
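
Because decode yields row and error tuples, callers can decide how to treat each. The sketch below separates the two kinds of result; the module name is hypothetical, and a hand-written list with a made-up error message stands in for real decode output.

```elixir
defmodule DecodeResults do
  # Hypothetical helper, not part of the CSV library.
  # Splits an enumerable of {:ok, row} and {:error, message} tuples into
  # {rows, errors}, keeping the original order within each list.
  def split(results) do
    {oks, errs} =
      Enum.reduce(results, {[], []}, fn
        {:ok, row}, {oks, errs} -> {[row | oks], errs}
        {:error, message}, {oks, errs} -> {oks, [message | errs]}
      end)

    {Enum.reverse(oks), Enum.reverse(errs)}
  end
end

# Stand-in for decoded output; the error text is invented for this example.
results = [{:ok, ["a", "b"]}, {:error, "example error message"}, {:ok, ["c", "d"]}]
{rows, errors} = DecodeResults.split(results)
```

Code that prefers the previous raise-on-error behaviour can call decode! instead and skip the tuple handling.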

1.4.4 (2016-11-12)

1.4.3 (2016-08-27)

1.4.2 (2016-06-20)

1.4.1 (2016-05-21)

  • Fix condition where rows would be dropped when decoding from stateful streams. See #39 reported by @moxley

1.4.0 (2016-04-03)

1.3.3 (2016-03-25)

1.3.2 (2016-03-08)

  • Cleanup, removing some unused defaults in function headers to remove compile time warnings

1.3.1 (2016-03-08)

  • Fix :strip_cells not stripping cells when multiple options are specified - #29 by @tomjoro

1.3.0 (2016-03-01)

  • Now supports linebreaks inside escaped fields (#13)
  • Raises an error when row length mismatches across rows
  • Uses parallel_stream for parallelism

1.2.4 (2016-02-06)

  • Fix encoding of double quotes

1.2.3 (2016-01-19)

  • Fix a condition where headers: true would enumerate the whole file once before parsing

1.2.2 (2016-01-02)

  • Fix the default num_pipes argument so that it is evaluated at runtime, dependent on the scheduler
  • Test utf-8 files with BOM
  • Syntax and mix updates for elixir 1.2

1.2.1 (2015-10-17)

  • Decoder performance optimisations

1.2.0 (2015-10-11)

1.1.5 (2015-10-11)

  • Decoder refactor from Stream.resource/3 to Stream.transform/3 in order to get more predictable stream behaviour
  • Rows now get processed in order
  • Fix a bug where stream would get evaluated before being decoded

1.1.4 (2015-09-13)

  • Fix a bug where headers could be out of order

1.1.3 (2015-09-12)

  • Fix a bug where headers could get parsed as the first row

1.1.2 (2015-09-05)

  • Fix a bug where calls to decode with num_pipes: 1 would yield varying results due to leftover state in decoder message queue

1.1.1 (2015-07-14)

  • Rescue from errors in stream producer to get more predictable behaviour in case of failure

1.1.0 (2015-07-12)

  • Better error messages when encountering invalid encodings

1.0.1 (2015-07-11)

  • Indicate consolidate_protocols for better encoding performance

1.0.0 (2015-05-24)

  • Use bytes as separators

0.2.3 (2015-05-24)

  • Add benchmarking

0.2.2 (2015-05-20)

  • Use utf-8 bytes instead of codepoints for multi-byte parsing

0.2.1 (2015-05-20)

  • Fix handling of multi-byte utf-8 characters

0.2.0 (2015-03-25)

  • Implement encoder protocol