Changelog
v0.2.2 (2026-01-29)
Fixed
- NIF safety: Replaced .unwrap() calls in encode_batch with proper error propagation via NifResult, preventing potential BEAM crashes on memory allocation failure
- Documentation: Removed unsupported HZ encoding from README (not in WHATWG/encoding_rs)
- Documentation: Clarified "200+ encodings" claim — the library supports 40 distinct WHATWG encodings with 200+ label aliases
- Documentation: Fixed Decoder.stream/2 docs that incorrectly claimed 1:1 output-to-input correspondence; the stream may emit an extra element when flushing buffered bytes
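The flush behavior described in the last item can be illustrated without the library. The sketch below is a hypothetical stand-in: the FlushSketch module and its UTF-8-only logic are assumptions for illustration, not the code behind Decoder.stream/2. It emits one element per input chunk and, when the input ends in the middle of a character, one extra element from the last_fun callback of Stream.transform/5 (Elixir 1.14 or later).

```elixir
defmodule FlushSketch do
  # Hypothetical stand-in for a stateful stream decoder (UTF-8 only).
  # It is NOT the library's implementation; it only shows why a final
  # flush can add one element beyond the number of input chunks.
  def stream(chunks) do
    Stream.transform(
      chunks,
      # Start with no buffered bytes.
      fn -> <<>> end,
      # Decode each chunk together with bytes left over from the previous one.
      fn chunk, pending -> decode(pending <> chunk, "") end,
      # Flush: leftover bytes at end of input become one replacement character.
      fn
        <<>> -> {[], <<>>}
        _leftover -> {["\uFFFD"], <<>>}
      end,
      # Nothing to clean up.
      fn _pending -> :ok end
    )
  end

  # Returns {[decoded_text], incomplete_trailing_bytes}.
  defp decode(bytes, acc) do
    case :unicode.characters_to_binary(bytes) do
      {:incomplete, ok, rest} -> {[acc <> ok], rest}
      # Replace one invalid byte and keep decoding the remainder.
      {:error, ok, <<_bad, rest::binary>>} -> decode(rest, acc <> ok <> "\uFFFD")
      ok -> {[acc <> ok], <<>>}
    end
  end
end

# Two chunks that split one character: two elements out.
clean = FlushSketch.stream([<<0xE3, 0x81>>, <<0x82>>]) |> Enum.to_list()

# Input that ends mid-character: three elements out for two chunks.
truncated = FlushSketch.stream(["a", <<0xE3, 0x81>>]) |> Enum.to_list()

IO.inspect({length(clean), length(truncated)})
```

Code that pairs output elements with input chunks (for example with Enum.zip/2) should therefore not assume both sides have the same length.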
Improved
- Rust DRY refactor: Extracted shared decoder_decode_chunk_impl to eliminate duplicated logic between decoder_decode_chunk and decoder_decode_chunk_dirty NIF functions
- Elixir DRY refactor: Extracted route_nif/4 helper to eliminate duplicated dirty-scheduler routing in encode/2 and decode/2
- Elixir DRY refactor: Extracted normalize_result/1 helper to unify error normalization across encode/2, decode/2, encode_batch/1, and decode_batch/1
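As a rough illustration of the routing idea, the sketch below picks between two functions based on input size. Everything in it is an assumption: the RoutingSketch module, the argument order, and the runtime lookup via Application.get_env/3 are hypothetical, and the real route_nif/4 may look different. Only the 64KB default and the dirty_threshold key come from this changelog.

```elixir
defmodule RoutingSketch do
  # Default threshold stated in this changelog: 64KB.
  @default_threshold 64 * 1024

  # Hypothetical helper: the real route_nif/4 may take different arguments.
  # Calls dirty_fun for inputs larger than the threshold, normal_fun otherwise.
  def route(data, encoding, normal_fun, dirty_fun) when is_binary(data) do
    threshold = Application.get_env(:encoding_rs, :dirty_threshold, @default_threshold)

    if byte_size(data) > threshold do
      dirty_fun.(data, encoding)
    else
      normal_fun.(data, encoding)
    end
  end
end

# Stub functions stand in for the regular and dirty-scheduler NIFs.
normal = fn _data, _encoding -> :normal end
dirty = fn _data, _encoding -> :dirty end

small = RoutingSketch.route("abc", "utf-8", normal, dirty)
large = RoutingSketch.route(:binary.copy("a", 64 * 1024 + 1), "utf-8", normal, dirty)

IO.inspect({small, large})
```

Keeping the size check in one helper means encode/2 and decode/2 cannot drift apart on where the threshold applies.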
Testing
- Added stream flush test verifying extra element emission for incomplete trailing multibyte sequences
- Added stream flush test verifying no extra element when stream ends cleanly
- Added stream_with_errors/2 flush test verifying had_errors: true on flushed replacement characters
v0.2.1 (2026-01-22)
Fixed
- Fixed precompiled binary checksums that were mismatched with release artifacts
Documentation
- Added Library Comparison Guide with benchmarks against codepagex and iconv
- Added benchmark results to README showing 3-15x performance improvement over alternatives
- Added bench/comparison_bench.exs benchmark suite for reproducing results
v0.2.0 (2026-01-22)
Added
- Batch processing API - Process multiple items in a single NIF call for improved throughput
  - EncodingRs.decode_batch/1 - Decode multiple {binary, encoding} tuples
  - EncodingRs.encode_batch/1 - Encode multiple {string, encoding} tuples
  - Always uses dirty CPU schedulers (see Batch Processing Guide)
- Configurable dirty threshold - The threshold for switching to dirty schedulers is now configurable via config.exs:
    config :encoding_rs, dirty_threshold: 128 * 1024
  Default remains 64KB. See documentation for guidance on increasing vs decreasing.
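A minimal sketch of calling the batch API follows. The input shape (a list of {binary, encoding} tuples) comes from the entry above; the sample bytes are illustrative, and the shape of the return value is not described in this changelog, so the result is only inspected. The snippet assumes the encoding_rs dependency is available in the project and skips the call otherwise.

```elixir
# Input shape from the changelog: a list of {binary, encoding} tuples.
items = [
  # "あ" encoded as Shift_JIS
  {<<0x82, 0xA0>>, "shift_jis"},
  # "café" encoded as windows-1252
  {<<0x63, 0x61, 0x66, 0xE9>>, "windows-1252"}
]

# Guarded so the snippet also runs where the :encoding_rs dependency is absent.
# The return shape is not stated in this changelog, so it is only inspected.
if Code.ensure_loaded?(EncodingRs) do
  items |> EncodingRs.decode_batch() |> IO.inspect(label: "decode_batch/1")
end
```

Because batch calls always run on dirty CPU schedulers, the dirty_threshold setting should not affect them.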
Documentation
- Added Batch Processing Guide with usage examples, performance tips, and known limitations
v0.1.0 (2026-01-22)
Initial release of encoding_rs, a fork of excoding with significant improvements.
Why This Fork?
The original excoding package used the encoding Rust crate (unmaintained since 2018). This fork replaces it with encoding_rs - Mozilla's actively maintained encoding library used by Firefox.
Features
- High-performance encoding/decoding using Rust's encoding_rs library
- Streaming decoder (EncodingRs.Decoder): Stateful decoder for chunked data that properly handles multibyte characters split across chunk boundaries
  - EncodingRs.Decoder.new/1 - Create a stateful decoder
  - EncodingRs.Decoder.decode_chunk/3 - Decode a chunk with state preservation
  - EncodingRs.Decoder.stream/2 - Stream transformer for use with File.stream!/3
- BOM detection: Detect encoding from Byte Order Marks
  - detect_bom/1 - Detect BOM and return encoding name and length
  - detect_and_strip_bom/1 - Detect and strip BOM from data
- Dirty schedulers: Operations on binaries >64KB use dirty CPU schedulers
- Precompiled binaries: Available for 10 platforms across NIF versions 2.15-2.17
API
# One-shot encoding/decoding
{:ok, string} = EncodingRs.decode(binary, "shift_jis")
{:ok, binary} = EncodingRs.encode(string, "windows-1252")
# Bang variants
string = EncodingRs.decode!(binary, "shift_jis")
binary = EncodingRs.encode!(string, "windows-1252")
# Streaming decoder for chunked data
File.stream!("data.txt", [], 4096)
|> EncodingRs.Decoder.stream("shift_jis")
|> Enum.join()
# BOM detection
{:ok, "UTF-8", 3} = EncodingRs.detect_bom(<<0xEF, 0xBB, 0xBF, "hello">>)
# Utilities
EncodingRs.encoding_exists?("utf-8") # true
EncodingRs.canonical_name("latin1") # {:ok, "windows-1252"}
EncodingRs.list_encodings() # ["UTF-8", "Shift_JIS", ...]
Supported Encodings
All encodings from the WHATWG Encoding Standard:
- UTF-8, UTF-16LE, UTF-16BE
- Windows code pages (874, 1250-1258)
- ISO-8859 family (1-16)
- Asian: Shift_JIS, EUC-JP, ISO-2022-JP, EUC-KR, GBK, GB18030, Big5
- And more
Acknowledgments
- excoding - Original project by Kevin Seidel
- encoding_rs - Mozilla's Rust encoding library