Changelog

Unicode String v1.4.1

This is the changelog for Unicode String v1.4.1 released on March 14th, 2024. For older changelogs, please consult the release tag on GitHub.

Bug Fixes

  • Fix performance regression in Unicode.String.Break.next/4. Adds the script bench/next.exs to allow for regression testing. Thanks to @mntns for the report. Closes #6.

Unicode String v1.4.0

This is the changelog for Unicode String v1.4.0 released on March 10th, 2024.

Enhancements

  • Adds dictionary-based word breaking for Chinese (zh, zh-Hant, zh-Hans, zh-Hant-HK, yue, yue-Hans), Japanese (ja), Thai (th), Lao (lo), Khmer (km) and Burmese (my). These languages don't typically use whitespace to separate words, so a dictionary lookup is more appropriate, although not perfect. The same dictionary is used for Chinese and Japanese. The dictionaries implemented are those used in CLDR, since they are under an open source license and this keeps the results consistent with ICU. Note that these dictionaries need to be downloaded with mix unicode.string.download.dictionaries prior to use. Each dictionary is parsed and loaded into persistent_term on demand. Each dictionary has a sizable memory footprint, as measured by :persistent_term.info/0:
    Dictionary    Memory (MB)
    Chinese       104.8
    Thai          9.6
    Lao           11.4
    Khmer         38.8
    Burmese       23.1

Unicode String v1.3.1

This is the changelog for Unicode String v1.3.1 released on March 6th, 2024.

Bug Fixes

Unicode String v1.3.0

This is the changelog for Unicode String v1.3.0 released on February 27th, 2024.

Bug Fixes

  • Fix case folding for codepoints that fold to themselves.

Enhancements

Unicode String v1.2.1

This is the changelog for Unicode String v1.2.1 released on June 2nd, 2023.

Bug Fixes

  • Resolve the segments directory at runtime, not compile time. Thanks to @crkent for the report. Closes #4.

Unicode String v1.2.0

This is the changelog for Unicode String v1.2.0 released on March 14th, 2023.

Enhancements

Unicode String v1.1.0

This is the changelog for Unicode String v1.1.0 released on September 21st, 2022.

Enhancements

  • Updates the segmentation supplemental data (including locales) for CLDR. This adds the "sv" and "fi" locale data for sentence break suppressions.

Unicode String v1.0.1

This is the changelog for Unicode String v1.0.1 released on September 15th, 2021.

Bug Fixes

  • Include the priv/segments directory in the build artifact, from which it was previously missing.

Unicode String v1.0.0

This is the changelog for Unicode String v1.0.0 released on September 14th, 2021.

Enhancements

Unicode String v0.3.0

This is the changelog for Unicode String v0.3.0 released on October 11th, 2020.

Bug Fixes

  • Correct dependencies and docs to align with Elixir 1.11 and recent releases of ex_unicode.

Unicode String v0.2.0

This is the changelog for Unicode String v0.2.0 released on July 12th, 2020.

Enhancements

  • Implements the Unicode break rules for graphemes, words, lines (word-wrapping) and sentences.

Unicode String v0.1.0

This is the changelog for Unicode String v0.1.0 released on May 17th, 2020.

Enhancements

  • Initial release