Unicode.String.Case.Mapping (Unicode String v1.4.1)

The Unicode Case Mapping algorithm defines the process and data to transform text into upper case, lower case or title case.

Since most scripts are not bicameral (they do not distinguish upper and lower case), characters which have no appropriate mapping remain unchanged.
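
As a minimal illustration of caseless characters passing through unchanged, the sketch below uses Elixir's standard `String.upcase/1` rather than this module, on the assumption that both treat unmapped characters the same way. The module name `CaselessExample` and the function `shout/1` are hypothetical and exist only for this example.

```elixir
defmodule CaselessExample do
  # Hypothetical helper for illustration only. It calls the standard
  # library, not Unicode.String.Case.Mapping.
  def shout(text), do: String.upcase(text)
end

# The kanji have no case mapping and are returned as-is;
# only the Latin letters change.
IO.puts(CaselessExample.shout("東京 tokyo"))
# prints: 東京 TOKYO
```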

Three case mapping functions are provided as a public API, with their implementations in this module:

  • downcase/2
  • titlecase/2
  • upcase/2

Each function is locale-aware and implements the following basic language-specific rules:

  • Casing rules for the Turkish dotted capital I and dotless small i.
  • Casing rules for the retention of dots over i for Lithuanian letters with additional accents.
  • Titlecasing of IJ at the start of words in Dutch.
  • Removal of accents when upper casing letters in Greek.

There are other casing rules that are not currently implemented such as:

  • Titlecasing of second or subsequent letters in words in orthographies that include caseless letters such as apostrophes.
  • Uppercasing of U+00DF ß latin small letter sharp s to U+1E9E latin capital letter sharp s.
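
Where the capital sharp s is wanted, one possible workaround is to substitute it before upcasing. The sketch below uses only Elixir's standard `String` module; `SharpSExample` and `upcase_with_capital_sharp_s/1` are hypothetical names and not part of this library. It relies on U+1E9E having no further upper case mapping, so it survives the upcase step.

```elixir
defmodule SharpSExample do
  # Hypothetical workaround, not part of Unicode.String.Case.Mapping.
  # Replace U+00DF (ß) with U+1E9E (ẞ) first, then upcase the rest
  # with the standard library.
  def upcase_with_capital_sharp_s(string) do
    string
    |> String.replace("ß", "ẞ")
    |> String.upcase()
  end
end

IO.puts(SharpSExample.upcase_with_capital_sharp_s("straße"))
# prints: STRAẞE
```

Without the substitution, the standard `String.upcase/1` expands ß to SS, so this approach is only appropriate when the single capital letter is explicitly preferred.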

Examples

# Basic case transformation
iex> Unicode.String.Case.Mapping.upcase("the quick brown fox")
"THE QUICK BROWN FOX"

# Dotted-I in Turkish and Azeri
iex> Unicode.String.Case.Mapping.upcase("Diyarbakır", :tr)
"DİYARBAKIR"

# Upper case in Greek removes diacritics
iex> Unicode.String.Case.Mapping.upcase("Πατάτα, Αέρας, Μυστήριο", :el)
"ΠΑΤΑΤΑ, ΑΕΡΑΣ, ΜΥΣΤΗΡΙΟ"

# Lower case Greek with a final sigma
iex> Unicode.String.Case.Mapping.downcase("ὈΔΥΣΣΕΎΣ", :el)
"ὀδυσσεύς"

# Title case Dutch with leading IJ digraph
iex> Unicode.String.Case.Mapping.titlecase("ijsselmeer", :nl)
"IJsselmeer"

Summary

Functions

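downcase(string, language \\ :any)

Replace upper case characters with their lower case equivalents.

titlecase(string, language \\ :any)

Apply the Unicode title case algorithm.

upcase(string, language \\ :any)

Replace lower case characters with their upper case equivalents.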

Functions

downcase(string, language \\ :any)

Replace upper case characters with their lower case equivalents.

titlecase(string, language \\ :any)

Apply the Unicode title case algorithm.

upcase(string, language \\ :any)

Replace lower case characters with their upper case equivalents. All other characters remain unchanged.

For the Greek language (:el), all accents are removed prior to capitalization, as is normal practice for this language.
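
The general idea of removing accents before upcasing can be approximated with the standard library alone. The following is a simplified sketch, not this module's implementation: it decomposes the string, strips every nonspacing mark, and then upcases. `GreekUpcaseSketch` and `upcase_without_accents/1` are hypothetical names, and a real Greek implementation may need to preserve some marks (for example a dialytika) that this sketch removes.

```elixir
defmodule GreekUpcaseSketch do
  # Simplified illustration only; not how Unicode.String.Case.Mapping
  # is implemented.
  def upcase_without_accents(string) do
    string
    # Decompose so each accent becomes a separate combining mark.
    |> String.normalize(:nfd)
    # Drop all nonspacing marks (Unicode general category Mn).
    |> String.replace(~r/\p{Mn}/u, "")
    |> String.upcase()
    # Recompose anything that remains decomposed.
    |> String.normalize(:nfc)
  end
end

IO.puts(GreekUpcaseSketch.upcase_without_accents("Πατάτα, Αέρας, Μυστήριο"))
# prints: ΠΑΤΑΤΑ, ΑΕΡΑΣ, ΜΥΣΤΗΡΙΟ
```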