Computes implicit collation elements for codepoints not in the DUCET/CLDR allkeys table.
The UCA defines an algorithm for computing implicit weights for:
CJK Unified Ideographs (Han characters).
Hangul syllables (decomposed algorithmically).
Unassigned codepoints.
See UTS #10 Section 10.1 for the implicit weight computation.
Summary
Functions
compute/1 - Compute implicit collation elements for a codepoint not in the allkeys table.
decompose_hangul_to_jamo/1 - Decompose a Hangul syllable into its constituent jamo codepoints.
hangul_syllable?/1 - Check if a codepoint is a Hangul syllable.
unified_ideograph?/1 - Check if a codepoint is a CJK Unified Ideograph.
Functions
@spec compute(non_neg_integer()) :: {:hangul_decompose, [non_neg_integer()]} | [Localize.Collation.Element.t()]
Compute implicit collation elements for a codepoint not in the allkeys table.
Arguments
cp - an integer codepoint.
Returns
{:hangul_decompose, jamo} - for Hangul syllables, where jamo is the list of constituent jamo codepoints.
[element, element] - two implicit CEs for CJK or unassigned codepoints.
Examples
iex> [ce1, ce2] = Localize.Collation.ImplicitWeights.compute(0x4E00)
iex> Localize.Collation.Element.primary(ce1) >= 0xFB40
true
iex> Localize.Collation.Element.secondary(ce2)
0
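The two-element result for CJK and unassigned codepoints follows the implicit weight formula in UTS #10 Section 10.1. The sketch below illustrates that formula only and is not this module's implementation: the module name `ImplicitWeightSketch`, the function `implicit_elements/1`, and the `{primary, secondary, tertiary}` tuples are hypothetical stand-ins for `Localize.Collation.Element`. It also classifies codepoints by three block ranges rather than by the Unified_Ideograph property, and it omits the other ideograph extensions, the compatibility ideographs, and the separate Tangut, Nushu, and Khitan rules.

```elixir
defmodule ImplicitWeightSketch do
  # Hypothetical illustration of the UTS #10 implicit weight formula.
  import Bitwise

  # Returns two {primary, secondary, tertiary} tuples for a codepoint:
  #   [.AAAA.0020.0002][.BBBB.0000.0000]
  def implicit_elements(cp) when is_integer(cp) and cp >= 0 do
    # AAAA: the base for the codepoint's class plus the high bits.
    aaaa = base(cp) + (cp >>> 15)
    # BBBB: the low 15 bits with the top bit set.
    bbbb = (cp &&& 0x7FFF) ||| 0x8000
    [{aaaa, 0x0020, 0x0002}, {bbbb, 0x0000, 0x0000}]
  end

  # Simplified classification (an assumption of this sketch).
  # Core CJK Unified Ideographs block.
  defp base(cp) when cp in 0x4E00..0x9FFF, do: 0xFB40
  # Extension A and Extension B only; other extensions are omitted here.
  defp base(cp) when cp in 0x3400..0x4DBF or cp in 0x20000..0x2A6DF, do: 0xFB80
  # Everything else is treated as unassigned.
  defp base(_cp), do: 0xFBC0
end
```

Under these assumptions, `ImplicitWeightSketch.implicit_elements(0x4E00)` gives a first primary weight of `0xFB40` and a second element whose secondary weight is `0`, which is consistent with the doctest above.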
@spec decompose_hangul_to_jamo(non_neg_integer()) :: [non_neg_integer()]
Decompose a Hangul syllable into its constituent jamo codepoints.
Arguments
cp - an integer codepoint for a Hangul syllable (U+AC00..U+D7A3).
Returns
A list of 2 or 3 jamo codepoints: [lead, vowel] or [lead, vowel, trail].
Examples
iex> Localize.Collation.ImplicitWeights.decompose_hangul_to_jamo(0xAC00)
[0x1100, 0x1161]
iex> Localize.Collation.ImplicitWeights.decompose_hangul_to_jamo(0xAC01)
[0x1100, 0x1161, 0x11A8]
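The decomposition is plain arithmetic on the syllable's offset from U+AC00, as defined by the Unicode Standard's Hangul syllable decomposition. The following is a minimal sketch of that arithmetic, not necessarily how this module implements it; the module name `HangulSketch` and the function `decompose/1` are hypothetical.

```elixir
defmodule HangulSketch do
  # Hypothetical illustration of arithmetic Hangul decomposition.
  @s_base 0xAC00   # first precomposed syllable
  @l_base 0x1100   # first leading consonant jamo
  @v_base 0x1161   # first vowel jamo
  @t_base 0x11A7   # one before the first trailing consonant jamo
  @v_count 21
  @t_count 28
  @n_count @v_count * @t_count   # 588 syllables per leading consonant

  # Only defined for the precomposed syllable range U+AC00..U+D7A3.
  def decompose(cp) when cp in 0xAC00..0xD7A3 do
    s_index = cp - @s_base
    lead = @l_base + div(s_index, @n_count)
    vowel = @v_base + div(rem(s_index, @n_count), @t_count)
    t_index = rem(s_index, @t_count)

    # A trailing index of 0 means the syllable has no trailing consonant.
    case t_index do
      0 -> [lead, vowel]
      _ -> [lead, vowel, @t_base + t_index]
    end
  end
end
```

Every 28th syllable, starting at U+AC00, has a trailing index of 0 and therefore yields two jamo; all other syllables yield three.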
@spec hangul_syllable?(non_neg_integer()) :: boolean()
Check if a codepoint is a Hangul syllable.
Arguments
cp - an integer codepoint.
Returns
true if the codepoint is a Hangul syllable (U+AC00..U+D7A3).
false otherwise.
Examples
iex> Localize.Collation.ImplicitWeights.hangul_syllable?(0xAC00)
true
iex> Localize.Collation.ImplicitWeights.hangul_syllable?(0x0041)
false
@spec unified_ideograph?(non_neg_integer()) :: boolean()
Check if a codepoint is a CJK Unified Ideograph.
Arguments
cp - an integer codepoint.
Returns
true if the codepoint is a CJK Unified Ideograph.
false otherwise.
Examples
iex> Localize.Collation.ImplicitWeights.unified_ideograph?(0x4E00)
true
iex> Localize.Collation.ImplicitWeights.unified_ideograph?(0x0041)
false