Unicode.String.Case.Mapping.Greek (Unicode String v1.4.1)
Implements the special uppercasing rules for the Greek language.
Summary
Functions
The current implementation follows the el-Upper transform from CLDR.
CLDR algorithm
According to CLDR, all accents on all characters are omitted when upcasing.
Remove 0301 following Greek, with possible intervening 0308 marks.
::NFD();
For uppercasing (not titlecasing!) remove all greek accents from greek letters. This is done in two groups, to account for canonical ordering.
[:Greek:] [^[:ccc=Not_Reordered:][:ccc=Above:]]? { [̓̔́̀̆͂̈̄] → ;
[:Greek:] [^[:ccc=Not_Reordered:][:ccc=Iota_Subscript:]]? { ͅ → ;
::NFC();
In short, that transform removes all accents except a subscripted iota. It does not handle diphthongs correctly.
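The effect of the accent-stripping step can be illustrated with a rough sketch. It is written in Python purely for illustration and is not this library's Elixir implementation. The function name `el_upper_sketch`, the block-range test standing in for `[:Greek:]`, and the decision to handle only the first rule group (leaving the iota subscript untouched) are all assumptions made for this example. The sketch is also looser than the CLDR rule about which intervening marks it tolerates.

```python
import unicodedata

# Combining marks removed by the first rule group quoted above:
# U+0313, U+0314, U+0301, U+0300, U+0306, U+0342, U+0308, U+0304.
ACCENTS = {"\u0313", "\u0314", "\u0301", "\u0300",
           "\u0306", "\u0342", "\u0308", "\u0304"}


def is_greek_letter(ch):
    # Assumption: approximate [:Greek:] by the Greek and Greek Extended blocks.
    return "\u0370" <= ch <= "\u03ff" or "\u1f00" <= ch <= "\u1fff"


def el_upper_sketch(text):
    """Decompose, drop accents that follow a Greek base letter, upcase, recompose."""
    out = []
    after_greek = False
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            # A combining mark: drop it only if it belongs to a Greek base
            # letter and is one of the listed accents.
            if after_greek and ch in ACCENTS:
                continue
            out.append(ch)
        else:
            # A base character: remember whether it is Greek.
            after_greek = is_greek_letter(ch)
            out.append(ch)
    return unicodedata.normalize("NFC", "".join(out).upper())
```

With this sketch, `el_upper_sketch("άι")` yields `"ΑΙ"` and `el_upper_sketch("ϊ")` yields `"Ι"`. Because the diaeresis (U+0308) is in the removal set, the result never carries a diaeresis, which is where the diphthong handling falls short.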
Mozilla algorithm
A thread on a Mozilla bug report states:
Greek accented letters should be converted to the respective non-accented uppercase letters. The required conversions are the following (in Unicode):
ά -> Α
έ -> Ε
ή -> Η
ί -> Ι
ΐ -> Ϊ
ό -> Ο
ύ -> Υ
ΰ -> Ϋ
ώ -> Ω
Also diphthongs (two-vowel constructs) should be converted as follows, when the first vowel is accented:
άι -> ΑΪ
έι -> ΕΪ
όι -> ΟΪ
ύι -> ΥΪ
άυ -> ΑΫ
έυ -> ΕΫ
ήυ -> ΗΫ
όυ -> ΟΫ
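The two conversion lists above can be read as lookup tables in which diphthongs take priority over single vowels. The following is a minimal sketch of that reading, again in Python for illustration only. The names `SINGLE`, `DIPHTHONGS`, and `mozilla_upper_sketch` are hypothetical, and the tables cover only the lowercase forms listed in the thread.

```python
import unicodedata


def _nfc(s):
    return unicodedata.normalize("NFC", s)


# Single accented vowels, as listed in the thread.
SINGLE = {_nfc(k): _nfc(v) for k, v in {
    "ά": "Α", "έ": "Ε", "ή": "Η", "ί": "Ι", "ΐ": "Ϊ",
    "ό": "Ο", "ύ": "Υ", "ΰ": "Ϋ", "ώ": "Ω",
}.items()}

# Diphthongs whose first vowel is accented: the accent is dropped and
# the second vowel gains a diaeresis.
DIPHTHONGS = {_nfc(k): _nfc(v) for k, v in {
    "άι": "ΑΪ", "έι": "ΕΪ", "όι": "ΟΪ", "ύι": "ΥΪ",
    "άυ": "ΑΫ", "έυ": "ΕΫ", "ήυ": "ΗΫ", "όυ": "ΟΫ",
}.items()}


def mozilla_upper_sketch(text):
    """Upcase text, trying the diphthong table before the single-vowel table."""
    text = _nfc(text)  # so accented vowels are single code points
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIPHTHONGS:
            out.append(DIPHTHONGS[pair])
            i += 2
        elif text[i] in SINGLE:
            out.append(SINGLE[text[i]])
            i += 1
        else:
            out.append(text[i].upper())
            i += 1
    return "".join(out)
```

For example, `mozilla_upper_sketch("ρολόι")` gives `"ΡΟΛΟΪ"`: the accent on the first vowel of the diphthong is dropped and the following iota gains a diaeresis.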
That thread appears consistent with Mozilla's current guidance, which states the rules as:
In Greek (el), vowels lose their accent when the whole word is in uppercase (ά/Α), except for the disjunctive eta (ή/Ή). Also, diphthongs with an accent on the first vowel lose the accent and gain a diaeresis on the second vowel (άι/ΑΪ).
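The exception for the disjunctive eta can be sketched as a word-level check. This example assumes that "disjunctive eta" means ή standing alone as a word (the conjunction "or"). The names `upcase_with_disjunctive_eta` and `strip_tonos_upper` are hypothetical, and the fallback handles only the tonos (U+0301) to keep the example short. It is an illustration in Python, not this library's behaviour.

```python
import re
import unicodedata


def strip_tonos_upper(word):
    # Simplified fallback for ordinary words: drop the tonos, then upcase.
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(ch for ch in decomposed if ch != "\u0301")
    return unicodedata.normalize("NFC", kept.upper())


def upcase_with_disjunctive_eta(text, upcase_word=strip_tonos_upper):
    """Upcase word by word, keeping the accent on a standalone eta."""
    def convert(match):
        word = match.group(0)
        if word == "\u03ae":      # ή on its own
            return "\u0389"       # Ή keeps its accent
        return upcase_word(word)

    # Normalize first so each accented vowel is a single word character.
    return re.sub(r"\w+", convert, unicodedata.normalize("NFC", text))
```

Under these assumptions, `upcase_with_disjunctive_eta("ναι ή όχι")` returns `"ΝΑΙ Ή ΟΧΙ"`: the standalone ή keeps its accent while όχι loses it.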