Cldr v1.5.1 Cldr.Locale
Functions to parse and normalize locale names into a structured locale represented by a Cldr.LanguageTag.
CLDR represents localisation data organized into locales, with each locale being identified by a locale name that is formatted according to RFC5646.
In practice, the CLDR data utilizes a simple subset of locale name formats, being:
- a Language code such as en or fr
- a Language code and Territory code such as en-GB
- a Language code and Script such as zh-Hant
- and in only two cases a Language code, Territory code and Variant such as ca-ES-VALENCIA and en-US-POSIX
The RFC defines a language tag as:
A language tag is composed from a sequence of one or more “subtags”, each of which refines or narrows the range of language identified by the overall tag. Subtags, in turn, are a sequence of alphanumeric characters (letters and digits), distinguished and separated from other subtags in a tag by a hyphen (“-”, [Unicode] U+002D)
Therefore Cldr uses the hyphen (“-”, [Unicode] U+002D) as the subtag separator. On certain platforms, including POSIX platforms, the subtag separator is a “_” (underscore) rather than a “-” (hyphen). Where appropriate, Cldr will transliterate any underscore into a hyphen before parsing or processing.
Locale name validity
When validating a locale name, Cldr will attempt to match the requested locale name to a configured locale. Therefore Cldr.Locale.new/1 may return an {:ok, language_tag} tuple even when the locale returned does not exactly match the requested locale name. For example, the following attempts to create a locale matching the non-existent “English as spoken in Spain” locale name. Here Cldr will match to the nearest configured locale, which in this case will be “en”.
iex> Cldr.Locale.new("en-ES")
{:ok, %Cldr.LanguageTag{
canonical_locale_name: "en-Latn-ES",
cldr_locale_name: "en",
extensions: %{},
gettext_locale_name: "en",
language: "en",
locale: %{},
private_use: [],
rbnf_locale_name: "en",
requested_locale_name: "en-ES",
script: "Latn",
territory: "ES",
transform: %{},
variant: nil
}}
Matching locales to requested locale names
When attempting to match the requested locale name to a configured locale, Cldr attempts to match against a set of reductions in the following order and will return the first match:
- language, script, territory, variant
- language, territory, variant
- language, script, variant
- language, variant
- language, script, territory
- language, territory
- language, script
- language
- requested locale name
- nil
Therefore matching is tolerant of a request for unknown scripts, territories and variants. Only the requested language is required to match a configured locale.
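The reduction order above can be sketched in a few lines of plain Elixir. This is only an illustration, not Cldr's implementation: the module `LocaleMatchSketch`, the function `first_match/3` and the list of configured locale names are all hypothetical.

```elixir
defmodule LocaleMatchSketch do
  # Hypothetical helper, not part of Cldr. Builds candidate names in the
  # documented reduction order and returns the first one that is configured.
  # Returns nil when nothing matches, mirroring the last entry in the list.
  def first_match(requested, %{language: l, script: s, territory: t, variant: v}, configured) do
    [
      [l, s, t, v],
      [l, t, v],
      [l, s, v],
      [l, v],
      [l, s, t],
      [l, t],
      [l, s],
      [l]
    ]
    |> Enum.map(&join/1)
    |> Kernel.++([requested])
    |> Enum.find(&(&1 in configured))
  end

  # Missing (nil) subtags are simply skipped when building a candidate name.
  defp join(parts), do: parts |> Enum.reject(&is_nil/1) |> Enum.join("-")
end
```

With a hypothetical configuration of `["en", "en-GB", "fr"]`, a request for "en-ES" with script "Latn" and territory "ES" falls through the more specific candidates and settles on "en", while a language that is not configured at all yields `nil`.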
Substitutions for Obsolete and Deprecated locale names
CLDR provides data to help manage the transition from obsolete or deprecated locale names to current names. For example, the following requests the locale name “mo” which is the deprecated code for “Moldavian”. The replacement code is “ro” (Romanian).
iex> Cldr.Locale.new("mo")
{:ok,
%Cldr.LanguageTag{canonical_locale_name: "ro-Latn-MD",
cldr_locale_name: "ro-MD", extensions: %{}, language: "ro",
locale: %{}, private_use: [], rbnf_locale_name: "ro",
requested_locale_name: "mo", script: "Latn", territory: "MD",
transform: %{}, variant: nil}}
Likely subtags
CLDR also provides data to identify the most likely subtags for a requested locale name. This data is based on the default content data, the population data, and the suppress-script data in [BCP47]. It is heuristically derived, and may change over time. For example, when requesting the locale “en”, the following is returned:
iex> Cldr.Locale.new("en")
{:ok, %Cldr.LanguageTag{
canonical_locale_name: "en-Latn-US",
cldr_locale_name: "en",
extensions: %{},
gettext_locale_name: "en",
language: "en",
locale: %{},
private_use: [],
rbnf_locale_name: "en",
requested_locale_name: "en",
script: "Latn",
territory: "US",
transform: %{},
variant: nil
}}
This shows that the likely subtag for the script is “Latn” and the likely territory is “US”.
Using the example for Substitutions above, we can see the result of combining substitutions and likely subtags for locale name “mo” returns the current language code of “ro” as well as the likely territory code of “MD” (Moldova).
Unknown territory codes
Whilst Cldr is tolerant of invalid territory codes, it is also important that such invalid codes not shadow the potential replacement of deprecated codes nor the insertion of likely subtags. Therefore invalid territory codes are ignored during this process. For example, requesting the locale name “en-XX”, which specifies the invalid territory “XX”, returns the following:
iex> Cldr.Locale.new("en-XX")
{:ok, %Cldr.LanguageTag{
canonical_locale_name: "en-Latn-US",
cldr_locale_name: "en",
extensions: %{},
gettext_locale_name: "en",
language: "en",
locale: %{},
private_use: [],
rbnf_locale_name: "en",
requested_locale_name: "en",
script: "Latn",
territory: "US",
transform: %{},
variant: nil
}}
Summary
Types
The name of a locale in a string format
Functions
Replace empty subtags within a Cldr.LanguageTag.t with the most likely subtag
Returns an error tuple for an invalid locale alias
Return a map of the known aliases for Language, Script and Territory
Return a map of the aliases for a given alias key and type
Parses a locale name and returns a Cldr.LanguageTag struct that represents a locale
Parses a locale name and returns a Cldr.LanguageTag struct that represents a locale or raises on error
Returns an error tuple for an invalid gettext locale
Returns the map of likely subtags
Returns the likely subtags, as a Cldr.LanguageTag, for a given locale name
Returns an error tuple for an invalid locale
Return a locale name from a Cldr.LanguageTag
Return a locale name by combining language, script, territory and variant parameters
Normalize the casing of a locale name
Substitute deprecated subtags within a Cldr.LanguageTag with their non-deprecated alternatives
Types
The name of a locale in a string format
Functions
Replace empty subtags within a Cldr.LanguageTag.t with the most likely subtag.
Options
- language_tag is any language tag returned by Cldr.Locale.new/1
A subtag is called empty if it is a missing script or territory subtag, or it is a base language subtag with the value und. In the description below, a subscript on a subtag x indicates which tag it is from: x_s is in the source, x_m is in a match, and x_r is in the final result.
Lookup
Look up each of the following in order, and stop on the first match:
- language_s-script_s-region_s
- language_s-region_s
- language_s-script_s
- language_s
- und-script_s
Returns
If there is no match, either return
- an error value, or
- the match for und
Otherwise there is a match = language_m-script_m-region_m
Let x_r = x_s if x_s is not empty, and x_m otherwise.
Return the language tag composed of language_r-script_r-region_r + variants + extensions.
Example
iex> Cldr.Locale.add_likely_subtags Cldr.LanguageTag.parse!("zh-SG")
%Cldr.LanguageTag{
canonical_locale_name: nil,
cldr_locale_name: nil,
extensions: %{},
language: "zh",
gettext_locale_name: nil,
locale: %{},
private_use: [],
rbnf_locale_name: nil,
requested_locale_name: "zh-sg",
script: "Hans",
territory: "SG",
transform: %{},
variant: nil
}
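The lookup and merge rules described for this function can be sketched with a small stand-in table. Everything here is an assumption made for illustration: the module `LikelySubtagsSketch`, the function `add_likely/1` and the three table entries are hypothetical, and the real data comes from CLDR. The sketch treats a missing script in the final "und" lookup as a plain "und" lookup.

```elixir
defmodule LikelySubtagsSketch do
  # Hypothetical likely-subtags table; the real data is provided by CLDR.
  @likely %{
    "zh" => %{language: "zh", script: "Hans", territory: "CN"},
    "zh-TW" => %{language: "zh", script: "Hant", territory: "TW"},
    "und" => %{language: "en", script: "Latn", territory: "US"}
  }

  def add_likely(%{language: l, script: s, territory: t}) do
    # Keys are tried in the documented order; the first hit wins.
    keys = [[l, s, t], [l, t], [l, s], [l], ["und", s]]

    case Enum.find_value(keys, fn parts -> Map.get(@likely, join(parts)) end) do
      nil ->
        {:error, :no_match}

      match ->
        {:ok,
         %{
           language: pick(l, match.language),
           script: pick(s, match.script),
           territory: pick(t, match.territory)
         }}
    end
  end

  # A source subtag is kept unless it is empty (nil, or "und" for a language).
  defp pick(nil, matched), do: matched
  defp pick("und", matched), do: matched
  defp pick(source, _matched), do: source

  defp join(parts), do: parts |> Enum.reject(&is_nil/1) |> Enum.join("-")
end
```

For a source of language "zh" and territory "SG" with no script, the first hit in this table is the "zh" entry, so the script is filled in as "Hans" while the source territory "SG" is retained.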
alias_error(Locale.locale_name() | Cldr.LanguageTag.t(), String.t()) :: {Cldr.UnknownLocaleError, String.t()}
Returns an error tuple for an invalid locale alias.
Options
- locale_name is any locale name returned by Cldr.known_locale_names/0
Return a map of the known aliases for Language, Script and Territory
Return a map of the aliases for a given alias key and type
Options
- type is one of [:language, :region, :script, :variant, :zone]
- key is the substitution key (a language, region, script, variant or zone)
canonical_language_tag(locale_name() | Cldr.LanguageTag.t()) :: {:ok, Cldr.LanguageTag.t()} | {:error, {Cldr.InvalidLanguageTag, String.t()}}
Parses a locale name and returns a Cldr.LanguageTag struct that represents a locale.
Options
- language_tag is any language tag returned by Cldr.Locale.new/1
- locale is any valid locale name returned by Cldr.known_locale_names/0 or a Cldr.LanguageTag struct
Returns
- {:ok, language_tag} or
- {:error, reason}
Method
- The language tag is parsed in accordance with RFC5646
- Any language, script or region aliases are replaced. This will replace any obsolete elements with current versions
- If a territory or script is not specified, a default is provided using the CLDR information returned by Cldr.Locale.likely_subtags/1
- A Cldr locale name is selected that is the nearest fit to the requested locale.
Example
iex> Cldr.Locale.canonical_language_tag "en"
{
:ok,
%Cldr.LanguageTag{
canonical_locale_name: "en-Latn-US",
cldr_locale_name: "en",
extensions: %{},
gettext_locale_name: "en",
language: "en",
locale: %{},
private_use: [],
rbnf_locale_name: "en",
requested_locale_name: "en",
script: "Latn",
territory: "US",
transform: %{},
variant: nil
}
}
canonical_language_tag!(locale_name() | Cldr.LanguageTag.t()) :: Cldr.LanguageTag.t() | none()
Parses a locale name and returns a Cldr.LanguageTag struct that represents a locale or raises on error.
Options
- language_tag is any language tag returned by Cldr.Locale.new/1
- locale is any valid locale name returned by Cldr.known_locale_names/0 or a Cldr.LanguageTag struct
See Cldr.Locale.canonical_language_tag/1 for more information.
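The relationship between a tuple-returning function and its raising counterpart can be sketched as follows. This is a generic illustration under stated assumptions, not Cldr's code: the module `BangSketch`, its toy `parse/1` rule (any non-empty string is accepted) and the use of `ArgumentError` are hypothetical. The error shape `{:error, {exception_module, message}}` follows the spec shown for canonical_language_tag/1.

```elixir
defmodule BangSketch do
  # Hypothetical parser: accepts any non-empty binary, rejects everything else.
  def parse(name) when is_binary(name) and name != "" do
    {:ok, %{requested_locale_name: name}}
  end

  def parse(other) do
    {:error, {ArgumentError, "invalid locale name: #{inspect(other)}"}}
  end

  # The raising variant unwraps the :ok tuple or raises the exception
  # module carried in the error tuple with its message.
  def parse!(name) do
    case parse(name) do
      {:ok, tag} -> tag
      {:error, {exception, message}} -> raise exception, message
    end
  end
end
```

Carrying the exception module in the error tuple lets the raising variant stay a thin wrapper, which is why callers can choose between pattern matching on the result and letting the error propagate.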
gettext_locale_error(Locale.locale_name() | Cldr.LanguageTag.t()) :: {Cldr.UnknownLocaleError, String.t()}
Returns an error tuple for an invalid gettext locale.
Options
- locale_name is any locale name returned by Cldr.known_gettext_locale_names/0
Returns
- {Cldr.UnknownLocaleError, message}
Examples
iex> Cldr.Locale.gettext_locale_error :invalid
{Cldr.UnknownLocaleError, "The gettext locale :invalid is not known."}
Returns the map of likely subtags.
Note that not all locales are guaranteed to have likely subtags.
Example
Cldr.Locale.likely_subtags
%{
"bez" => %Cldr.LanguageTag{
canonical_locale_name: nil,
cldr_locale_name: nil,
extensions: %{},
language: "bez",
locale: %{},
private_use: [],
rbnf_locale_name: nil,
requested_locale_name: nil,
script: "Latn",
territory: "TZ",
transform: %{},
variant: nil
},
"fuf" => %Cldr.LanguageTag{
canonical_locale_name: nil,
cldr_locale_name: nil,
extensions: %{},
language: "fuf",
locale: %{},
private_use: [],
rbnf_locale_name: nil,
requested_locale_name: nil,
script: "Latn",
territory: "GN",
transform: %{},
variant: nil
},
...
likely_subtags(locale_name()) :: Cldr.LanguageTag.t()
Returns the likely subtags, as a Cldr.LanguageTag, for a given locale name.
Options
- locale is any valid locale name returned by Cldr.known_locale_names/0 or a Cldr.LanguageTag struct
Examples
iex> Cldr.Locale.likely_subtags "en"
%Cldr.LanguageTag{
canonical_locale_name: nil,
cldr_locale_name: nil,
extensions: %{},
gettext_locale_name: nil,
language: "en",
locale: %{},
private_use: [],
rbnf_locale_name: nil,
requested_locale_name: "en-latn-us",
script: "Latn",
territory: "US",
transform: %{},
variant: nil
}
locale_error(Locale.locale_name() | Cldr.LanguageTag.t()) :: {Cldr.UnknownLocaleError, String.t()}
Returns an error tuple for an invalid locale.
Options
- locale_name is any locale name returned by Cldr.known_locale_names/0
Returns
- {Cldr.UnknownLocaleError, message}
Examples
iex> Cldr.Locale.locale_error :invalid
{Cldr.UnknownLocaleError, "The locale :invalid is not known."}
locale_name_from(Cldr.LanguageTag.t()) :: Locale.locale_name()
Return a locale name from a Cldr.LanguageTag.
Options
- locale is any valid locale name returned by Cldr.known_locale_names/0 or a Cldr.LanguageTag struct
Example
iex> Cldr.Locale.locale_name_from Cldr.Locale.new!("en")
"en-Latn-US"
Return a locale name by combining language, script, territory and variant parameters.
Options
- language, script, territory and variant are string representations, or nil, of the language subtags
Returns
- The locale name constructed from the non-nil arguments joined by a “-”
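Joining only the non-nil arguments can be expressed compactly. The following is a minimal sketch of that behaviour as described, using a hypothetical module name `LocaleNameSketch`; it is not the library's source.

```elixir
defmodule LocaleNameSketch do
  # Hypothetical re-creation of the described behaviour: drop nil subtags,
  # then join what is left with a hyphen.
  def locale_name_from(language, script, territory, variant) do
    [language, script, territory, variant]
    |> Enum.reject(&is_nil/1)
    |> Enum.join("-")
  end
end
```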
Example
iex> Cldr.Locale.locale_name_from("en", "Latn", "001", nil)
"en-Latn-001"
normalize_locale_name(locale_name()) :: locale_name()
Normalize the casing of a locale name.
Options
- locale is any valid locale name returned by Cldr.known_locale_names/0 or a Cldr.LanguageTag struct
Returns
- The normalized locale name as a String.t
Method
Locale names are case insensitive but certain common casing is followed in practice:
- lower case for a language
- capital case for a script
- upper case for a region/territory
Note this function is intended to support only the CLDR locale names which have a format that is a subset of the full language tag specification.
For proper parsing of locale names and language tags, see Cldr.Locale.canonical_language_tag/1
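The casing rules above can be approximated with a short sketch. It rests on an assumption that is not stated in the documentation: a four-character subtag after the language is treated as a script, and every other trailing subtag is upper-cased as a territory or variant. The module `NormalizeSketch` is hypothetical and only handles the simple CLDR subset of locale names.

```elixir
defmodule NormalizeSketch do
  # Hypothetical normalizer for simple CLDR-style locale names only.
  def normalize(name) do
    name
    # Underscores are transliterated to hyphens before splitting.
    |> String.replace("_", "-")
    |> String.split("-")
    |> case_parts()
    |> Enum.join("-")
  end

  # The first subtag is the language: lower case.
  defp case_parts([language | rest]) do
    [String.downcase(language) | Enum.map(rest, &case_subtag/1)]
  end

  # Assumption: four characters means a script (capital case);
  # anything else is a territory or variant (upper case).
  defp case_subtag(subtag) do
    if String.length(subtag) == 4 do
      String.capitalize(subtag)
    else
      String.upcase(subtag)
    end
  end
end
```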
Examples
iex> Cldr.Locale.normalize_locale_name "zh_hant"
"zh-Hant"
iex> Cldr.Locale.normalize_locale_name "en_us"
"en-US"
iex> Cldr.Locale.normalize_locale_name "EN"
"en"
iex> Cldr.Locale.normalize_locale_name "ca_es_valencia"
"ca-ES-VALENCIA"
set_gettext_locale_name(Cldr.LanguageTag.t()) :: Cldr.LanguageTag.t()
Substitute deprecated subtags within a Cldr.LanguageTag with their non-deprecated alternatives.
Options
- language_tag is any language tag returned by Cldr.Locale.new/1
Method
- Replace any deprecated subtags with their canonical values using the alias data. Use the first value in the replacement list, if it exists. Language tag replacements may have multiple parts, such as sh ➞ sr_Latn or mo ➞ ro_MD. In such a case, the original script and/or region/territory are retained if there is one. Thus sh_Arab_AQ ➞ sr_Arab_AQ, not sr_Latn_AQ.
- Remove the script code ‘Zzzz’ and the territory code ‘ZZ’ if they occur.
- Get the components of the cleaned-up source tag (languages, scripts, and regions/territories), plus any variants and extensions.
Example
iex> Cldr.Locale.substitute_aliases Cldr.LanguageTag.Parser.parse!("mo")
%Cldr.LanguageTag{
canonical_locale_name: nil,
cldr_locale_name: nil,
extensions: %{},
gettext_locale_name: nil,
language: "ro",
locale: %{},
private_use: [],
rbnf_locale_name: nil,
requested_locale_name: "mo",
script: nil,
territory: "MD",
transform: %{},
variant: nil
}
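The substitution steps can be sketched with a two-entry alias table taken from the examples in the method description (sh ➞ sr_Latn and mo ➞ ro_MD). The module `AliasSketch`, the function `substitute/1` and the table layout are hypothetical, and plain maps stand in for Cldr.LanguageTag structs.

```elixir
defmodule AliasSketch do
  # Hypothetical alias table built from the two documented examples.
  @aliases %{
    "sh" => %{language: "sr", script: "Latn", territory: nil},
    "mo" => %{language: "ro", script: nil, territory: "MD"}
  }

  def substitute(%{language: language, script: script, territory: territory} = tag) do
    # Step 1: replace a deprecated language, keeping any script or
    # territory that the source tag already specified.
    replaced =
      case Map.get(@aliases, language) do
        nil ->
          tag

        r ->
          %{tag | language: r.language, script: script || r.script, territory: territory || r.territory}
      end

    # Step 2: remove the placeholder codes "Zzzz" and "ZZ" if present.
    %{replaced | script: drop(replaced.script, "Zzzz"), territory: drop(replaced.territory, "ZZ")}
  end

  # Returns nil when the value equals the placeholder, otherwise the value.
  defp drop(value, value), do: nil
  defp drop(value, _placeholder), do: value
end
```

In this sketch a source tag of sh with script "Arab" and territory "AQ" becomes sr with "Arab" and "AQ" retained, while a bare mo becomes ro with territory "MD" and no script, which agrees with the example output above.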