Cldr.Number.Parser (Cldr Numbers v2.32.0)
Functions for parsing numbers and currencies from a string.
Summary
Functions
Find a substring at the beginning and/or end of a string, and replace it.
Parse a string in a locale-aware manner and return a number.
Removes any whitespace strings from between tokens in a list.
Maps a list of terms (usually strings and atoms) calling a resolver function that operates on each binary term.
Resolve currencies from strings within a list.
Resolve a currency from the beginning and/or the end of a string.
Resolve and tokenize percent or permille from the beginning and/or the end of a string.
Resolve and tokenize percent and permille symbols from strings within a list.
Scans a string in a locale-aware manner and returns a list of strings and numbers.
Types
@type per() :: :percent | :permille
Functions
@spec find_and_replace(%{required(binary()) => term()}, binary(), float() | nil) :: {:ok, list()} | {:error, {module(), binary()}}
Find a substring at the beginning and/or end of a string, and replace it.
Ignore any whitespace found at the start or end of the string when looking for a match. A match is considered only if there is no alphabetic character adjacent to the match.
When multiple matches are found, the longest match is replaced.
Arguments
- string_map is a map where the keys are the strings to be matched and the values are the replacements.
- string is the string in which the find and replace operation takes place.
- fuzzy is a floating point number between 0.0 and 1.0 that is used to implement a fuzzy match using String.jaro_distance/2. The default is nil, which means the match is exact at the beginning and/or the end of the string.
Returns
{:ok, list}
where list isstring
broken into the replacement(s) and the remainder after find and replace. Or{:error, {exception, reason}}
will be returned if thefuzzy
parameter is invalid or if no search was found and no replacement made. In the later case,exception
will beCldr.Number.ParseError
.
Examples
iex> Cldr.Number.Parser.find_and_replace(%{"this" => "that"}, "This is a string")
{:ok, ["that", " is a string"]}
iex> Cldr.Number.Parser.find_and_replace(%{"string" => "term"}, "This is a string")
{:ok, ["This is a ", "term"]}
iex> Cldr.Number.Parser.find_and_replace(%{"string" => "term", "this" => "that"}, "This is a string")
{:ok, ["that", " is a ", "term"]}
iex> Cldr.Number.Parser.find_and_replace(%{"unknown" => "term"}, "This is a string")
{:error, {Cldr.Number.ParseError, "No match was found"}}
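The fuzzy argument is described above as a threshold applied to String.jaro_distance/2. The following is a minimal sketch of that idea only, not the library's implementation: the module FuzzySketch and the function close_enough?/3 are hypothetical names, and the guard on the threshold range is an assumption.

```elixir
defmodule FuzzySketch do
  # Hypothetical helper. A nil threshold means the strings must match exactly.
  def close_enough?(candidate, target, nil), do: candidate == target

  # Otherwise the Jaro distance must reach the threshold.
  # Assumption: the threshold is a float in the range (0.0, 1.0].
  def close_enough?(candidate, target, fuzzy)
      when is_float(fuzzy) and fuzzy > 0.0 and fuzzy <= 1.0 do
    String.jaro_distance(candidate, target) >= fuzzy
  end
end

FuzzySketch.close_enough?("dollars", "dollar", nil)
# returns false, because the strings are not identical

FuzzySketch.close_enough?("dollars", "dollar", 0.8)
# returns true, because the strings differ by a single trailing character
```

A lower threshold accepts more distant strings, which is why a looser setting can produce false positives.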
@spec parse(String.t(), Keyword.t()) :: {:ok, integer() | float() | Decimal.t()} | {:error, {module(), String.t()}}
Parse a string in a locale-aware manner and return a number.
Arguments
- string is any String.t
- options is a keyword list of options
Options
- :number is one of :integer, :float, :decimal or nil. The default is nil, meaning that the type is auto-detected as either an integer or a float.
- :backend is any module that includes use Cldr and is therefore a CLDR backend module. The default is Cldr.default_backend!/0.
- :locale is any locale returned by Cldr.known_locale_names/1 or a Cldr.LanguageTag.t. The default is options[:backend].get_locale/1.
Returns
A number of the requested or default type, or {:error, {exception, message}} if no number could be determined.
Notes
This function parses a string to return a number but in a locale-aware manner. It will normalise digits, grouping characters and decimal separators.
It will transliterate digits that are in the number system of the specific locale. For example, if the locale is th (Thailand), then Thai digits are transliterated to the Latin script before parsing.
Some number systems do not have decimal digits and in this case an error will be returned, rather than continue parsing and return misleading results.
It also caters for different forms of the + and - symbols that appear in Unicode and strips any _ characters that might be used for formatting in a string.
It then parses the number using the Elixir standard library functions.
If the option :number is used and the parsed number cannot be coerced to this type without losing precision then an error is returned.
Examples
iex> Cldr.Number.Parser.parse("+1.000,34", locale: "de")
{:ok, 1000.34}
iex> Cldr.Number.Parser.parse("-1_000_000.34")
{:ok, -1000000.34}
iex> Cldr.Number.Parser.parse("1.000", locale: "de", number: :integer)
{:ok, 1000}
iex> Cldr.Number.Parser.parse "١٢٣٤٥", locale: "ar"
{:ok, 12345}
# 1_000.34 cannot be coerced into an integer
# without precision loss so an error is returned.
iex> Cldr.Number.Parser.parse("+1.000,34", locale: "de", number: :integer)
{:error,
{Cldr.Number.ParseError,
"The string \"+1.000,34\" could not be parsed as a number"}}
iex> Cldr.Number.Parser.parse "一万二千三百四十五", locale: "ja-u-nu-jpan"
{:error, {Cldr.UnknownNumberSystemError, "The number system :jpan does not have digits"}}
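The normalisation steps in the notes above can be illustrated with a much simplified sketch. This is not how the library is implemented: NormaliseSketch, parse_de/1 and to_integer/1 are hypothetical names, the separators are hard-coded for the "de" examples rather than taken from locale data, and digit transliteration is left out entirely.

```elixir
defmodule NormaliseSketch do
  # Hypothetical, simplified stand-in for the normalisation described above.
  # Assumption: "." groups digits and "," is the decimal separator, as in
  # the "de" examples. The real parser derives these from locale data.
  def parse_de(string) when is_binary(string) do
    normalised =
      string
      # strip formatting underscores
      |> String.replace("_", "")
      # map the Unicode MINUS SIGN (U+2212) to the ASCII hyphen-minus
      |> String.replace("\u2212", "-")
      # drop grouping separators
      |> String.replace(".", "")
      # use "." as the decimal separator expected by Float.parse/1
      |> String.replace(",", ".")

    case Float.parse(normalised) do
      {float, ""} -> {:ok, float}
      _other -> {:error, "could not parse #{inspect(string)} as a number"}
    end
  end

  # Coerce to an integer only when no precision would be lost.
  def to_integer(float) when is_float(float) do
    if Float.floor(float) == float do
      {:ok, trunc(float)}
    else
      {:error, "#{float} cannot be coerced to an integer without losing precision"}
    end
  end
end

{:ok, number} = NormaliseSketch.parse_de("+1.000,34")
NormaliseSketch.to_integer(number)
# returns an {:error, message} tuple because 1000.34 has a fractional part
```

Returning an error instead of truncating mirrors the behaviour described for the :number option: a lossy coercion is reported rather than silently applied.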
Removes any whitespace strings from between tokens in a list.
Tokens are numbers or atoms.
Maps a list of terms (usually strings and atoms) calling a resolver function that operates on each binary term.
If the resolver function returns {:error, term} then no change is made to the term, otherwise the return value of the resolver replaces the original term.
Arguments
- list is a list of terms. Typically this is the result of calling Cldr.Number.Parser.scan/1.
- resolver is a function that takes two arguments. The first is one of the terms in the list. The second is options.
- options is a keyword list of options that is passed to the resolver function.
Note
- The resolver is called only on binary elements of the list.
Returns
list as modified through the application of the resolver function on each binary term.
Examples
See Cldr.Number.Parser.resolve_currencies/2 and Cldr.Number.Parser.resolve_pers/2 which both use this function.
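The mapping behaviour described above (call the resolver on binary terms only, and keep the original term when it returns an error tuple) might be sketched as follows. ResolveSketch and the toy resolver are hypothetical and written only for illustration; the sketch does not attempt to model how the library merges list results returned by a resolver.

```elixir
defmodule ResolveSketch do
  # Hypothetical re-statement of the documented behaviour:
  # the resolver is called on binaries only, and an {:error, _} result
  # leaves the original term unchanged.
  def resolve(list, resolver, options)
      when is_list(list) and is_function(resolver, 2) do
    Enum.map(list, fn
      term when is_binary(term) ->
        case resolver.(term, options) do
          {:error, _reason} -> term
          resolved -> resolved
        end

      other ->
        other
    end)
  end
end

# A toy resolver (an assumption for illustration only).
unit_resolver = fn
  "kg", _options -> :kilogram
  _other, _options -> {:error, :unknown}
end

ResolveSketch.resolve([1, "kg", " of ", :percent], unit_resolver, [])
# returns [1, :kilogram, " of ", :percent]
```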
@spec resolve_currencies([String.t(), ...], Keyword.t()) :: [ Cldr.Currency.code() | String.t() ]
Resolve currencies from strings within a list.
Currencies can be identified at the beginning and/or the end of a string.
Arguments
- list is any list in which currency names and symbols are expected
- options is a keyword list of options
Options
- :backend is any module() that includes use Cldr and therefore is a Cldr backend module(). The default is Cldr.default_backend!/0.
- :locale is any valid locale returned by Cldr.known_locale_names/1 or a Cldr.LanguageTag struct returned by Cldr.Locale.new!/2. The default is options[:backend].get_locale().
- :only is an atom or list of atoms representing the currencies or currency types to be considered for a match. This equates to a list of acceptable currencies for parsing. See the notes below for currency types.
- :except is an atom or list of atoms representing the currencies or currency types not to be considered for a match. This equates to a list of unacceptable currencies for parsing. See the notes below for currency types.
- :fuzzy is a float greater than 0.0 and less than or equal to 1.0 which is used as input to String.jaro_distance/2 to determine whether the provided currency string is close enough to a known currency string for it to definitively identify a currency code. It is recommended to use numbers greater than 0.8 in order to reduce false positives.
Returns
An ISO4217 currency code as an atom or {:error, {exception, message}}
Notes
The :only and :except options accept a list of currency codes and/or currency types.
If both :only and :except are specified, the :except entries take priority; that is, any entries in :except are removed from the :only entries.
The following currency types are recognised:
- :all, the default, considers all currencies
- :current considers those currencies that have a :to date of nil and that are also known ISO4217 currencies
- :historic is the opposite of :current
- :tender considers currencies that are legal tender
- :unannotated considers currencies that don't have "(some string)" in their names. Currencies with such annotations are usually financial instruments.
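The priority rule for :only and :except in the notes above amounts to a list difference. The sketch below shows only that rule, under the assumption that both options have already been expanded to plain currency codes; FilterSketch and allowed/2 are hypothetical names, not part of the library.

```elixir
defmodule FilterSketch do
  # Hypothetical helper: entries in :except are removed from :only.
  # List.wrap/1 lets either argument be a single atom or a list of atoms.
  def allowed(only, except) do
    except = List.wrap(except)

    only
    |> List.wrap()
    |> Enum.reject(&(&1 in except))
  end
end

FilterSketch.allowed([:USD, :EUR, :AUD], :EUR)
# returns [:USD, :AUD]
```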
Examples
iex> Cldr.Number.Parser.scan("100 US dollars")
...> |> Cldr.Number.Parser.resolve_currencies
[100, :USD]
iex> Cldr.Number.Parser.scan("100 eurosports")
...> |> Cldr.Number.Parser.resolve_currencies(fuzzy: 0.8)
[100, :EUR]
iex> Cldr.Number.Parser.scan("100 dollars des États-Unis")
...> |> Cldr.Number.Parser.resolve_currencies(locale: "fr")
[100, :USD]
@spec resolve_currency(String.t(), Keyword.t()) :: Cldr.Currency.code() | [Cldr.Currency.code() | String.t()] | {:error, {module(), String.t()}}
Resolve a currency from the beginning and/or the end of a string.
Arguments
- string is any string in which a currency name or symbol is expected
- options is a keyword list of options
Options
- :backend is any module() that includes use Cldr and therefore is a Cldr backend module(). The default is Cldr.default_backend!/0.
- :locale is any valid locale returned by Cldr.known_locale_names/1 or a Cldr.LanguageTag struct returned by Cldr.Locale.new!/2. The default is options[:backend].get_locale().
- :only is an atom or list of atoms representing the currencies or currency types to be considered for a match. This equates to a list of acceptable currencies for parsing. See the notes below for currency types.
- :except is an atom or list of atoms representing the currencies or currency types not to be considered for a match. This equates to a list of unacceptable currencies for parsing. See the notes below for currency types.
- :fuzzy is a float greater than 0.0 and less than or equal to 1.0 which is used as input to String.jaro_distance/2 to determine whether the provided currency string is close enough to a known currency string for it to definitively identify a currency code. It is recommended to use numbers greater than 0.8 in order to reduce false positives.
Returns
An ISO4217 currency code as an atom or {:error, {exception, message}}
Notes
The :only and :except options accept a list of currency codes and/or currency types.
If both :only and :except are specified, the :except entries take priority; that is, any entries in :except are removed from the :only entries.
The following currency types are recognised:
- :all, the default, considers all currencies
- :current considers those currencies that have a :to date of nil and that are also known ISO4217 currencies
- :historic is the opposite of :current
- :tender considers currencies that are legal tender
- :unannotated considers currencies that don't have "(some string)" in their names. Currencies with such annotations are usually financial instruments.
Examples
iex> Cldr.Number.Parser.resolve_currency("US dollars")
[:USD]
iex> Cldr.Number.Parser.resolve_currency("100 eurosports", fuzzy: 0.75)
[:EUR]
iex> Cldr.Number.Parser.resolve_currency("dollars des États-Unis", locale: "fr")
[:USD]
iex> Cldr.Number.Parser.resolve_currency("not a known currency", locale: "fr")
{:error,
{Cldr.UnknownCurrencyError,
"The currency \"not a known currency\" is unknown or not supported"}}
@spec resolve_per(String.t(), Keyword.t()) :: per() | [per() | String.t()] | {:error, {module(), String.t()}}
Resolve and tokenize percent or permille from the beginning and/or the end of a string.
Arguments
- string is any string in which percent and permille symbols are expected
- options is a keyword list of options
Options
- :backend is any module() that includes use Cldr and therefore is a Cldr backend module(). The default is Cldr.default_backend!/0.
- :locale is any valid locale returned by Cldr.known_locale_names/1 or a Cldr.LanguageTag struct returned by Cldr.Locale.new!/2. The default is options[:backend].get_locale().
Returns
Either :percent or :permille, or {:error, {exception, message}}
Examples
iex> Cldr.Number.Parser.resolve_per "11%"
["11", :percent]
iex> Cldr.Number.Parser.resolve_per "% of linguists"
[:percent, " of linguists"]
iex> Cldr.Number.Parser.resolve_per "% of linguists %"
[:percent, " of linguists ", :percent]
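The tokenizing shown in these examples can be approximated with a short sketch. It is a hypothetical illustration, not the library's code: PerSketch and tokenize/1 are invented names, and only the literal symbols "%" and "‰" are recognised, whereas the real function is locale-aware.

```elixir
defmodule PerSketch do
  # Assumption: only these two literal symbols are recognised.
  @symbols [{"%", :percent}, {"‰", :permille}]

  # Split a recognised symbol off the start and the end of the string and
  # return the tokens in order, dropping any empty remainder.
  def tokenize(string) when is_binary(string) do
    {leading, rest} = take_leading(string)
    {rest, trailing} = take_trailing(rest)
    Enum.reject(leading ++ [rest] ++ trailing, &(&1 == ""))
  end

  defp take_leading(string) do
    Enum.find_value(@symbols, {[], string}, fn {symbol, token} ->
      if String.starts_with?(string, symbol) do
        {[token], String.replace_prefix(string, symbol, "")}
      end
    end)
  end

  defp take_trailing(string) do
    Enum.find_value(@symbols, {string, []}, fn {symbol, token} ->
      if String.ends_with?(string, symbol) do
        {String.replace_suffix(string, symbol, ""), [token]}
      end
    end)
  end
end

PerSketch.tokenize("% of linguists %")
# returns [:percent, " of linguists ", :percent]
```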
Resolve and tokenize percent and permille symbols from strings within a list.
Percent and permille symbols can be identified at the beginning and/or the end of a string.
Arguments
- list is any list in which percent and permille symbols are expected
- options is a keyword list of options
Options
- :backend is any module() that includes use Cldr and therefore is a Cldr backend module(). The default is Cldr.default_backend!/0.
- :locale is any valid locale returned by Cldr.known_locale_names/1 or a Cldr.LanguageTag struct returned by Cldr.Locale.new!/2. The default is options[:backend].get_locale().
Examples
iex> Cldr.Number.Parser.scan("100%")
...> |> Cldr.Number.Parser.resolve_pers()
[100, :percent]
@spec scan(String.t(), Keyword.t()) :: [String.t() | integer() | float() | Decimal.t()] | {:error, {module(), String.t()}}
Scans a string in a locale-aware manner and returns a list of strings and numbers.
Arguments
- string is any String.t
- options is a keyword list of options
Options
- :number is one of :integer, :float, :decimal or nil. The default is nil, meaning that the type is auto-detected as either an integer or a float.
- :backend is any module that includes use Cldr and is therefore a CLDR backend module. The default is Cldr.default_backend!/0.
- :locale is any locale returned by Cldr.known_locale_names/1 or a Cldr.LanguageTag. The default is options[:backend].get_locale/1.
Returns
- A list of strings and numbers
Notes
Number parsing is performed by Cldr.Number.Parser.parse/2 and any options provided are passed to that function.
Examples
iex> Cldr.Number.Parser.scan("£1_000_000.34")
["£", 1000000.34]
iex> Cldr.Number.Parser.scan("I want £1_000_000 dollars")
["I want £", 1000000, " dollars"]
iex> Cldr.Number.Parser.scan("The prize is 23")
["The prize is ", 23]
iex> Cldr.Number.Parser.scan("The lottery number is 23 for the next draw")
["The lottery number is ", 23, " for the next draw"]
iex> Cldr.Number.Parser.scan("The loss is -1.000 euros", locale: "de", number: :integer)
["The loss is ", -1000, " euros"]
iex> Cldr.Number.Parser.scan "1kg"
[1, "kg"]
iex> Cldr.Number.Parser.scan "A number is the arab script ١٢٣٤٥", locale: "ar"
["A number is the arab script ", 12345]
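Because the result is a plain list that mixes strings and numbers, it can be consumed with ordinary Enum functions. The sketch below works on a literal list shaped like the results above rather than calling the library; ScanResultSketch and numbers/1 are hypothetical names, and Decimal values are assumed not to be present.

```elixir
defmodule ScanResultSketch do
  # Hypothetical helper for lists shaped like the scan/2 results above.
  # An error tuple is passed through unchanged.
  def numbers({:error, _reason} = error), do: error

  # Assumption: numbers are integers or floats. Decimal structs would need
  # their own handling because is_number/1 returns false for them.
  def numbers(tokens) when is_list(tokens) do
    Enum.filter(tokens, &is_number/1)
  end
end

ScanResultSketch.numbers(["The lottery number is ", 23, " for the next draw"])
# returns [23]
```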