Cldr.Number.Parser (Cldr Numbers v2.32.0)

Functions for parsing numbers and currencies from a string.

Summary

Functions

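find_and_replace(string_map, string, fuzzy \\ nil)

Find a substring at the beginning and/or end of a string, and replace it.

parse(string, options \\ [])

Parse a string in a locale-aware manner and return a number.

remove_whitespace_between_tokens(first)

Removes any whitespace strings from between tokens in a list.

resolve(list, resolver, options)

Maps a list of terms (usually strings and atoms) calling a resolver function that operates on each binary term.

resolve_currencies(list, options \\ [])

Resolve currencies from strings within a list.

resolve_currency(string, options \\ [])

Resolve a currency from the beginning and/or the end of a string.

resolve_per(string, options \\ [])

Resolve and tokenize percent or permille from the beginning and/or the end of a string.

resolve_pers(list, options \\ [])

Resolve and tokenize percent and permille symbols from strings within a list.

scan(string, options \\ [])

Scans a string in a locale-aware manner and returns a list of strings and numbers.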

Types

@type per() :: :percent | :permille

Functions


find_and_replace(string_map, string, fuzzy \\ nil)

(since 2.22.0)
@spec find_and_replace(%{required(binary()) => term()}, binary(), float() | nil) ::
  {:ok, list()} | {:error, {module(), binary()}}

Find a substring at the beginning and/or end of a string, and replace it.

Ignore any whitespace found at the start or end of the string when looking for a match. A match is considered only if there is no alphabetic character adjacent to the match.

When multiple matches are found, the longest match is replaced.

Arguments

  • string_map is a map where the keys are the strings to be matched and the values are the replacement.

  • string is the string in which the find and replace operation takes place.

  • fuzzy is a floating point number between 0.0 and 1.0 that is used to implement a fuzzy match using String.jaro_distance/2. The default is nil, which means the match must be exact at the beginning and/or the end of the string.

Returns

  • {:ok, list} where list is the string broken into the replacement(s) and the remainder after find and replace, or

  • {:error, {exception, reason}} if the fuzzy parameter is invalid or if no match was found and no replacement was made. In the latter case, exception will be Cldr.Number.ParseError.

Examples

iex> Cldr.Number.Parser.find_and_replace(%{"this" => "that"}, "This is a string")
{:ok, ["that", " is a string"]}

iex> Cldr.Number.Parser.find_and_replace(%{"string" => "term"}, "This is a string")
{:ok, ["This is a ", "term"]}

iex> Cldr.Number.Parser.find_and_replace(%{"string" => "term", "this" => "that"}, "This is a string")
{:ok, ["that", " is a ", "term"]}

iex> Cldr.Number.Parser.find_and_replace(%{"unknown" => "term"}, "This is a string")
{:error, {Cldr.Number.ParseError, "No match was found"}}
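
The fuzzy argument is described as a threshold applied to String.jaro_distance/2. The following is a minimal sketch of how such a threshold could accept or reject a candidate. FuzzySketch and best_match/3 are hypothetical names used only for illustration; this is not the library's implementation, and the case-folding step is an assumption.

```elixir
defmodule FuzzySketch do
  # Hypothetical helper, not part of Cldr.Number.Parser.
  # Returns the candidate most similar to `input` when its Jaro similarity
  # is at least `threshold`; returns nil when no candidate qualifies.
  def best_match(candidates, input, threshold) do
    # Assumption: comparison is case-insensitive.
    input = String.downcase(input)

    scored =
      candidates
      |> Enum.map(fn candidate ->
        {candidate, String.jaro_distance(String.downcase(candidate), input)}
      end)
      |> Enum.filter(fn {_candidate, score} -> score >= threshold end)

    case scored do
      [] -> nil
      _ -> scored |> Enum.max_by(fn {_candidate, score} -> score end) |> elem(0)
    end
  end
end

# "Euros" is close enough to "euro" to pass a 0.8 threshold.
FuzzySketch.best_match(["euro", "dollar"], "Euros", 0.8)

# "yen" is not close to either candidate, so nil is returned.
FuzzySketch.best_match(["euro", "dollar"], "yen", 0.8)
```

A higher threshold makes the match stricter; a threshold of 1.0 accepts only strings that are identical after case folding.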

parse(string, options \\ [])

@spec parse(String.t(), Keyword.t()) ::
  {:ok, integer() | float() | Decimal.t()} | {:error, {module(), String.t()}}

Parse a string in a locale-aware manner and return a number.

Arguments

  • string is any String.t()

  • options is a keyword list of options

Options

  • :number is one of :integer, :float, :decimal or nil. The default is nil, meaning that the type is auto-detected as either an integer or a float.

  • :backend is any module that includes use Cldr and is therefore a CLDR backend module. The default is Cldr.default_backend!/0.

  • :locale is any locale returned by Cldr.known_locale_names/1 or a Cldr.LanguageTag.t(). The default is options[:backend].get_locale/0.

Returns

  • {:ok, number} where number is of the requested or default type, or

  • {:error, {exception, message}} if no number could be determined

Notes

This function parses a string to return a number but in a locale-aware manner. It will normalise digits, grouping characters and decimal separators.

It will transliterate digits that are in the number system of the specific locale. For example, if the locale is th (Thailand), then Thai digits are transliterated to the Latin script before parsing.

Some number systems do not have decimal digits. In this case an error is returned rather than continuing to parse and returning misleading results.

It also caters for different forms of the + and - symbols that appear in Unicode and strips any _ characters that might be used for formatting in a string.

It then parses the number using the Elixir standard library functions.

If the option :number is used and the parsed number cannot be coerced to this type without losing precision then an error is returned.
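
The normalisation steps above can be illustrated with a simplified sketch that uses only the Elixir standard library. NormaliseSketch and its parse/3 function are hypothetical; the grouping and decimal separators are passed in explicitly here as an assumption, whereas the library derives them from the locale. Digit transliteration and Unicode sign variants are not covered.

```elixir
defmodule NormaliseSketch do
  # Hypothetical helper, not the library's implementation.
  # `group` and `decimal` are the grouping and decimal separators of the
  # locale, supplied by the caller in this sketch.
  def parse(string, group, decimal) do
    normalised =
      string
      |> String.replace("_", "")
      # Remove grouping separators first, then map the decimal separator to ".".
      |> String.replace(group, "")
      |> String.replace(decimal, ".")

    case Integer.parse(normalised) do
      {integer, ""} ->
        {:ok, integer}

      _ ->
        case Float.parse(normalised) do
          {float, ""} -> {:ok, float}
          _ -> {:error, "The string #{inspect(string)} could not be parsed as a number"}
        end
    end
  end
end

# German-style separators: "." groups digits and "," marks the decimal.
NormaliseSketch.parse("+1.000,34", ".", ",")

# English-style separators with "_" used for formatting.
NormaliseSketch.parse("-1_000_000.34", ",", ".")
```

Trying Integer.parse/1 before Float.parse/1 mirrors the documented default of auto-detecting either an integer or a float.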

Examples

iex> Cldr.Number.Parser.parse("+1.000,34", locale: "de")
{:ok, 1000.34}

iex> Cldr.Number.Parser.parse("-1_000_000.34")
{:ok, -1000000.34}

iex> Cldr.Number.Parser.parse("1.000", locale: "de", number: :integer)
{:ok, 1000}

iex> Cldr.Number.Parser.parse "١٢٣٤٥", locale: "ar"
{:ok, 12345}

# 1_000.34 cannot be coerced into an integer
# without precision loss so an error is returned.
iex> Cldr.Number.Parser.parse("+1.000,34", locale: "de", number: :integer)
{:error,
  {Cldr.Number.ParseError,
   "The string \"+1.000,34\" could not be parsed as a number"}}

iex> Cldr.Number.Parser.parse "一万二千三百四十五", locale: "ja-u-nu-jpan"
{:error, {Cldr.UnknownNumberSystemError, "The number system :jpan does not have digits"}}

remove_whitespace_between_tokens(first)


Removes any whitespace strings from between tokens in a list.

Tokens are numbers or atoms.
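
No example is given for this function, so the following sketch shows one plausible reading of the description. WhitespaceSketch is a hypothetical module, and the assumption is that a whitespace-only string is dropped only when it sits directly between two tokens. The library's actual behaviour may differ in edge cases.

```elixir
defmodule WhitespaceSketch do
  # Hypothetical helper, not the library's implementation.
  # A token is a number or an atom, as stated in the description above.
  defguardp is_token(term) when is_number(term) or is_atom(term)

  # A binary between two tokens is dropped when it contains only whitespace.
  def remove([first, second, third | rest])
      when is_token(first) and is_binary(second) and is_token(third) do
    if String.trim(second) == "" do
      [first | remove([third | rest])]
    else
      [first, second | remove([third | rest])]
    end
  end

  # Any other element is kept and the rest of the list is processed.
  def remove([first | rest]), do: [first | remove(rest)]
  def remove([]), do: []
end

# The whitespace between 100 and :USD is removed.
WhitespaceSketch.remove([100, " ", :USD])

# Non-whitespace text between tokens is preserved.
WhitespaceSketch.remove([100, " and ", :USD])
```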


resolve(list, resolver, options)

@spec resolve([any()], (... -> any()), Keyword.t()) :: list()

Maps a list of terms (usually strings and atoms) calling a resolver function that operates on each binary term.

If the resolver function returns {:error, term} then no change is made to the term, otherwise the return value of the resolver replaces the original term.

Arguments

  • list is a list of terms. Typically this is the result of calling Cldr.Number.Parser.scan/1.

  • resolver is a function that takes two arguments. The first is one of the terms in the list. The second is options.

  • options is a keyword list of options that is passed to the resolver function.

Note

  • The resolver is called only on binary elements of the list.

Returns

  • list as modified through the application of the resolver function on each binary term.

Examples

See Cldr.Number.Parser.resolve_currencies/2 and Cldr.Number.Parser.resolve_pers/2 which both use this function.
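
To make the resolver contract concrete, here is a sketch that restates the documented behaviour with the standard library only. ResolveSketch, unit_resolver and the :units option are all hypothetical and exist only for this illustration; the sketch is not the library's code.

```elixir
defmodule ResolveSketch do
  # Hypothetical re-statement of the documented behaviour.
  # The resolver is applied only to binary terms. When it returns
  # {:error, reason} the original term is kept; otherwise the return
  # value replaces the term.
  def resolve(list, resolver, options) do
    Enum.map(list, fn
      term when is_binary(term) ->
        case resolver.(term, options) do
          {:error, _reason} -> term
          resolved -> resolved
        end

      term ->
        term
    end)
  end
end

# A hypothetical resolver that maps unit strings to atoms using a
# lookup table passed through the options.
unit_resolver = fn string, options ->
  units = Keyword.get(options, :units, %{})

  case Map.fetch(units, String.trim(string)) do
    {:ok, unit} -> unit
    :error -> {:error, "unknown unit"}
  end
end

# "kg" is resolved to :kilogram; the number and the unknown string are untouched.
ResolveSketch.resolve([1, "kg", " and more"], unit_resolver, units: %{"kg" => :kilogram})
```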


resolve_currencies(list, options \\ [])

@spec resolve_currencies([String.t(), ...], Keyword.t()) :: [
  Cldr.Currency.code() | String.t()
]

Resolve currencies from strings within a list.

Currencies can be identified at the beginning and/or the end of a string.

Arguments

  • list is any list in which currency names and symbols are expected

  • options is a keyword list of options

Options

  • :backend is any module() that includes use Cldr and therefore is a Cldr backend module(). The default is Cldr.default_backend!/0.

  • :locale is any valid locale returned by Cldr.known_locale_names/1 or a Cldr.LanguageTag struct returned by Cldr.Locale.new!/2. The default is options[:backend].get_locale().

  • :only is an atom or list of atoms representing the currencies or currency types to be considered for a match. This equates to a list of acceptable currencies for parsing. See the notes below for currency types.

  • :except is an atom or list of atoms representing the currencies or currency types to be excluded from a match. This equates to a list of unacceptable currencies for parsing. See the notes below for currency types.

  • :fuzzy is a float greater than 0.0 and less than or equal to 1.0 which is used as input to String.jaro_distance/2 to determine if the provided currency string is close enough to a known currency string to definitively identify a currency code. It is recommended to use numbers greater than 0.8 in order to reduce false positives.

Returns

  • A list in which each string resolved as a currency is replaced by its ISO 4217 currency code as an atom. Strings that cannot be resolved are left unchanged.

Notes

The :only and :except options accept a list of currency codes and/or currency types. If both :only and :except are specified, the :except entries take priority, meaning that any entries in :except are removed from the :only entries.

The following currency types are recognised:

  • :all, the default, considers all currencies

  • :current considers those currencies that have a :to date of nil and which are also known ISO 4217 currencies

  • :historic is the opposite of :current

  • :tender considers currencies that are legal tender

  • :unannotated considers currencies that don't have "(some string)" in their names. These are usually financial instruments.
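
The priority rule for :only and :except amounts to a set difference. The sketch below illustrates that rule only. CurrencyFilterSketch is a hypothetical module, and it assumes that any currency types have already been expanded into currency codes, a step this sketch does not attempt.

```elixir
defmodule CurrencyFilterSketch do
  # Hypothetical helper, not part of Cldr.Number.Parser.
  # Both arguments may be a single atom or a list of atoms, as the options
  # allow. Currency types such as :current are assumed to have been expanded
  # into currency codes before this function is called.
  def allowed(only, except) do
    except = List.wrap(except)

    only
    |> List.wrap()
    |> Enum.reject(fn code -> code in except end)
  end
end

# :EUR appears in both lists, so :except wins and it is removed.
CurrencyFilterSketch.allowed([:USD, :EUR, :AUD], :EUR)
```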

Examples

iex> Cldr.Number.Parser.scan("100 US dollars")
...> |> Cldr.Number.Parser.resolve_currencies
[100, :USD]

iex> Cldr.Number.Parser.scan("100 eurosports")
...> |> Cldr.Number.Parser.resolve_currencies(fuzzy: 0.8)
[100, :EUR]

iex> Cldr.Number.Parser.scan("100 dollars des États-Unis")
...> |> Cldr.Number.Parser.resolve_currencies(locale: "fr")
[100, :USD]

resolve_currency(string, options \\ [])

@spec resolve_currency(String.t(), Keyword.t()) ::
  Cldr.Currency.code()
  | [Cldr.Currency.code() | String.t()]
  | {:error, {module(), String.t()}}

Resolve a currency from the beginning and/or the end of a string.

Arguments

  • string is any string in which currency names and symbols are expected

  • options is a keyword list of options

Options

  • :backend is any module() that includes use Cldr and therefore is a Cldr backend module(). The default is Cldr.default_backend!/0.

  • :locale is any valid locale returned by Cldr.known_locale_names/1 or a Cldr.LanguageTag struct returned by Cldr.Locale.new!/2. The default is options[:backend].get_locale().

  • :only is an atom or list of atoms representing the currencies or currency types to be considered for a match. This equates to a list of acceptable currencies for parsing. See the notes below for currency types.

  • :except is an atom or list of atoms representing the currencies or currency types to be excluded from a match. This equates to a list of unacceptable currencies for parsing. See the notes below for currency types.

  • :fuzzy is a float greater than 0.0 and less than or equal to 1.0 which is used as input to String.jaro_distance/2 to determine if the provided currency string is close enough to a known currency string to definitively identify a currency code. It is recommended to use numbers greater than 0.8 in order to reduce false positives.

Returns

  • An ISO 4217 currency code as an atom, or a list of currency codes and remaining strings, or

  • {:error, {exception, message}}

Notes

The :only and :except options accept a list of currency codes and/or currency types. If both :only and :except are specified, the :except entries take priority, meaning that any entries in :except are removed from the :only entries.

The following currency types are recognised:

  • :all, the default, considers all currencies

  • :current considers those currencies that have a :to date of nil and which are also known ISO 4217 currencies

  • :historic is the opposite of :current

  • :tender considers currencies that are legal tender

  • :unannotated considers currencies that don't have "(some string)" in their names. These are usually financial instruments.

Examples

iex> Cldr.Number.Parser.resolve_currency("US dollars")
[:USD]

iex> Cldr.Number.Parser.resolve_currency("100 eurosports", fuzzy: 0.75)
[:EUR]

iex> Cldr.Number.Parser.resolve_currency("dollars des États-Unis", locale: "fr")
[:USD]

iex> Cldr.Number.Parser.resolve_currency("not a known currency", locale: "fr")
{:error,
 {Cldr.UnknownCurrencyError,
  "The currency \"not a known currency\" is unknown or not supported"}}

resolve_per(string, options \\ [])

(since 2.21.0)
@spec resolve_per(String.t(), Keyword.t()) ::
  per() | [per() | String.t()] | {:error, {module(), String.t()}}

Resolve and tokenize percent or permille from the beginning and/or the end of a string.

Arguments

  • string is any string in which percent and permille symbols are expected

  • options is a keyword list of options

Options

Returns

  • :percent or :permille, or a list of these tokens and any remaining strings, or

  • {:error, {exception, message}}

Examples

iex> Cldr.Number.Parser.resolve_per "11%"
["11", :percent]

iex> Cldr.Number.Parser.resolve_per "% of linguists"
[:percent, " of linguists"]

iex> Cldr.Number.Parser.resolve_per "% of linguists %"
[:percent, " of linguists ", :percent]

resolve_pers(list, options \\ [])

(since 2.21.0)
@spec resolve_pers([String.t(), ...], Keyword.t()) :: [per() | String.t()]

Resolve and tokenize percent and permille symbols from strings within a list.

Percent and permille symbols can be identified at the beginning and/or the end of a string.

Arguments

  • list is any list in which percent and permille symbols are expected

  • options is a keyword list of options

Options

Examples

iex> Cldr.Number.Parser.scan("100%")
...> |> Cldr.Number.Parser.resolve_pers()
[100, :percent]

scan(string, options \\ [])

@spec scan(String.t(), Keyword.t()) ::
  [String.t() | integer() | float() | Decimal.t()]
  | {:error, {module(), String.t()}}

Scans a string in a locale-aware manner and returns a list of strings and numbers.

Arguments

  • string is any String.t()

  • options is a keyword list of options

Options

  • :number is one of :integer, :float, :decimal or nil. The default is nil, meaning that the type is auto-detected as either an integer or a float.

  • :backend is any module that includes use Cldr and is therefore a CLDR backend module. The default is Cldr.default_backend!/0.

  • :locale is any locale returned by Cldr.known_locale_names/1 or a Cldr.LanguageTag.t(). The default is options[:backend].get_locale/0.

Returns

  • A list of strings and numbers

Notes

Number parsing is performed by Cldr.Number.Parser.parse/2 and any options provided are passed to that function.
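
The general idea of scanning, splitting a string into text and number tokens and then converting the number tokens, can be sketched with the standard library. ScanSketch is a hypothetical module that recognises only unsigned ASCII integers; it is a much simplified stand-in and not how the library scans, which is locale-aware and delegates to Cldr.Number.Parser.parse/2.

```elixir
defmodule ScanSketch do
  # Hypothetical, simplified scanner: it recognises only runs of ASCII
  # digits and ignores signs, separators and locale data.
  def scan(string) do
    ~r/\d+/
    # include_captures keeps the digit runs in the result; trim drops
    # empty strings at the edges.
    |> Regex.split(string, include_captures: true, trim: true)
    |> Enum.map(fn token ->
      case Integer.parse(token) do
        {integer, ""} -> integer
        _ -> token
      end
    end)
  end
end

# Text on both sides of the number is preserved as separate strings.
ScanSketch.scan("The lottery number is 23 for the next draw")

# A number directly followed by text is split into two tokens.
ScanSketch.scan("1kg")
```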

Examples

iex> Cldr.Number.Parser.scan("£1_000_000.34")
["£", 1000000.34]

iex> Cldr.Number.Parser.scan("I want £1_000_000 dollars")
["I want £", 1000000, " dollars"]

iex> Cldr.Number.Parser.scan("The prize is 23")
["The prize is ", 23]

iex> Cldr.Number.Parser.scan("The lottery number is 23 for the next draw")
["The lottery number is ", 23, " for the next draw"]

iex> Cldr.Number.Parser.scan("The loss is -1.000 euros", locale: "de", number: :integer)
["The loss is ", -1000, " euros"]

iex> Cldr.Number.Parser.scan "1kg"
[1, "kg"]

iex> Cldr.Number.Parser.scan "A number is the arab script ١٢٣٤٥", locale: "ar"
["A number is the arab script ", 12345]