Unicode.WordBreak (Unicode v1.21.0)

Functions to introspect Unicode word breaks for strings and codepoints.

Summary

Functions

aliases()
Returns a map of aliases for Unicode word breaks.

count(word_break)
Returns the number of characters for a given word_break.

fetch(word_break)
Returns the Unicode ranges for a given word break as a list of 2-tuples, wrapped as {:ok, range_list}, or :error.

get(word_break)
Returns the Unicode ranges for a given word break as a list of 2-tuples, or nil.

known_word_breaks()
Returns a list of known Unicode word break names.

word_break(string)
Returns the word break name(s) for the given binary or codepoint.

word_breaks()
Returns the map of Unicode word breaks.

Functions

aliases()

Returns a map of aliases for Unicode word breaks.

An alias is an alternative name for referring to a word break. Aliases are resolved by the fetch/1 and get/1 functions.

count(word_break)

Returns the number of characters for a given word_break.

Example

iex> Unicode.WordBreak.count(:al)
21400
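
As a rough illustration of how such a count relates to a range list, the sketch below sums the sizes of a list of 2-tuples. It assumes each tuple is an inclusive {first, last} pair of codepoints, which this page does not state; the RangeCount module and the sample ranges are hypothetical and are not part of the library.

```elixir
defmodule RangeCount do
  # Sums the sizes of {first, last} codepoint ranges.
  # Assumption: both ends of each range are inclusive.
  def count(ranges) do
    Enum.reduce(ranges, 0, fn {first, last}, acc -> acc + (last - first + 1) end)
  end
end

# Illustrative ranges only, not real word break data.
sample = [{0x41, 0x5A}, {0x61, 0x7A}, {0xAA, 0xAA}]
IO.inspect(RangeCount.count(sample))
# prints 53
```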

fetch(word_break)

Returns the Unicode ranges for a given word break as a list of 2-tuples.

Aliases are resolved by this function.

Returns either {:ok, range_list} or :error.
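
The two return shapes lend themselves to pattern matching. The sketch below handles a fetch/1-style result without calling the library, so it runs on its own; the FetchResult module and the sample range are hypothetical.

```elixir
defmodule FetchResult do
  # Describes a fetch/1-style result: {:ok, range_list} or :error.
  def describe({:ok, ranges}) when is_list(ranges) do
    "found #{length(ranges)} range(s)"
  end

  def describe(:error), do: "unknown word break"
end

# Hand-written stand-ins for what fetch/1 is documented to return.
IO.puts(FetchResult.describe({:ok, [{0x41, 0x5A}]}))
# prints "found 1 range(s)"
IO.puts(FetchResult.describe(:error))
# prints "unknown word break"
```

With the library available, the same function could presumably be fed directly, for example Unicode.WordBreak.fetch(:al) |> FetchResult.describe().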

get(word_break)

Returns the Unicode ranges for a given word break as a list of 2-tuples.

Aliases are resolved by this function.

Returns either range_list or nil.

known_word_breaks()

Returns a list of known Unicode word break names.

This function does not return the names of any word break aliases.

word_break(string)

Returns the word break name(s) for the given binary or codepoint.

For a codepoint, a single word break name is returned.

For a binary, a list of the distinct word break names found in the binary is returned.

A value of :xx indicates there is no word break property for a codepoint.
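
The behaviour described above can be sketched with a tiny lookup table. This is not the library's implementation: the WordBreakSketch module, the :aletter and :numeric names, and the truncated ranges are placeholders chosen for illustration, and the ordering of the returned list is this sketch's own choice.

```elixir
defmodule WordBreakSketch do
  # Hypothetical, heavily truncated table. Real data comes from the Unicode database.
  @table %{
    aletter: [{?A, ?Z}, {?a, ?z}],
    numeric: [{?0, ?9}]
  }

  # Codepoint: return the single matching name, or :xx when no range matches.
  def word_break(codepoint) when is_integer(codepoint) do
    Enum.find_value(@table, :xx, fn {name, ranges} ->
      if in_ranges?(codepoint, ranges), do: name
    end)
  end

  # Binary: return the distinct names of its codepoints, in order of first appearance.
  def word_break(string) when is_binary(string) do
    string
    |> String.to_charlist()
    |> Enum.map(&word_break/1)
    |> Enum.uniq()
  end

  # True when the codepoint lies inside any inclusive {first, last} range.
  defp in_ranges?(codepoint, ranges) do
    Enum.any?(ranges, fn {first, last} -> codepoint >= first and codepoint <= last end)
  end
end

IO.inspect(WordBreakSketch.word_break(?a))
# prints :aletter
IO.inspect(WordBreakSketch.word_break("ab1 "))
# prints [:aletter, :numeric, :xx]
```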

word_breaks()

Returns the map of Unicode word breaks.

Each map key is a word break name, and each value is a list of codepoint ranges as 2-tuples.
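
A map of this shape can be summarised with ordinary Map and Enum functions. The sketch below works on a small hand-written map rather than the real data; the BreakMap module, the key names, and the ranges are hypothetical.

```elixir
defmodule BreakMap do
  # Sorted list of the names used as map keys.
  def names(breaks), do: breaks |> Map.keys() |> Enum.sort()

  # Map of name => number of ranges stored under that name.
  def range_counts(breaks) do
    Map.new(breaks, fn {name, ranges} -> {name, length(ranges)} end)
  end
end

# Stand-in for the documented shape: name => list of 2-tuple ranges.
breaks = %{aletter: [{?A, ?Z}, {?a, ?z}], numeric: [{?0, ?9}]}

IO.inspect(BreakMap.names(breaks))
# prints [:aletter, :numeric]
IO.inspect(BreakMap.range_counts(breaks))
# prints %{aletter: 2, numeric: 1}
```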