Unicode.GeneralCategory.Derived (Unicode v1.12.0)
For certain operations and transformations (especially in Unicode Sets) there is an expectation that certain derived general categories exist even though they are not defined in the Unicode Character Database.
These categories are:
:any, which is the full Unicode character range 0x0..0x10ffff.
:assigned, which is the set of codepoints that are assigned and is therefore equivalent to [:any]-[:Cn]. In fact, that is exactly how it is calculated using unicode_set, and the results are copied here so that there is no mutual dependency.
:ascii, which is the range for the US ASCII character set, 0x0..0x7f.
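The relationship between these three categories can be sketched with plain Elixir. This is only an illustration: the module name DerivedSketch, its functions, and the {first, last} tuple representation are assumptions made for the example and are not part of the library's API. The unassigned list below is a tiny sample (the noncharacters U+FFFE..U+FFFF, which have the general category Cn); the real [:Cn] set is far larger.

```elixir
defmodule DerivedSketch do
  # Hypothetical helper module for illustration only.
  # Ranges are represented as {first, last} tuples (an assumption).

  # :any is the full codepoint range; :ascii is the US ASCII range.
  def any, do: [{0x0, 0x10FFFF}]
  def ascii, do: [{0x0, 0x7F}]

  # Set difference on range lists, as in [:any]-[:Cn].
  def difference(ranges, remove) do
    Enum.flat_map(ranges, fn range -> subtract(range, remove) end)
  end

  # True if the codepoint falls inside any of the ranges.
  def member?(ranges, codepoint) do
    Enum.any?(ranges, fn {first, last} ->
      codepoint >= first and codepoint <= last
    end)
  end

  # Removes each {rf, rl} range in turn from a single starting range,
  # keeping whatever is left on either side of the removed span.
  defp subtract(range, remove) do
    Enum.reduce(remove, [range], fn {rf, rl}, acc ->
      Enum.flat_map(acc, fn {f, l} ->
        cond do
          rl < f or rf > l ->
            [{f, l}]

          true ->
            left = if rf > f, do: [{f, rf - 1}], else: []
            right = if rl < l, do: [{rl + 1, l}], else: []
            left ++ right
        end
      end)
    end)
  end
end

# Sample only: a small subset of the unassigned (Cn) codepoints.
sample_unassigned = [{0xFFFE, 0xFFFF}]
assigned_sketch = DerivedSketch.difference(DerivedSketch.any(), sample_unassigned)
```

With the sample above, assigned_sketch splits the full range into two ranges around the removed span. Precomputing such a result and storing it, rather than computing it on demand, is what avoids the mutual dependency described for :assigned.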
In addition there are derived categories not part of the Unicode specification that support additional use cases. These include:
Categories related to recognising quotation marks. See the module Unicode.Category.QuoteMarks.
:printable, which implements the same semantics as String.printable?/1. This is a very broad definition of printable characters.
:graph, which includes characters from the [^\p{space}\p{gc=Control}\p{gc=Surrogate}\p{gc=Unassigned}] set defined by Unicode Regular Expressions.
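To get a feel for how broad the :printable definition is, the following sketch checks single codepoints with the standard library's String.printable?/1. The module PrintableSketch is a hypothetical helper written for this example, not something the library provides.

```elixir
defmodule PrintableSketch do
  # Hypothetical helper for illustration only.

  # Surrogates cannot be encoded as UTF-8, so treat them as not printable.
  def printable?(cp) when cp in 0xD800..0xDFFF, do: false

  # Encode the codepoint as a one-character string and ask the
  # standard library whether that string is printable.
  def printable?(cp) when cp in 0x0..0x10FFFF do
    String.printable?(<<cp::utf8>>)
  end
end

PrintableSketch.printable?(?a)   # true
PrintableSketch.printable?(?\n)  # true: newline counts as printable
PrintableSketch.printable?(0x0)  # false: NUL is not printable
```

Note that a newline is printable under these semantics, whereas the :graph definition excludes it because it belongs to \p{space}.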
Summary
Functions
Returns a map of the aliases for the derived General Categories
Returns a map of the derived General Categories