Unicode.Set.Operation (Unicode Set v1.5.0)
Functions to operate on Unicode sets:
- Intersection
- Difference
- Union
- Inversion
Functions
Combines all the ranges into a single list
This function is called iff the Unicode Sets are formed by unions only. If the set operations of intersection or difference are present then the ranges will need to be expanded via expand/1.
Compacts overlapping and adjacent ranges.
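As a rough illustration of what compacting means, the sketch below sorts a list of `{first, last}` codepoint tuples and merges any range that overlaps or directly follows the previous one. The module and function names (`CompactSketch.compact/1`) are hypothetical and are not part of this library; the library's own implementation may differ.

```elixir
defmodule CompactSketch do
  # Hypothetical helper for illustration only, not the library's code.
  # Sorts the ranges, then folds over them, merging a range into the
  # previously accumulated one when it overlaps or is adjacent to it.
  def compact(ranges) do
    ranges
    |> Enum.sort()
    |> Enum.reduce([], fn
      {first, last}, [{prev_first, prev_last} | rest] when first <= prev_last + 1 ->
        [{prev_first, max(prev_last, last)} | rest]

      range, acc ->
        [range | acc]
    end)
    |> Enum.reverse()
  end
end

# a-c overlaps b-f, and g-h is adjacent to the merged a-f.
CompactSketch.compact([{?a, ?c}, {?b, ?f}, {?g, ?h}, {?x, ?z}])
# => [{97, 104}, {120, 122}]
```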
Returns the complement (inverse) of a set.
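One way to picture the complement is as the gaps between the ranges of a set, measured against the full codepoint space. The sketch below assumes a sorted, non-overlapping input and a universe of 0 to 0x10FFFF; whether the library treats surrogates or unassigned codepoints specially is not stated here, so that part is an assumption. `ComplementSketch` is a hypothetical name.

```elixir
defmodule ComplementSketch do
  # Assumed upper bound of the codepoint space.
  @max_codepoint 0x10FFFF

  # Hypothetical helper. Assumes `ranges` is sorted and non-overlapping.
  def complement(ranges), do: gaps(ranges, 0)

  # `next` is the lowest codepoint not yet covered by the input.
  defp gaps([], next) when next > @max_codepoint, do: []
  defp gaps([], next), do: [{next, @max_codepoint}]

  # There is a gap before this range: emit it, then continue after the range.
  defp gaps([{first, last} | rest], next) when first > next,
    do: [{next, first - 1} | gaps(rest, last + 1)]

  # No gap before this range: just skip past it.
  defp gaps([{_first, last} | rest], _next), do: gaps(rest, last + 1)
end

ComplementSketch.complement([{?a, ?z}])
# => [{0, 96}, {123, 1114111}]
```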
Removes one list of 2-tuples representing Unicode codepoints from another.
Returns the first list of codepoint ranges minus the codepoints in the second list.
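The subtraction described above can be sketched as a single pass over two sorted, non-overlapping range lists. This is a minimal illustration under those assumptions, with a hypothetical module name (`DifferenceSketch`); it is not taken from the library's source.

```elixir
defmodule DifferenceSketch do
  # Hypothetical helper. Both lists are assumed sorted and non-overlapping.
  def difference([], _b), do: []
  def difference(a, []), do: a

  # The head of b ends before the head of a begins: it removes nothing more.
  def difference([{a1, _} | _] = a, [{_, b2} | b_rest]) when b2 < a1,
    do: difference(a, b_rest)

  # The head of a ends before the head of b begins: keep it whole.
  def difference([{_, a2} = head | a_rest], [{b1, _} | _] = b) when a2 < b1,
    do: [head | difference(a_rest, b)]

  # The heads overlap: keep the part of a's head to the left of b's head
  # (if any), then continue with whatever remains to the right of it.
  def difference([{a1, a2} | a_rest], [{b1, b2} | _] = b) do
    left = if a1 < b1, do: [{a1, b1 - 1}], else: []
    remaining = if a2 > b2, do: [{b2 + 1, a2} | a_rest], else: a_rest
    left ++ difference(remaining, b)
  end
end

DifferenceSketch.difference([{1, 10}], [{3, 5}])
# => [{1, 2}, {6, 10}]
```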
Expand takes a reduced AST and expands it into a single list of codepoint tuples.
Expands string ranges like {ab}-{cd}.
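To make the idea of a string range concrete, the sketch below assumes the reading in which each position of the string ranges independently between the corresponding characters of the two endpoints, so {ab}-{cd} yields nine strings. That interpretation, the equal-length restriction, and the name `StringRangeSketch.expand/2` are all assumptions made for illustration; the library's actual output format is not shown in this page.

```elixir
defmodule StringRangeSketch do
  # Hypothetical helper. Assumes both strings contain the same number of
  # codepoints and that each position ranges independently.
  def expand(from, to) do
    Enum.zip(String.to_charlist(from), String.to_charlist(to))
    |> Enum.reduce([[]], fn {low, high}, prefixes ->
      # Extend every prefix built so far with each codepoint in low..high.
      for prefix <- prefixes, codepoint <- low..high, do: prefix ++ [codepoint]
    end)
    |> Enum.map(&List.to_string/1)
  end
end

StringRangeSketch.expand("ab", "cd")
# => ["ab", "ac", "ad", "bb", "bc", "bd", "cb", "cc", "cd"]
```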
Returns a boolean indicating whether the given AST includes set operations intersection or difference.
When these operations exist, all ranges, including ^ ranges, need to be expanded. If there are no intersections or differences, the ^ ranges can be translated directly to guard clauses or a list of Elixir ranges.
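As an illustration of why a ^ range does not always need expanding, a negated set such as [^a-z] can be written directly as a negated guard rather than as the list of every codepoint outside a-z. The module below is a hypothetical, hand-written example; it is not code generated by this library.

```elixir
defmodule GuardSketch do
  # Hypothetical example: [^a-z] expressed as a guard without computing
  # the complement of the range.
  def classify(codepoint) when is_integer(codepoint) and codepoint not in ?a..?z,
    do: :outside

  def classify(_codepoint), do: :inside
end

GuardSketch.classify(?A)
# => :outside
GuardSketch.classify(?m)
# => :inside
```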
Returns the intersection of two lists of 2-tuples representing codepoint ranges.
The result is a single list of codepoint ranges that represents the common codepoints in the two lists.
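A minimal sketch of range intersection, assuming both inputs are sorted and non-overlapping, walks the two lists together and emits the overlap of the current heads. `IntersectSketch` is a hypothetical name and the code is illustrative rather than the library's implementation.

```elixir
defmodule IntersectSketch do
  # Hypothetical helper. Both lists are assumed sorted and non-overlapping.
  def intersect([], _b), do: []
  def intersect(_a, []), do: []

  def intersect([{a1, a2} | a_rest] = a, [{b1, b2} | b_rest] = b) do
    first = max(a1, b1)
    last = min(a2, b2)
    overlap = if first <= last, do: [{first, last}], else: []

    # Advance the list whose head ends first; the other head may still
    # overlap later ranges.
    if a2 < b2 do
      overlap ++ intersect(a_rest, b)
    else
      overlap ++ intersect(a, b_rest)
    end
  end
end

IntersectSketch.intersect([{1, 10}, {20, 30}], [{5, 25}])
# => [{5, 10}, {20, 25}]
```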
Reduces all sets, properties and ranges to a list of 2-tuples expressing a range of codepoints.
It can return one of two forms:
- [{:in, [tuple_list]}] for an inclusion list
- [{:not_in, [tuple_list]}] for an exclusion list
or a combination of both.
Attempts are made to preserve :not_in clauses as long as possible, since many uses, like regexes and nimble_parsec, can consume :not_in style ranges.
When only single character classes are presented, or several classes which are unions, :not_in can be preserved.
When intersections and differences are required, the ranges must be both reduced and expanded in order for these set operations to complete.
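To show why a consumer can use a :not_in clause without expanding it, the sketch below tests a codepoint against a single `{:in, ranges}` or `{:not_in, ranges}` tuple. The function name `matches?/2` is hypothetical, and the handling of a combined list of both forms is deliberately left out because its semantics are not described here.

```elixir
defmodule ReducedFormSketch do
  # Hypothetical consumer of a single reduced clause.
  def matches?(codepoint, {:in, ranges}), do: in_ranges?(codepoint, ranges)
  def matches?(codepoint, {:not_in, ranges}), do: not in_ranges?(codepoint, ranges)

  # True when the codepoint falls inside any {first, last} tuple.
  defp in_ranges?(codepoint, ranges) do
    Enum.any?(ranges, fn {first, last} -> codepoint >= first and codepoint <= last end)
  end
end

ReducedFormSketch.matches?(?q, {:in, [{?a, ?z}]})
# => true
ReducedFormSketch.matches?(?q, {:not_in, [{?a, ?z}]})
# => false
```

The exclusion case needs no complement computation: the same range list is checked and the result is negated.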
Returns the symmetric difference of two lists of 2-tuples representing codepoint ranges.
The result is a single list of codepoint ranges that represents the codepoints that are in either of the two lists but not both.
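The "in either list but not both" rule can be checked with a brute-force sketch that enumerates codepoints into sets and then rebuilds ranges. This is only practical for small ranges and is meant as a reference definition, not as the library's approach; `SymmetricDifferenceSketch` is a hypothetical name.

```elixir
defmodule SymmetricDifferenceSketch do
  # Hypothetical helper: (a minus b) union (b minus a), computed by
  # enumerating codepoints. Suitable for small ranges only.
  def symmetric_difference(a, b) do
    set_a = to_set(a)
    set_b = to_set(b)

    MapSet.union(MapSet.difference(set_a, set_b), MapSet.difference(set_b, set_a))
    |> Enum.sort()
    |> to_ranges()
  end

  # Expands {first, last} tuples (with first <= last) into a set of codepoints.
  defp to_set(ranges) do
    for {first, last} <- ranges, codepoint <- first..last, into: MapSet.new(), do: codepoint
  end

  # Rebuilds ranges from a sorted list of codepoints by joining consecutive runs.
  defp to_ranges(codepoints) do
    codepoints
    |> Enum.reduce([], fn
      codepoint, [{first, last} | rest] when codepoint == last + 1 ->
        [{first, codepoint} | rest]

      codepoint, acc ->
        [{codepoint, codepoint} | acc]
    end)
    |> Enum.reverse()
  end
end

SymmetricDifferenceSketch.symmetric_difference([{1, 5}], [{4, 8}])
# => [{1, 3}, {6, 8}]
```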
Prewalks the expanded AST from a parsed Unicode Set invoking a function on each codepoint range in the set.
Merges two lists of 2-tuples representing ranges of codepoints. The result is a single list of 2-tuple codepoint ranges that includes all codepoints from the two lists.
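A merge of this kind can be sketched as a linear walk over two sorted range lists that always takes the smaller head and folds it into the result. The sketch assumes each input is already sorted and non-overlapping, and `UnionSketch` is a hypothetical name rather than the library's module.

```elixir
defmodule UnionSketch do
  # Hypothetical helper. Each input list is assumed sorted and non-overlapping.
  def union(a, b), do: merge(a, b, [])

  defp merge([], [], acc), do: Enum.reverse(acc)
  defp merge([x | a_rest], [], acc), do: merge(a_rest, [], push(x, acc))
  defp merge([], [y | b_rest], acc), do: merge([], b_rest, push(y, acc))

  # Take whichever head sorts first so ranges arrive in ascending order.
  defp merge([x | a_rest] = a, [y | b_rest] = b, acc) do
    if x <= y do
      merge(a_rest, b, push(x, acc))
    else
      merge(a, b_rest, push(y, acc))
    end
  end

  # Join the incoming range with the last accumulated one when they
  # overlap or are adjacent; otherwise start a new range.
  defp push({first, last}, [{prev_first, prev_last} | rest]) when first <= prev_last + 1,
    do: [{prev_first, max(prev_last, last)} | rest]

  defp push(range, acc), do: [range | acc]
end

UnionSketch.union([{1, 3}, {10, 12}], [{2, 5}, {13, 20}])
# => [{1, 5}, {10, 20}]
```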