# `Cldr.Collation`
[🔗](https://github.com/elixir-cldr/cldr_collation/blob/v1.0.0/lib/cldr/collation.ex#L1)

Implements the Unicode Collation Algorithm (UCA) as extended by CLDR.

Collation is the general term for the process and function of
determining the sorting order of strings of characters, for example for
lists of strings presented to users, or in databases for sorting and selecting
records.

Collation varies by language, by application (some languages use special
phonebook sorting), and by other criteria (for example, phonetic vs. visual).

CLDR provides collation data for many languages and styles. The data
supports not only sorting but also language-sensitive searching and grouping
under index headers. All CLDR collations are based on the [UCA] default order,
with common modifications applied in the CLDR root collation, and further
tailored for language and style as needed.

## Basic Usage

    # Compare two strings
    iex> Cldr.Collation.compare("café", "cafe")
    :gt

    # Sort a list of strings
    iex> Cldr.Collation.sort(["café", "cafe", "Cafe"])
    ["cafe", "Cafe", "café"]

    # Generate a sort key
    iex> Cldr.Collation.sort_key("hello")
    <<36, 196, 36, 83, 37, 40, 37, 40, 37, 152, 0, 0, 0, 32, 0, 32, 0, 32, 0, 32, 0,
      32, 0, 0, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2>>

    # With options
    iex> Cldr.Collation.compare("a", "A", strength: :secondary)
    :eq

    # From BCP47 locale
    iex> Cldr.Collation.compare("a", "A", locale: "en-u-ks-level2")
    :eq

## Collation Options

All BCP47 -u- extension collation keys are supported. See the
[detailed explanation](collation_options.html) for more information on how
each option affects sort order.

* `strength` - `:primary`, `:secondary`, `:tertiary` (default), `:quaternary`, `:identical`.

* `alternate` - `:non_ignorable` (default), `:shifted`.

* `backwards` - `false` (default), `true` - reverse secondary weights (French).

* `normalization` - `false` (default), `true` - NFD normalize input.

* `case_level` - `false` (default), `true` - insert case-only level.

* `case_first` - `false` (default), `:upper`, `:lower`.

* `numeric` - `false` (default), `true` - numeric string comparison.

* `reorder` - `[]` (default), list of script code atoms.

* `max_variable` - `:punct` (default), `:space`, `:symbol`, `:currency`.

* `ignore_accents` - `true` to ignore accent differences (sets strength to primary).

* `ignore_case` - `true` to ignore case differences (sets strength to secondary).

* `ignore_punctuation` - `true` to ignore punctuation and whitespace (sets alternate to shifted).

* `casing` - `:sensitive`, `:insensitive` (convenience alias, compatible with `ex_cldr_collation`).

* `backend` - `:default` (NIF if available), `:nif`, `:elixir`.
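
The same setting can be given either as a keyword option or through a BCP47 locale string. The sketch below repeats the two documented `:eq` examples side by side to show that equivalence for strength (`ks-level2` and `strength: :secondary`); it assumes the library is installed and uses no names beyond those documented here.

```elixir
# Strength expressed as a keyword option.
by_option = Cldr.Collation.compare("a", "A", strength: :secondary)

# The same strength expressed through the BCP47 `ks` key in a locale string.
by_locale = Cldr.Collation.compare("a", "A", locale: "en-u-ks-level2")

# Both calls are documented in Basic Usage as returning :eq.
{by_option, by_locale}
```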

## NIF Backend

An optional NIF backend using ICU4C is available for high-performance collation.
When compiled, it is used automatically for comparisons that only use
ICU-configurable attributes (strength, backwards, alternate, case_first,
case_level, normalization, numeric, reorder). Options requiring locale
tailoring or non-default max_variable use the pure Elixir backend.

To enable the NIF backend, either set the environment variable:

    CLDR_COLLATION_NIF=true mix compile

Or add to your `config.exs` (must be `config.exs`, not `runtime.exs`,
since it is evaluated at compile time):

    config :ex_cldr_collation, :nif, true

The NIF backend requires ICU system libraries (`libicu`, or `icucore` on macOS).
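
Backend selection can also be made per call with the `:backend` option. The following is a minimal sketch, assuming the library is installed, that runs the same comparison through the default backend and through the pure Elixir backend. The two results are expected to agree, since both implement the same collation rules, but this is an assumption rather than a documented guarantee.

```elixir
# Default selection: the NIF is used if it was compiled and the options allow it.
default_result = Cldr.Collation.compare("cafe", "café")

# Force the pure Elixir implementation, for example to cross-check results
# or to run on a system without the ICU libraries.
elixir_result = Cldr.Collation.compare("cafe", "café", backend: :elixir)

# Expected (not guaranteed here) to be true.
default_result == elixir_result
```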

# `compare`

```elixir
@spec compare(String.t(), String.t(), keyword() | Cldr.Collation.Options.t()) ::
  :lt | :eq | :gt
```

Compare two strings using the CLDR collation algorithm.

### Arguments

* `string_a` - the first string to compare.

* `string_b` - the second string to compare.

* `options` - a keyword list of collation options, or a `t:Cldr.Collation.Options.t/0`
  struct.

### Options

See the [detailed explanation](collation_options.html) for more information on
each option and its impact on sort order.

* `:strength` - comparison level: `:primary`, `:secondary`, `:tertiary` (default), `:quaternary`,
   or `:identical`.

* `:alternate` - variable weight handling: `:non_ignorable` (default) or `:shifted`.

* `:backwards` - reverse secondary weights for French sorting: `false` (default) or `true`.

* `:normalization` - NFD normalize input: `false` (default) or `true`.

* `:case_level` - insert case-only comparison level: `false` (default) or `true`.

* `:case_first` - case ordering: `false` (default), `:upper`, or `:lower`.

* `:numeric` - numeric string comparison: `false` (default) or `true`.

* `:reorder` - list of script code atoms to reorder: `[]` (default).

* `:max_variable` - variable weight boundary: `:punct` (default), `:space`,
  `:symbol`, or `:currency`.

* `:ignore_accents` - `true` to ignore accent differences (sets `strength: :primary`).
  Explicit `:strength` takes precedence.

* `:ignore_case` - `true` to ignore case differences (sets `strength: :secondary`).
  Explicit `:strength` takes precedence.

* `:ignore_punctuation` - `true` to ignore punctuation and whitespace (sets `alternate: :shifted`).
  An explicit `:strength` or `:alternate` takes precedence.

* `:casing` - `:sensitive` or `:insensitive` (convenience alias for strength, compatible
  with `ex_cldr_collation`).

* `:locale` - a BCP47 locale string (e.g., `"en-u-ks-level2"`) or a `Cldr.LanguageTag`
  struct (when `ex_cldr` is available). When `ex_cldr` is loaded and a string is provided,
  it is parsed via `Cldr.Locale.canonical_language_tag/2` using the default CLDR backend.

* `:cldr_backend` - a CLDR backend module to use for locale parsing (e.g., `MyApp.Cldr`).
  Only used when `:locale` is a string. Defaults to `Cldr.default_backend!()`.

* `:backend` - `:default` (NIF if available), `:nif` (require NIF), or `:elixir` (pure Elixir).
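
The interaction between the convenience options and an explicit `:strength` can be illustrated as follows. This is a sketch based only on the precedence rules stated in the list above; the result of the second call is not taken from the documentation, so no exact value is claimed for it.

```elixir
# `ignore_case: true` is documented as setting `strength: :secondary`, so this
# is expected to be :eq, like the documented `strength: :secondary` example.
relaxed = Cldr.Collation.compare("a", "A", ignore_case: true)

# An explicit :strength is documented as taking precedence over the shortcut,
# so this comparison runs at tertiary strength and the case difference
# is expected to count again.
strict = Cldr.Collation.compare("a", "A", ignore_case: true, strength: :tertiary)

{relaxed, strict}
```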

### Returns

* `:lt` - if `string_a` sorts before `string_b`.

* `:eq` - if `string_a` and `string_b` are equal at the given strength.

* `:gt` - if `string_a` sorts after `string_b`.
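
Because `compare/3` returns `:lt`, `:eq`, or `:gt`, it can be adapted into a sorter function for `Enum.sort_by/3` when the values to sort are not plain strings. The sketch below sorts maps by a `:name` field; the `people` data and the field name are hypothetical and exist only for illustration.

```elixir
# Hypothetical data: maps that carry the string to collate under :name.
people = [%{name: "café"}, %{name: "cafe"}, %{name: "Cafe"}]

# Enum.sort_by/3 expects a sorter that returns true when the first
# argument may precede the second, so anything other than :gt qualifies.
sorter = fn a, b -> Cldr.Collation.compare(a, b) != :gt end

sorted = Enum.sort_by(people, & &1.name, sorter)

# The resulting name order should match what sort/2 gives for the same strings.
Enum.map(sorted, & &1.name)
```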

### Examples

    iex> Cldr.Collation.compare("cafe", "café")
    :lt

    iex> Cldr.Collation.compare("a", "A", strength: :secondary)
    :eq

    iex> Cldr.Collation.compare("a", "A", casing: :insensitive)
    :eq

# `ensure_loaded`

```elixir
@spec ensure_loaded() :: :ok
```

Ensure the collation tables are loaded into persistent term storage.

When the NIF backend is available and all options are NIF-compatible,
the collation tables are not needed and are not loaded automatically.
The tables are only loaded on demand when the Elixir backend is used
(for `sort_key/2`, locale tailoring, or when `backend: :elixir` is
specified).

This function can be called explicitly to pre-warm the tables at
application startup if you know you will use the Elixir backend.

### Returns

* `:ok` - tables are loaded and ready.

### Examples

    iex> Cldr.Collation.ensure_loaded()
    :ok
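
One way to pre-warm the tables is to call `ensure_loaded/0` from the application's start callback. The following is a minimal sketch; `MyApp.Application` and `MyApp.Supervisor` are hypothetical names, and the empty child list stands in for a real supervision tree.

```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Load the collation tables once at boot so the first call that needs
    # the Elixir backend does not pay the loading cost.
    :ok = Cldr.Collation.ensure_loaded()

    # Placeholder: a real application would list its child processes here.
    children = []

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```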

# `sort`

```elixir
@spec sort([String.t()], keyword() | Cldr.Collation.Options.t()) :: [String.t()]
```

Sort a list of strings using the CLDR collation algorithm.

### Arguments

* `strings` - a list of UTF-8 strings to sort.

* `options` - a keyword list of collation options, or a `t:Cldr.Collation.Options.t/0`
  struct.

### Options

Accepts the same options as `compare/3`.

### Returns

A new list of strings sorted according to the CLDR collation rules.

### Examples

    iex> Cldr.Collation.sort(["café", "cafe", "Cafe"])
    ["cafe", "Cafe", "café"]

    iex> Cldr.Collation.sort(["б", "а", "в"])
    ["а", "б", "в"]
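
Options are passed to `sort/2` in the same way as to `compare/3`. As a sketch, the call below requests numeric string comparison for a list of file names; the file names are made up for illustration, and the exact output is not shown because it is not taken from the documentation.

```elixir
files = ["file10", "file2", "file1"]

# `numeric: true` requests numeric string comparison, so runs of digits are
# intended to be ordered by value rather than character by character.
numeric_order = Cldr.Collation.sort(files, numeric: true)

# Without the option, digits are compared as ordinary characters.
default_order = Cldr.Collation.sort(files)

{numeric_order, default_order}
```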

# `sort_key`

```elixir
@spec sort_key(
  String.t() | [non_neg_integer()],
  keyword() | Cldr.Collation.Options.t()
) :: binary()
```

Generate a binary sort key for the given input.

Sort keys can be compared directly with `<`, `>`, `==` for ordering.
This is efficient when the same strings need to be compared multiple times.

### Arguments

* `input` - a UTF-8 string or a list of integer codepoints.

* `options` - a keyword list of collation options, or a `t:Cldr.Collation.Options.t/0`
   struct.

### Options

Accepts the same options as `compare/3`.

### Returns

A binary sort key that can be compared with standard binary
comparison operators.

### Examples

    iex> key_a = Cldr.Collation.sort_key("cafe")
    iex> key_b = Cldr.Collation.sort_key("café")
    iex> key_a < key_b
    true

    iex> Cldr.Collation.sort_key("hello") == Cldr.Collation.sort_key("hello")
    true
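
Sort keys are useful when records are sorted repeatedly or when the key is stored next to the data. The sketch below computes each key once and then sorts with ordinary term comparison; the `records` list and its `:id` and `:name` fields are hypothetical.

```elixir
# Hypothetical records that carry the string to collate under :name.
records = [
  %{id: 1, name: "café"},
  %{id: 2, name: "cafe"},
  %{id: 3, name: "Cafe"}
]

# Compute each sort key once and keep it alongside the record.
keyed =
  Enum.map(records, fn record ->
    Map.put(record, :sort_key, Cldr.Collation.sort_key(record.name))
  end)

# Binary sort keys are compared with standard term ordering,
# so Enum.sort_by/2 needs no custom sorter.
sorted = Enum.sort_by(keyed, & &1.sort_key)

Enum.map(sorted, & &1.name)
```

Keys are only meaningful relative to one another when they were generated with the same options, so stored keys should be regenerated if the collation options change.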

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*
