Unicode.IDNA.Punycode (Unicode IDNA v0.1.0)

RFC 3492 Punycode encoding and decoding.

Punycode is a uniquely-decodable bootstring encoding that represents arbitrary Unicode code points using only the ASCII letters, digits and hyphen. It is the encoding used by IDNA to represent internationalized domain labels in their ASCII form (the xn-- prefix).
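The restricted alphabet can be illustrated with the digit mapping from RFC 3492 section 5, where letters stand for the values 0..25 and decimal digits for 26..35. The module and function names below are hypothetical and are not part of Unicode.IDNA.Punycode; this is a minimal sketch of the mapping, not the library's implementation.

```elixir
defmodule PunycodeDigitsSketch do
  # Hypothetical helper module for illustration only.

  # RFC 3492 section 5: a..z (or A..Z) are digit values 0..25,
  # and 0..9 are digit values 26..35. The hyphen is the delimiter,
  # not a digit, so it maps to :error here.
  def digit_value(c) when c in ?a..?z, do: {:ok, c - ?a}
  def digit_value(c) when c in ?A..?Z, do: {:ok, c - ?A}
  def digit_value(c) when c in ?0..?9, do: {:ok, c - ?0 + 26}
  def digit_value(_), do: :error

  # Inverse mapping, always emitting the lowercase form.
  def digit_char(d) when d in 0..25, do: d + ?a
  def digit_char(d) when d in 26..35, do: d - 26 + ?0
end
```

With this mapping, the trailing "kva" in "bcher-kva" reads as the digit values 10, 21 and 0.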

These functions operate on a single label (without the xn-- prefix) and are the building block for Unicode.IDNA.to_ascii/2 and Unicode.IDNA.to_unicode/2. Most callers should use those higher-level functions; this module is exposed so that callers needing the raw RFC 3492 primitives do not have to re-implement them.
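Because these functions expect a label without the xn-- prefix, a caller working with raw ASCII labels has to strip or add the prefix itself. The sketch below shows one way that might look; AcePrefixSketch and its functions are hypothetical names, and matching the prefix case-insensitively is an assumption about how the caller wants to treat input.

```elixir
defmodule AcePrefixSketch do
  # Hypothetical helper module for illustration only.
  @prefix "xn--"

  # Returns {:ace, rest} when the label starts with the xn-- prefix
  # (compared case-insensitively), or {:plain, label} otherwise.
  def strip_prefix(label) when is_binary(label) do
    case label do
      <<head::binary-size(4), rest::binary>> ->
        # :ascii mode lowercases byte by byte, so it is safe on any binary.
        if String.downcase(head, :ascii) == @prefix do
          {:ace, rest}
        else
          {:plain, label}
        end

      _ ->
        {:plain, label}
    end
  end

  # Prepends the prefix to an already encoded label.
  def add_prefix(encoded) when is_binary(encoded), do: @prefix <> encoded
end
```

The rest value from an {:ace, rest} result is the form that decode/1 accepts, and the binary returned by encode/1 is what add_prefix/1 expects.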

Summary

Functions

decode(string)

Decodes a Punycode label back to its original Unicode form.

encode(string)

Encodes a string of Unicode code points as a Punycode label.

Functions

decode(string)

@spec decode(String.t()) :: {:ok, String.t()} | {:error, :invalid_input | :overflow}

Decodes a Punycode label back to its original Unicode form.

Arguments

  • string is a binary holding a single Punycode label, without the xn-- prefix, containing only ASCII letters, digits and hyphen.

Returns

  • {:ok, decoded} on success.

  • {:error, :invalid_input} if the input contains characters outside the Punycode alphabet.

  • {:error, :overflow} if decoding would overflow the RFC 3492 integer arithmetic.
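The three result shapes listed above lend themselves to pattern matching in function heads. The following is a minimal sketch of a caller-side handler; DecodeResultSketch and describe/1 are hypothetical names, and the message strings are illustrative only.

```elixir
defmodule DecodeResultSketch do
  # Hypothetical helper module for illustration only.
  # One clause per result shape documented for decode/1.
  def describe({:ok, decoded}), do: "decoded: " <> decoded

  def describe({:error, :invalid_input}),
    do: "label contains characters outside the Punycode alphabet"

  def describe({:error, :overflow}),
    do: "label overflows the RFC 3492 integer arithmetic"
end
```

Assuming the library is available, the handler can be fed directly from the decoder, for example "bcher-kva" |> Unicode.IDNA.Punycode.decode() |> DecodeResultSketch.describe().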

Examples

iex> Unicode.IDNA.Punycode.decode("bcher-kva")
{:ok, "bücher"}

iex> Unicode.IDNA.Punycode.decode("mnchen-3ya")
{:ok, "münchen"}

iex> Unicode.IDNA.Punycode.decode("abc-")
{:ok, "abc"}

encode(string)

@spec encode(String.t()) :: {:ok, binary()} | {:error, :overflow}

Encodes a string of Unicode code points as a Punycode label.

Arguments

  • string is a String.t() holding the Unicode code points of a single label.

Returns

  • {:ok, encoded} where encoded is a binary containing only ASCII letters, digits and hyphen.

  • {:error, :overflow} if encoding would overflow the RFC 3492 32-bit integer arithmetic. This can only occur for pathological inputs.
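Elixir integers are arbitrary precision, so a bounded-integer overflow has to be detected explicitly. The sketch below follows the guard in the RFC 3492 section 6.3 encoding procedure, which fails when m - n exceeds (maxint - delta) div (h + 1). The module name, the function name and the choice of 0xFFFFFFFF for maxint are assumptions made for illustration; the library's actual limit and code may differ.

```elixir
defmodule OverflowGuardSketch do
  # Hypothetical helper module for illustration only.
  # Assumed limit: the largest unsigned 32-bit value.
  @maxint 0xFFFFFFFF

  # One step of the RFC 3492 encoder: advance delta when the current
  # code point threshold moves from n up to m, with h code points
  # already handled. Returns {:error, :overflow} instead of letting
  # delta exceed @maxint.
  def advance_delta(delta, m, n, h) when m >= n and delta <= @maxint do
    if m - n > div(@maxint - delta, h + 1) do
      {:error, :overflow}
    else
      {:ok, delta + (m - n) * (h + 1)}
    end
  end
end
```

For "bücher" the first step has delta = 0, n = 128, m = 0xFC and h = 5 (the five basic code points), so the guard passes and delta becomes 124 * 6 = 744.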

Examples

iex> Unicode.IDNA.Punycode.encode("bücher")
{:ok, "bcher-kva"}

iex> Unicode.IDNA.Punycode.encode("münchen")
{:ok, "mnchen-3ya"}

iex> Unicode.IDNA.Punycode.encode("abc")
{:ok, "abc-"}