idna (idna v7.1.0)


A pure Erlang IDNA implementation.

This module provides functions to encode and decode Internationalized Domain Names (IDN) using the IDNA protocol as defined in RFC 5891.

Features

  • Support for IDNA 2008 and IDNA 2003
  • UTS #46 compatibility processing
  • Label validation (NFC, hyphens, combining marks, context rules, BIDI)

Basic Usage

   %% Encode a domain name to ASCII (Punycode)
   "xn--nxasmq5b.com" = idna:encode("βόλος.com").
  
   %% Decode an ASCII domain name to Unicode
   "βόλος.com" = idna:decode("xn--nxasmq5b.com").
  
   %% Use UTS #46 processing
   "xn--fa-hia.de" = idna:encode("faß.de", [uts46]).

Summary

Types

domain()

A full domain name (may contain multiple labels separated by dots).

idna_flag()

IDNA processing options.

idna_flags()

List of IDNA processing options.

label()

A single label (part between dots) of a domain name.

Functions

alabel(Label)

Convert a label to its ASCII-compatible encoding (A-label).

check_context(Label)

Check contextual rules for characters in a label.

check_hyphen(Label)

Check that a label conforms to hyphen placement rules.

check_initial_combiner(Label)

Check that a label does not begin with a combining mark.

check_label(Label)

Validate a domain label with default settings.

check_label(Label, CheckHyphens, CheckJoiners, CheckBidi)

Validate a domain label with configurable checks.

check_label_length(Label)

Check that a label does not exceed the maximum length.

check_nfc(Label)

Check that a label is in Unicode Normalization Form C (NFC).

decode(AsciiDomain)

Decode an ASCII domain name to Unicode using the IDNA protocol.

decode(AsciiDomain, Options)

Decode an ASCII domain name to Unicode using the IDNA protocol with options.

encode(Domain)

Encode a domain name to ASCII using the IDNA protocol.

encode(Domain, Options)

Encode a domain name to ASCII using the IDNA protocol with options.

from_ascii(AsciiDomain) deprecated

Decode an ASCII domain name to Unicode (compatibility API).

to_ascii(Domain) deprecated

Encode a domain name to ASCII (compatibility API).

to_unicode(AsciiDomain) deprecated

Decode an ASCII domain name to Unicode (compatibility API).

ulabel(ALabel)

Convert a label to its Unicode form (U-label).

utf8_to_ascii(Utf8Domain) deprecated

Convert a UTF-8 binary domain to ASCII.

Types

domain/0

-type domain() :: string().

A full domain name (may contain multiple labels separated by dots).

idna_flag/0

-type idna_flag() ::
          uts46 |
          {uts46, boolean()} |
          std3_rules |
          {std3_rules, boolean()} |
          transitional |
          {transitional, boolean()} |
          strict |
          {strict, boolean()}.

IDNA processing options.

  • uts46 - Enable UTS #46 compatibility processing (default: false)
  • std3_rules - Enforce STD3 ASCII rules (default: false)
  • transitional - Use transitional processing for IDNA 2003 compatibility (default: false)
  • strict - Use strict dot separator (only ASCII period) (default: false)
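The type above allows each flag either as a bare atom or as a {Flag, boolean()} tuple. The following is a minimal sketch of both spellings. It assumes that a bare atom is shorthand for {Flag, true}, which is the usual property-list convention in Erlang but is not stated explicitly here.

```erlang
%% Bare-atom form, as used in the examples elsewhere on this page.
Short = idna:encode("faß.de", [uts46, transitional]).

%% Tuple form. Assumption: {Flag, true} behaves like the bare atom.
Long = idna:encode("faß.de", [{uts46, true}, {transitional, true}]).

%% If the assumption holds, both calls return the same string.
Short =:= Long.
```

The tuple form is convenient when a flag comes from configuration, since {transitional, Bool} can be built without branching on the value.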

idna_flags/0

-type idna_flags() :: [idna_flag()].

List of IDNA processing options.

label/0

-type label() :: string().

A single label (part between dots) of a domain name.
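To illustrate how domain() and label() relate, the sketch below splits a domain on the ASCII period and converts each label with alabel/1. It is only an illustration of the two types: it is not claimed to be how encode/1 works internally, and it ignores the other separators and mapping steps that encode/2 may handle.

```erlang
%% Split a domain into its labels on the ASCII period.
Labels = string:split("βόλος.com", ".", all).

%% Convert every label to its A-label form; ASCII labels come back unchanged.
ALabels = [idna:alabel(L) || L <- Labels].

%% Re-join the labels into a single domain string.
AsciiDomain = lists:flatten(lists:join(".", ALabels)).
```

For everyday use, encode/1 is the intended entry point. Working label by label is mainly useful when individual labels need separate validation or error reporting.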

Functions

alabel(Label)

-spec alabel(Label) -> ALabel when Label :: label(), ALabel :: label().

Convert a label to its ASCII-compatible encoding (A-label).

Takes a Unicode label and returns its Punycode-encoded form with the "xn--" ACE prefix. If the label is already ASCII, it is validated and returned as-is.

Examples

  "xn--nxasmq5b" = idna:alabel("βόλος").
  "example" = idna:alabel("example").

Exits with {bad_label, Reason} if the label is invalid.
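Because invalid labels are reported as exits rather than return values, callers that prefer tagged tuples can wrap the call. A minimal sketch follows; SafeALabel is a hypothetical helper name and not part of idna.

```erlang
%% Turn the {bad_label, Reason} exit into an {error, Reason} tuple.
%% SafeALabel is a hypothetical helper, not an idna function.
SafeALabel = fun(Label) ->
    try idna:alabel(Label) of
        ALabel -> {ok, ALabel}
    catch
        exit:{bad_label, Reason} -> {error, Reason}
    end
end.

{ok, "xn--nxasmq5b"} = SafeALabel("βόλος").
{ok, "example"} = SafeALabel("example").
```

Only the documented {bad_label, Reason} exit is caught, so unrelated failures still propagate to the caller.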

check_context(Label)

-spec check_context(Label) -> ok when Label :: label().

Check contextual rules for characters in a label.

Validates that all characters in the label are either PVALID (protocol valid) or pass their contextual rules (CONTEXTJ/CONTEXTO) as defined in RFC 5892.

Exits with {bad_label, {context, Reason}} if validation fails.

check_hyphen(Label)

-spec check_hyphen(Label) -> ok when Label :: label().

Check that a label conforms to hyphen placement rules.

Validates that the label does not have hyphens in both the 3rd and 4th positions (a "--" there would indicate an ACE prefix such as "xn--") and that it does not start or end with a hyphen.

See RFC 5891 Section 4.2.3.1.

Exits with {bad_label, {hyphen, Reason}} if validation fails.
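The rules above can be exercised with a small boolean wrapper. This is a sketch: HyphensOk is a hypothetical helper name, and the expected results follow from the rules as described here rather than from a run against a specific release.

```erlang
%% Return true when the label passes the hyphen rules, false otherwise.
%% HyphensOk is a hypothetical helper, not an idna function.
HyphensOk = fun(Label) ->
    try idna:check_hyphen(Label) of
        ok -> true
    catch
        exit:{bad_label, {hyphen, _Reason}} -> false
    end
end.

true = HyphensOk("my-label").    %% hyphen in the middle is allowed
false = HyphensOk("-leading").   %% starts with a hyphen
false = HyphensOk("trailing-").  %% ends with a hyphen
false = HyphensOk("ab--cd").     %% hyphens in positions 3 and 4
```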

check_initial_combiner(Label)

-spec check_initial_combiner(Label) -> ok when Label :: label().

Check that a label does not begin with a combining mark.

Validates that the label does not start with a combining character as required by RFC 5891 Section 4.2.3.2.

Exits with {bad_label, {initial_combiner, Reason}} if validation fails.

check_label(Label)

-spec check_label(Label) -> ok when Label :: label().

Validate a domain label with default settings.

Equivalent to check_label(Label, true, true, true).

Performs all IDNA validation checks: NFC normalization, hyphen rules, initial combiner, context rules, and BIDI rules.

check_label(Label, CheckHyphens, CheckJoiners, CheckBidi)

-spec check_label(Label, CheckHyphens, CheckJoiners, CheckBidi) -> ok
                     when
                         Label :: label(),
                         CheckHyphens :: boolean(),
                         CheckJoiners :: boolean(),
                         CheckBidi :: boolean().

Validate a domain label with configurable checks.

Validates that a label conforms to IDNA requirements. The following checks can be enabled or disabled:

  • CheckHyphens - Check hyphen placement rules
  • CheckJoiners - Check CONTEXTJ/CONTEXTO rules
  • CheckBidi - Check bidirectional text rules (RFC 5893)

NFC normalization and initial combiner checks are always performed.

Exits with {bad_label, {Reason, Message}} if validation fails.
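As a sketch of selectively disabling checks, the call below keeps the hyphen and joiner checks but skips the BIDI check. RelaxedCheck is a hypothetical helper name, and whether skipping BIDI validation is appropriate depends on the application.

```erlang
%% Validate a label with hyphen and joiner checks on, BIDI check off.
ok = idna:check_label("example", true, true, false).

%% Hypothetical helper that reports failures as tuples instead of exits.
RelaxedCheck = fun(Label) ->
    try idna:check_label(Label, true, true, false) of
        ok -> ok
    catch
        exit:{bad_label, {Reason, Message}} -> {error, Reason, Message}
    end
end.

ok = RelaxedCheck("example").
```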

check_label_length(Label)

-spec check_label_length(Label) -> ok when Label :: label().

Check that a label does not exceed the maximum length.

Labels in DNS are limited to 63 octets. This function validates that the label length does not exceed this limit.

Exits with {bad_label, {too_long, Reason}} if validation fails.
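A short sketch of the boundary, using plain ASCII labels so that characters and octets coincide:

```erlang
%% A 63-character ASCII label is at the limit; 64 characters is one over.
AtLimit = lists:duplicate(63, $a).
OverLimit = lists:duplicate(64, $a).

ok = idna:check_label_length(AtLimit).

%% The over-long label is expected to exit with a too_long reason.
too_long =
    try idna:check_label_length(OverLimit)
    catch exit:{bad_label, {too_long, _Reason}} -> too_long
    end.
```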

check_nfc(Label)

-spec check_nfc(Label) -> ok when Label :: label().

Check that a label is in Unicode Normalization Form C (NFC).

Validates that the label is properly normalized according to RFC 5891 Section 4.2.1.

Exits with {bad_label, {nfc, Reason}} if validation fails.
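Input that may arrive in decomposed form can be normalized before validation. The sketch below uses unicode:characters_to_nfc_list/1 from the Erlang standard library (OTP 20 or later); pairing it with check_nfc/1 in this way is a suggestion, not something this module requires.

```erlang
%% "e" followed by U+0301 COMBINING ACUTE ACCENT is not in NFC.
Decomposed = [$e, 16#0301].

%% Normalizing composes the pair into the single code point U+00E9.
Composed = unicode:characters_to_nfc_list(Decomposed).

ok = idna:check_nfc(Composed).

%% The decomposed form is expected to exit with an nfc reason.
not_nfc =
    try idna:check_nfc(Decomposed)
    catch exit:{bad_label, {nfc, _Reason}} -> not_nfc
    end.
```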

decode(AsciiDomain)

-spec decode(AsciiDomain) -> Domain when AsciiDomain :: domain(), Domain :: domain().

Decode an ASCII domain name to Unicode using the IDNA protocol.

Equivalent to decode(AsciiDomain, []).

decode(AsciiDomain, Options)

-spec decode(AsciiDomain, Options) -> Domain
                when AsciiDomain :: domain(), Options :: idna_flags(), Domain :: domain().

Decode an ASCII domain name to Unicode using the IDNA protocol with options.

Converts an ASCII-compatible encoding (ACE) domain name back to its Unicode representation.

Options

Same options as encode/2.

Examples

  %% Basic decoding
  "βόλος.com" = idna:decode("xn--nxasmq5b.com").
 
  %% Decode with UTS #46 processing
  "faß.de" = idna:decode("xn--fa-hia.de", [uts46]).

encode(Domain)

-spec encode(Domain) -> AsciiDomain when Domain :: domain(), AsciiDomain :: domain().

Encode a domain name to ASCII using the IDNA protocol.

Equivalent to encode(Domain, []).

encode(Domain, Options)

-spec encode(Domain, Options) -> AsciiDomain
                when Domain :: domain(), Options :: idna_flags(), AsciiDomain :: domain().

Encode a domain name to ASCII using the IDNA protocol with options.

Converts an Internationalized Domain Name to its ASCII-compatible encoding (ACE) form using Punycode.

Options

  • uts46 - Enable UTS #46 compatibility processing. This maps characters according to the IDNA Mapping Table before encoding.
  • std3_rules - Enforce STD3 ASCII rules (disallow certain characters).
  • transitional - Use transitional processing for backward compatibility with IDNA 2003. For example, maps ß to ss.
  • strict - Only use ASCII period (.) as label separator instead of also accepting fullwidth and ideographic periods.

Examples

  %% Basic encoding
  "xn--nxasmq5b.com" = idna:encode("βόλος.com").
 
  %% With UTS #46 processing
  "xn--fa-hia.de" = idna:encode("faß.de", [uts46]).
 
  %% With transitional processing (ß -> ss)
  "fass.de" = idna:encode("faß.de", [uts46, transitional]).

from_ascii(AsciiDomain)

This function is deprecated. Use decode/1 instead.
-spec from_ascii(AsciiDomain) -> Domain when AsciiDomain :: domain(), Domain :: domain().

Decode an ASCII domain name to Unicode (compatibility API).

This function is provided for backward compatibility. It is equivalent to decode/1.

to_ascii(Domain)

This function is deprecated. Use encode/1 instead.
-spec to_ascii(Domain) -> AsciiDomain when Domain :: domain(), AsciiDomain :: domain().

Encode a domain name to ASCII (compatibility API).

This function is provided for backward compatibility with older IDNA libraries. It is equivalent to encode/1.

to_unicode(AsciiDomain)

This function is deprecated. Use decode/1 instead.
-spec to_unicode(AsciiDomain) -> Domain when AsciiDomain :: domain(), Domain :: domain().

Decode an ASCII domain name to Unicode (compatibility API).

This function is provided for backward compatibility with older IDNA libraries. It is equivalent to decode/1.

ulabel(ALabel)

-spec ulabel(ALabel) -> Label when ALabel :: label(), Label :: label().

Convert a label to its Unicode form (U-label).

Takes an ASCII label (potentially Punycode-encoded with "xn--" prefix) and returns its Unicode representation. The result is validated against IDNA rules.

Examples

  "βόλος" = idna:ulabel("xn--nxasmq5b").
  "example" = idna:ulabel("example").

Exits with {bad_label, Reason} if the label is invalid.

utf8_to_ascii(Utf8Domain)

This function is deprecated. Use encode/1 with a proper Unicode string instead.
-spec utf8_to_ascii(Utf8Domain) -> AsciiDomain
                       when Utf8Domain :: binary() | string(), AsciiDomain :: domain().

Convert a UTF-8 binary domain to ASCII.

Converts the UTF-8 encoded domain to a Unicode string first, then encodes it to ASCII.
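Since the function is deprecated in favor of encode/1, the same two steps can be written out explicitly. A minimal sketch, using unicode:characters_to_list/2 from the Erlang standard library for the conversion step (error handling for invalid UTF-8 is left out):

```erlang
%% A UTF-8 encoded binary, for example as read from a socket or file.
Utf8 = <<"βόλος.com"/utf8>>.

%% Step 1: decode the UTF-8 bytes into a list of Unicode code points.
Domain = unicode:characters_to_list(Utf8, utf8).

%% Step 2: encode the Unicode string with the non-deprecated API.
"xn--nxasmq5b.com" = idna:encode(Domain).
```

Note that unicode:characters_to_list/2 returns an {error, ...} or {incomplete, ...} tuple for malformed input, so production code should match on the result before passing it to encode/1.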