Phase 2 validator for email-address candidates.
Takes a scanner candidate span and validates it as an email address
per RFC 5322 §3.4.1 (dot-atom local part) and RFC 6531 (SMTPUTF8 /
EAI — internationalised local parts) when the :eai option is on.
The host part is validated identically to URL hosts: per-label IDNA
via Unicode.IDNA.to_ascii/2, then TLD lookup against the bundled
IANA list.
Returns a structured %{} record on success or {:error, reason} on
rejection.
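To make the dot-atom rule concrete, here is a minimal sketch of an ASCII-only local-part check following the RFC 5322 grammar (one or more atext characters, optionally repeated with single dots between runs). The module name DotAtomSketch and the function valid_local?/1 are hypothetical and are not part of this library; the library's actual implementation may differ, and this sketch does not cover the non-ASCII local parts permitted when :eai is on.

```elixir
defmodule DotAtomSketch do
  # Hypothetical helper, not part of Text.Extract.Email.
  # atext (RFC 5322 §3.2.3): ASCII letters, digits, and !#$%&'*+-/=?^_`{|}~
  # dot-atom-text: 1*atext *("." 1*atext)
  @dot_atom ~r/\A[A-Za-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[A-Za-z0-9!#$%&'*+\/=?^_`{|}~-]+)*\z/

  @spec valid_local?(String.t()) :: boolean()
  def valid_local?(local) when is_binary(local) do
    # Rejects empty input, leading or trailing dots, and consecutive dots.
    Regex.match?(@dot_atom, local)
  end
end
```

Under this sketch, "alice" and "first.last+tag" pass, while ".alice", "alice.", and "a..b" are rejected.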
Types
@type email_record() :: %{
  email: String.t(),
  ascii: String.t(),
  span: {non_neg_integer(), non_neg_integer()},
  local: String.t(),
  host: String.t(),
  ascii_host: String.t()
}
Parsed email record.
@type reason() ::
  :empty
  | :no_local
  | :no_host
  | :invalid_local
  | :invalid_host
  | :idna_failed
  | :invalid_tld
Reasons for rejecting an email candidate.
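As an illustration of how a caller might consume these atoms, the sketch below maps each reason to a human-readable message. The module name ReasonSketch and the message wording are assumptions made for this example; the source does not define the precise meaning of each atom, so the descriptions are a plausible reading of the names rather than documented behaviour.

```elixir
defmodule ReasonSketch do
  # Hypothetical helper, not part of Text.Extract.Email.
  # Message texts are interpretive guesses based on the atom names.
  @spec describe(atom()) :: String.t()
  def describe(:empty), do: "the candidate is empty"
  def describe(:no_local), do: "the address has no local part before the @"
  def describe(:no_host), do: "the address has no host part after the @"
  def describe(:invalid_local), do: "the local part is not valid"
  def describe(:invalid_host), do: "the host part is not valid"
  def describe(:idna_failed), do: "IDNA conversion of the host failed"
  def describe(:invalid_tld), do: "the top-level domain was not accepted"
  # Fallback so reasons added later do not crash the caller.
  def describe(other), do: "unrecognised reason: #{inspect(other)}"
end
```

A caller could then pattern match on {:error, reason} and pass reason to describe/1 when logging rejected candidates.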
Functions
@spec validate(String.t(), {non_neg_integer(), non_neg_integer()}, keyword()) ::
  {:ok, email_record()} | {:error, reason()}
Validates an email candidate span.
Arguments
candidate is the candidate substring as emitted by Text.Extract.Scanner.scan/1.
span is the {start_byte, length_bytes} tuple positioning candidate within the original source text, preserved through to the returned record's :span field.
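Because the span is expressed in bytes rather than graphemes, offsets shift whenever multi-byte characters precede the candidate. The following sketch uses only Elixir and Erlang standard-library calls to show how a {start_byte, length_bytes} tuple relates to the source text; the sample source string is invented for illustration, and the scanner itself is not invoked.

```elixir
# "caf\u00e9 " is 5 graphemes but 6 bytes: U+00E9 takes 2 bytes in UTF-8.
source = "caf\u00e9 alice@example.com"
candidate = "alice@example.com"

# :binary.match/2 returns {start_byte, length_bytes} for the first match.
{start, len} = :binary.match(source, candidate)
span = {start, len}

# binary_part/3 recovers the candidate from the source using the byte span.
^candidate = binary_part(source, start, len)

IO.inspect(span)
# prints {6, 17}
```

A grapheme-based offset (5 in this case) would slice the wrong bytes, which is why the span is carried through unchanged to the returned record.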
Options
:eai - when true, allow non-ASCII codepoints in the local part per RFC 6531 SMTPUTF8. Default true.
:tld_mode - :iana (default) or :any.
:strict_idn - when true, IDNA uses STD3 ASCII rules. Default false.
Returns
{:ok, record} on success.
{:error, reason} on rejection.
Examples
iex> {:ok, r} = Text.Extract.Email.validate("alice@example.com", {0, 17})
iex> {r.local, r.host, r.span}
{"alice", "example.com", {0, 17}}
iex> Text.Extract.Email.validate("alice@example.fake", {0, 18})
{:error, :invalid_tld}