A high-performance HTML-to-text converter implemented as a Rust NIF.
Two conversion modes are available:
convert/2: plain text with markdown-like decorations (**bold**, *italic*, link footnotes)
convert_rich/2: structured {text, annotations} tuples for building custom renderers (Slack, Discord, etc.)
Additionally, HTML2Text.HTML is a container struct whose Inspect protocol renders HTML as
formatted text with ANSI styles directly in IEx.
HTML container
Wrap HTML in HTML2Text.HTML.new/1 to get readable output when inspecting data structures:
email = %{subject: "Welcome", body: HTML2Text.HTML.new("<p>Hello <strong>world</strong></p>")}
In IEx this prints as:
%{subject: "Welcome", body: #HTML2Text.HTML<Hello **world**>}
Bold, italic, links (clickable in supported terminals), code, strikeout, and CSS
colours are rendered with ANSI escape sequences. to_string/1 returns the original HTML.
Types
@type annotation() ::
  :default
  | :emphasis
  | :strong
  | :strikeout
  | :code
  | {:link, url :: String.t()}
  | {:image, src :: String.t()}
  | {:preformat, continuation :: boolean()}
  | {:colour, {r :: non_neg_integer(), g :: non_neg_integer(), b :: non_neg_integer()}}
  | {:bg_colour, {r :: non_neg_integer(), g :: non_neg_integer(), b :: non_neg_integer()}}
@type line() :: [segment()]
@type opts() :: [
  width: pos_integer() | :infinity,
  decorate: boolean(),
  link_footnotes: boolean(),
  table_borders: boolean(),
  pad_block_width: boolean(),
  allow_width_overflow: boolean(),
  min_wrap_width: pos_integer(),
  raw: boolean(),
  wrap_links: boolean(),
  unicode_strikeout: boolean(),
  empty_img_mode: :ignore | {:replace, String.t()} | :filename
]
@type rich_opts() :: [
  width: pos_integer() | :infinity,
  table_borders: boolean(),
  pad_block_width: boolean(),
  allow_width_overflow: boolean(),
  min_wrap_width: pos_integer(),
  raw: boolean(),
  wrap_links: boolean(),
  empty_img_mode: :ignore | {:replace, String.t()} | :filename,
  use_doc_css: boolean(),
  css: String.t()
]
@type segment() :: {text :: String.t(), annotations :: [annotation()]}
Functions
@spec convert(html :: String.t(), opts()) :: {:ok, text :: String.t()} | {:error, reason :: String.t()}
Converts HTML content to plain text.
Options
:width - Maximum line width (positive integer or :infinity). Defaults to 80. Setting it to :infinity disables line wrapping and outputs the entire text on a single line.
:decorate - Enables text decorations such as bold or italic. Boolean, defaults to true. When false, output is plain text without styling.
:link_footnotes - Adds numbered link footnotes at the end of the text. Boolean, defaults to true. When false, the footnotes are omitted and only the bracketed link text remains.
:table_borders - Shows borders around table cells. Boolean, defaults to true. When false, tables render without borders.
:pad_block_width - Pads blocks with spaces to align text to the full width. Boolean, defaults to false. Useful for fixed-width layouts.
:allow_width_overflow - Allows lines to exceed the specified width if wrapping is impossible. Boolean, defaults to false. Prevents errors when content can't fit.
:min_wrap_width - Minimum length of text chunks when wrapping lines. Integer ≥ 1, defaults to 3. Helps avoid awkwardly narrow wraps.
:raw - Enables raw mode, which applies minimal processing and formatting. Boolean, defaults to false.
:wrap_links - Wraps long URLs or links onto multiple lines. Boolean, defaults to true. When false, links stay on a single line and may overflow.
:unicode_strikeout - Uses Unicode characters for strikeout text. Boolean, defaults to true. When false, strikeout renders in simpler styles.
:empty_img_mode - Controls how images without alt text are rendered. Accepts :ignore (skip images without alt text, default), {:replace, text} (replace with static text like "[image]"), or :filename (use the image filename from the URL).
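Because the options are an ordinary keyword list, a project can keep one set of defaults and merge per-call overrides onto it. The following is a minimal sketch, assuming a hypothetical helper module `MyApp.TextOpts` that is not part of HTML2Text; only the option keys are taken from the list above, and the default values chosen here are illustrative.

```elixir
defmodule MyApp.TextOpts do
  # Hypothetical project-level defaults; the keys are options documented above.
  @defaults [width: 72, link_footnotes: false, table_borders: false]

  # Keys listed in the opts() type for convert/2.
  @allowed [
    :width, :decorate, :link_footnotes, :table_borders, :pad_block_width,
    :allow_width_overflow, :min_wrap_width, :raw, :wrap_links,
    :unicode_strikeout, :empty_img_mode
  ]

  @doc "Merges overrides onto the defaults, rejecting unknown keys early."
  def build(overrides \\ []) do
    case Keyword.keys(overrides) -- @allowed do
      # Values in `overrides` win over the defaults.
      [] -> Keyword.merge(@defaults, overrides)
      unknown -> raise ArgumentError, "unknown options: #{inspect(unknown)}"
    end
  end
end
```

The result would then be passed as the second argument, for example `HTML2Text.convert(html, MyApp.TextOpts.build(width: :infinity))`.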
Examples
iex> html = "<h1>Title</h1><p>Some paragraph text.</p>"
...> HTML2Text.convert(html, width: 15)
{:ok, "# Title\n\nSome paragraph\ntext.\n"}
iex> HTML2Text.convert("<b>Important</b>", decorate: false)
{:ok, "Important\n"}
iex> HTML2Text.convert("<table><tr><td>A</td><td>B</td></tr></table>", [])
{:ok, "─┬─\nA│B\n─┴─\n"}
iex> HTML2Text.convert("<p><a href=\"https://example.com\">link</a></p>", link_footnotes: false)
{:ok, "[link]\n"}
Converts HTML content to plain text, raising on failure.
This function behaves like convert/2, but returns the text directly instead of an {:ok, text} tuple and raises if conversion fails.
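Which variant to call depends on how failures should be handled: the raising variant suits input that is expected to convert, while the tuple-returning convert/2 lets the caller degrade gracefully. As a rough sketch of the second approach, the hypothetical helper `MyApp.Preview.summarize/2` below (an assumption for illustration, not part of HTML2Text) takes whatever convert/2 returned and produces a one-line preview:

```elixir
defmodule MyApp.Preview do
  # `result` is the {:ok, text} | {:error, reason} tuple returned by convert/2.
  def summarize(result, max_len \\ 40)

  def summarize({:ok, text}, max_len) do
    text
    # Drop blank lines and keep only the first line of the converted text.
    |> String.split("\n", trim: true)
    |> List.first("")
    |> String.slice(0, max_len)
  end

  def summarize({:error, reason}, _max_len) when is_binary(reason) do
    # Surface the failure reason instead of raising.
    "(conversion failed: " <> reason <> ")"
  end
end
```

A call site might look like `html |> HTML2Text.convert(width: :infinity) |> MyApp.Preview.summarize()`.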
Examples
iex> HTML2Text.convert!("<p>hello</p>")
"hello\n"
iex> HTML2Text.convert!("<em>italic</em>")
"*italic*\n"
@spec convert_rich(html :: String.t(), rich_opts()) :: {:ok, [line()]} | {:error, reason :: String.t()}
Converts HTML content to annotated rich text.
Returns a list of lines, where each line is a list of {text, annotations} tuples.
Annotations are stacked: a text segment inside <a href="..."><strong> will have
[{:link, url}, :strong], with the outer annotation first.
Options
:width - Maximum line width (positive integer or :infinity). Defaults to 80.
:table_borders - Shows borders around table cells. Boolean, defaults to true.
:pad_block_width - Pads blocks with spaces to align text to the full width. Boolean, defaults to false.
:allow_width_overflow - Allows lines to exceed the specified width. Boolean, defaults to false.
:min_wrap_width - Minimum length of text chunks when wrapping. Integer ≥ 1, defaults to 3.
:raw - Enables raw mode with minimal processing. Boolean, defaults to false.
:wrap_links - Wraps long URLs onto multiple lines. Boolean, defaults to true.
:empty_img_mode - Controls how images without alt text are rendered. Accepts :ignore (default), {:replace, text}, or :filename.
:use_doc_css - Parses <style> tags from the HTML to extract colour annotations. Boolean, defaults to false.
:css - Additional CSS rules to apply. String, defaults to nil.
Annotations
:default - Normal text
:emphasis - <em> tag
:strong - <strong> / <b> tag
:strikeout - <s> / <del> tag
:code - <code> tag
{:link, url} - <a href="..."> tag
{:image, src} - <img src="..."> tag
{:preformat, bool} - <pre> block (true if continuation line)
{:colour, {r, g, b}} - CSS text color
{:bg_colour, {r, g, b}} - CSS background color
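The annotation list maps naturally onto a custom renderer. Below is a minimal sketch of one for Slack's mrkdwn syntax, assuming a hypothetical module `SlackSketch` (not part of HTML2Text) that takes the list of lines returned by convert_rich/2. It relies on the documented ordering, outer annotation first, and wraps the text from the innermost annotation outward. Colour, image and preformat annotations are simply passed through unstyled in this sketch.

```elixir
defmodule SlackSketch do
  # Renders a [line()] structure (as returned by convert_rich/2) as Slack mrkdwn.
  def render(lines) do
    lines
    |> Enum.map(&render_line/1)
    |> Enum.join("\n")
  end

  defp render_line(segments) do
    Enum.map_join(segments, "", fn {text, annotations} ->
      # Annotations arrive outer-first, so reverse them and wrap
      # the innermost one around the text first.
      annotations
      |> Enum.reverse()
      |> Enum.reduce(text, &wrap/2)
    end)
  end

  defp wrap(:strong, text), do: "*" <> text <> "*"
  defp wrap(:emphasis, text), do: "_" <> text <> "_"
  defp wrap(:strikeout, text), do: "~" <> text <> "~"
  defp wrap(:code, text), do: "`" <> text <> "`"
  defp wrap({:link, url}, text), do: "<" <> url <> "|" <> text <> ">"
  # Any other annotation leaves the text unchanged.
  defp wrap(_other, text), do: text
end
```

A real renderer would likely also need to escape `<`, `>` and `&` in the text and handle segments with leading or trailing whitespace, which this sketch does not attempt.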
Examples
iex> HTML2Text.convert_rich("<p>Hello <strong>world</strong></p>")
{:ok, [[{"Hello ", []}, {"world", [:strong]}]]}
iex> HTML2Text.convert_rich("<em>text</em>")
{:ok, [[{"text", [:emphasis]}]]}
iex> HTML2Text.convert_rich(~s(<a href="https://example.com">click</a>))
{:ok, [[{"click", [link: "https://example.com"]}]]}
iex> HTML2Text.convert_rich(~s(<a href="https://ex.com"><strong>bold link</strong></a>))
{:ok, [[{"bold link", [{:link, "https://ex.com"}, :strong]}]]}
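Since the rich output is plain nested lists and tuples, it can also be queried directly without rendering it. As an illustration, the hypothetical module `LinkSketch` below (an assumption for this example, not part of HTML2Text) collects every distinct link URL from a list of lines in order of first appearance:

```elixir
defmodule LinkSketch do
  # Collects each distinct {:link, url} target from a [line()] structure.
  def urls(lines) do
    for segments <- lines,
        {_text, annotations} <- segments,
        # Annotations that do not match {:link, url} are skipped.
        {:link, url} <- annotations,
        uniq: true,
        do: url
  end
end
```

Applied to the lines from the last example above, `LinkSketch.urls/1` would return `["https://ex.com"]`.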
Converts HTML content to annotated rich text, raising on failure.
This function behaves like convert_rich/2, but returns the list of lines directly instead of an {:ok, lines} tuple and raises if conversion fails.
Examples
iex> HTML2Text.convert_rich!("<p>hello</p>")
[[{"hello", []}]]
iex> HTML2Text.convert_rich!("<code>x = 1</code>")
[[{"x = 1", [:code]}]]