Module z_html_parse

Loosely tokenizes and generates parse trees for HTML 4.

Copyright © 2007 Mochi Media, Inc.; copyright 2018-2020 Maas-Maarten Zeeman

Authors: Bob Ippolito (bob@mochimedia.com).

Description

Loosely tokenizes and generates parse trees for HTML 4. Adapted by Maas-Maarten Zeeman.

Data Types

end_tag()

end_tag() = {end_tag, Name::binary()}

html_attr()

html_attr() = {binary(), binary()}

html_comment()

html_comment() = {comment, Comment::binary()}

html_data()

html_data() = {data, binary(), Whitespace::boolean()}

html_doctype()

html_doctype() = {doctype, [Doctype::any()]}

html_element()

html_element() = html_node() | html_comment() | html_nop() | pi_tag() | binary()

html_node()

html_node() = {binary(), [html_attr()], [html_element()]}

html_nop()

html_nop() = {nop, [html_element()]}

Special node used by the sanitizer for unwanted elements.

html_token()

html_token() = html_data() | start_tag() | end_tag() | pi_tag() | inline_html() | html_comment() | html_doctype()

inline_html()

inline_html() = {'=', binary()}

pi_tag()

pi_tag() = {pi, binary()} | {pi, Tag::binary(), [html_attr()]}

start_tag()

start_tag() = {start_tag, Name::binary(), [html_attr()], Singleton::boolean()}
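The tree types above are plain tuples and binaries, so they can be taken apart with ordinary pattern matching. The following is a minimal sketch, not part of z_html_parse: the module name html_tree_demo and the function text/1 are hypothetical, and the clauses simply follow the documented alternatives of html_element().

```erlang
-module(html_tree_demo).
-export([text/1]).

%% Collect the text content of an html_element() as one binary.
%% Hypothetical helper; each clause matches one documented alternative.
-spec text(term()) -> binary().
text(Bin) when is_binary(Bin) ->
    Bin;                                  % bare binary(): text content
text({comment, _Comment}) ->
    <<>>;                                 % html_comment(): no text
text({pi, _Body}) ->
    <<>>;                                 % pi_tag(), two-element form
text({pi, _Tag, _Attrs}) ->
    <<>>;                                 % pi_tag(), three-element form
text({nop, Children}) ->
    text_list(Children);                  % html_nop(): keep the children's text
text({_Name, _Attrs, Children}) when is_list(Children) ->
    text_list(Children).                  % html_node(): recurse into children

text_list(Elements) ->
    iolist_to_binary([text(E) || E <- Elements]).
```

The pi_tag() clauses come before the html_node() clause because the three-element pi_tag() form would otherwise match the generic three-tuple pattern.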

Function Index

escape/1        Escape a string so that it is safe to include in HTML text (&, < and > are escaped).
escape_attr/1   Escape a string so that it is safe to include in an HTML attribute value (&, <, > and " are escaped).
parse/1         Tokenize the input and then transform the token stream into an HTML tree.
parse_tokens/1  Transform the output of tokens(Doc) into an HTML tree.
to_html/1       Convert a list of html_token() or an html_node() tree to an HTML document.
to_tokens/1     Convert an html_node() tree to a list of tokens.
tokens/1        Transform the input UTF-8 HTML into a token stream.

Function Details

escape/1

escape(B::string() | atom() | binary()) -> binary()

Escape a string so that it is safe to include in HTML text: &, < and > are replaced by the entities &amp;, &lt; and &gt;.

escape_attr/1

escape_attr(B::string() | binary() | atom() | integer() | float()) -> binary()

Escape a string so that it is safe to include in an HTML attribute value: &, <, > and " are replaced by the entities &amp;, &lt;, &gt; and &quot;.
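The difference between the two escape functions is the handling of the double quote. The sketch below illustrates that difference with a small stand-alone module; it is an illustration of the technique, not the library's implementation, and the names escape_demo, escape_text/1, escape_attr_value/1 and via_library/1 are hypothetical. Only via_library/1 calls the documented functions, and it assumes z_html_parse is on the code path.

```erlang
-module(escape_demo).
-export([escape_text/1, escape_attr_value/1, via_library/1]).

%% Illustrative re-implementation: escape &, < and > for use in HTML text.
escape_text(Bin) when is_binary(Bin) ->
    << <<(text_char(C))/binary>> || <<C>> <= Bin >>.

%% Illustrative re-implementation: additionally escape " for attribute values.
escape_attr_value(Bin) when is_binary(Bin) ->
    << <<(attr_char(C))/binary>> || <<C>> <= Bin >>.

text_char($&) -> <<"&amp;">>;
text_char($<) -> <<"&lt;">>;
text_char($>) -> <<"&gt;">>;
text_char(C)  -> <<C>>.                   % all other bytes pass through unchanged

attr_char($") -> <<"&quot;">>;
attr_char(C)  -> text_char(C).

%% Calls the documented functions; requires z_html_parse to be loaded.
via_library(Bin) ->
    {z_html_parse:escape(Bin), z_html_parse:escape_attr(Bin)}.
```

Use the attribute variant whenever the result is placed between double quotes in a tag; the text variant leaves quotes untouched.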

parse/1

parse(Input::iodata()) -> {ok, html_node()} | {error, nohtml}

Tokenize the input and then transform the token stream into an HTML tree. Returns {error, nohtml} when no tree can be built from the input.
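Because parse/1 returns a tagged result rather than a bare tree, callers need to handle both outcomes. A minimal sketch, assuming only the return type documented above; the module parse_demo and its functions are hypothetical names.

```erlang
-module(parse_demo).
-export([root_tag/1, root_tag_of/1]).

%% Interpret the documented return value of parse/1 and report the
%% name of the root element.
root_tag({ok, {Name, _Attrs, _Children}}) when is_binary(Name) ->
    {ok, Name};
root_tag({error, nohtml}) ->
    {error, nohtml}.

%% Hypothetical convenience wrapper; requires z_html_parse to be loaded.
root_tag_of(Html) ->
    root_tag(z_html_parse:parse(Html)).
```

Matching on {ok, Tree} explicitly makes the nohtml case visible at the call site instead of failing later with a badmatch deep inside tree-walking code.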

parse_tokens/1

parse_tokens(Tokens::[html_token()]) -> {ok, html_node()} | {error, nohtml}

Transform the output of tokens(Doc) into an HTML tree.

to_html/1

to_html(Node::[html_token()] | html_node()) -> iolist()

Convert a list of html_token() or an html_node() tree to an HTML document, returned as an iolist.

to_tokens/1

to_tokens(T::html_node()) -> [html_token()]

Convert an html_node() tree to a list of tokens.
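One possible use of to_tokens/1 together with to_html/1 is to rewrite a document at the token level and then serialize it again. The sketch below renames a tag in a token list; the module roundtrip_demo and both function names are hypothetical, and rename_and_render/3 assumes z_html_parse is on the code path.

```erlang
-module(roundtrip_demo).
-export([rename_tag/3, rename_and_render/3]).

%% Replace the tag name From by To in start and end tags.
%% All other tokens are passed through unchanged.
rename_tag(From, To, Tokens) ->
    lists:map(
      fun({start_tag, Name, Attrs, Singleton}) when Name =:= From ->
              {start_tag, To, Attrs, Singleton};
         ({end_tag, Name}) when Name =:= From ->
              {end_tag, To};
         (Other) ->
              Other
      end,
      Tokens).

%% Tree -> tokens -> rewritten tokens -> HTML binary.
%% Uses the documented to_tokens/1 and to_html/1.
rename_and_render(From, To, Tree) ->
    Tokens = z_html_parse:to_tokens(Tree),
    iolist_to_binary(z_html_parse:to_html(rename_tag(From, To, Tokens))).
```

Since to_html/1 is documented to return an iolist, iolist_to_binary/1 is applied when a single binary is wanted.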

tokens/1

tokens(Input::StringOrBinary) -> [html_token()]

Transform the input UTF-8 HTML into a token stream.
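A token stream is often enough when only a flat scan is needed and no tree structure is required. The following sketch collects link targets from start_tag() tokens. The names token_demo, hrefs/1 and hrefs_in/1 are hypothetical, and the code assumes tag and attribute names appear in lowercase in the token stream, which is not stated in this documentation.

```erlang
-module(token_demo).
-export([hrefs/1, hrefs_in/1]).

%% Collect the href attribute values of all <a> start tags.
%% Assumption: names are lowercase binaries in the token stream.
hrefs(Tokens) ->
    [Href || {start_tag, <<"a">>, Attrs, _Singleton} <- Tokens,
             {<<"href">>, Href} <- Attrs].

%% Hypothetical wrapper around the documented tokens/1;
%% requires z_html_parse to be loaded.
hrefs_in(Html) ->
    hrefs(z_html_parse:tokens(Html)).
```

Tokens that do not match the generator pattern, such as data, comments and end tags, are skipped by the list comprehension.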

