Module z_html

Utility functions for html processing.

Copyright © 2009-2020 Marc Worrell

Authors: Marc Worrell (marc@worrell.nl).

Description

Utility functions for html processing. Also used for property filtering (by m_rsc_update).

Data Types

maybe_binary()

maybe_binary() = undefined | binary()

maybe_iodata()

maybe_iodata() = undefined | iodata()

maybe_text()

maybe_text() = undefined | text()

sanitize_option()

sanitize_option() = {elt_extra, [binary()]} | {attr_extra, [binary()]}

sanitize_options()

sanitize_options() = [sanitize_option()]

text()

text() = iodata() | {trans, [{atom(), binary()}]}

Function Index

abs_links/2: Make all links (href/src) in the HTML absolute to the base URL. This takes a shortcut by checking all ' (src|href)=".."'.
br2nl/1: Translate any HTML <br> elements to newlines.
ensure_escaped_amp/1: Ensure that &-characters are properly escaped inside an HTML string.
escape/1: Escape a string so that it is valid within HTML/XML.
escape_check/1: Escape a string so that it is valid within HTML/XML.
escape_html_comment/2: Escape less-than and greater-than characters (for use in comments).
escape_html_text/2: Escape less-than, greater-than, single-quote and double-quote characters in texts (& is already removed or escaped).
escape_link/1: Escape a text.
escape_props/1: Escape all properties used for an update statement.
escape_props/2
escape_props_check/1: Check if all properties are properly escaped.
escape_props_check/2
flatten_attr/1: Flatten an attribute to a binary, filter URLs and CSS.
nl2br/1: Translate any newlines to HTML <br> elements.
noscript/1: Filter a URL, remove any "javascript:" and "data:" (as data can be text/html).
sanitize/1: Sanitize a (X)HTML string.
sanitize/2
sanitize/4: Sanitize a mochiweb parse tree.
sanitize_uri/1: Ensure that a URI is (quite) harmless by removing any script reference.
scrape_link_elements/1: Given an HTML list, scrape all <link> elements and return their attributes.
strip/1: Strip all HTML elements from the text.
truncate/2: Truncate a previously sanitized HTML string.
truncate/3
unescape/1: Unescape - reverses the effect of escape.

Function Details

abs_links/2

abs_links(Html::maybe_iodata(), Base::binary()) -> iodata()

Make all links (href/src) in the HTML absolute to the base URL. This takes a shortcut by checking all ' (src|href)=".."'.

br2nl/1

br2nl(B::maybe_text()) -> maybe_text()

Translate any HTML <br> elements to newlines.

ensure_escaped_amp/1

ensure_escaped_amp(B::maybe_binary()) -> binary()

Ensure that &-characters are properly escaped inside an HTML string.

escape/1

escape(L::maybe_text()) -> maybe_text()

Escape a string so that it is valid within HTML/XML.
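To illustrate what this kind of escaping involves, here is a minimal sketch in plain Erlang. The module name escape_sketch and the exact entity choices (for example &#39; for the single quote) are assumptions made for illustration; this is not the z_html implementation, and it does not handle the {trans, ...} form of text().

```erlang
-module(escape_sketch).
-export([escape/1]).

%% Hypothetical sketch, not the z_html source.
%% Replaces the five characters that are special in HTML/XML text and
%% attribute values. 'undefined' is passed through unchanged, following
%% the maybe_text() convention of this module.
-spec escape(undefined | iodata()) -> undefined | binary().
escape(undefined) ->
    undefined;
escape(Text) ->
    Bin = iolist_to_binary(Text),
    %% Walk the binary byte by byte; multi-byte UTF-8 sequences never
    %% contain these ASCII values, so they pass through untouched.
    << <<(escape_char(C))/binary>> || <<C>> <= Bin >>.

escape_char($&) -> <<"&amp;">>;
escape_char($<) -> <<"&lt;">>;
escape_char($>) -> <<"&gt;">>;
escape_char($") -> <<"&quot;">>;
escape_char($') -> <<"&#39;">>;
escape_char(C)  -> <<C>>.
```

With this sketch, escape_sketch:escape(<<"a <b> & c">>) yields <<"a &lt;b&gt; &amp; c">>.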

escape_check/1

escape_check(L::maybe_text()) -> maybe_text()

Escape a string so that it is valid within HTML/XML.

escape_html_comment/2

escape_html_comment(X1, Acc) -> any()

Escape less-than and greater-than characters (for use in comments).

escape_html_text/2

escape_html_text(X1, Acc) -> any()

Escape less-than, greater-than, single-quote and double-quote characters in texts (& is already removed or escaped).

escape_link/1

escape_link(Text::maybe_iodata()) -> maybe_binary()

Escape a text. Expands any urls to links with a nofollow attribute.
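The combination of escaping and link expansion can be sketched as follows. This is a hedged illustration only: the module name escape_link_sketch, the URL pattern and the exact markup produced are assumptions, and the real escape_link/1 may recognise URLs differently.

```erlang
-module(escape_link_sketch).
-export([escape_link/1]).

%% Hypothetical sketch, not the z_html source.
%% First escape the text, then wrap anything that looks like an
%% http(s) URL in an anchor with rel="nofollow".
-spec escape_link(undefined | iodata()) -> undefined | binary().
escape_link(undefined) ->
    undefined;
escape_link(Text) ->
    Escaped = escape(iolist_to_binary(Text)),
    %% In the replacement, '&' stands for the whole matched URL.
    re:replace(Escaped,
               <<"https?://[^\\s<>\"]+">>,
               <<"<a href=\"&\" rel=\"nofollow\">&</a>">>,
               [global, {return, binary}]).

%% Minimal escaping of the characters that matter for this sketch.
escape(Bin) ->
    << <<(escape_char(C))/binary>> || <<C>> <= Bin >>.

escape_char($&) -> <<"&amp;">>;
escape_char($<) -> <<"&lt;">>;
escape_char($>) -> <<"&gt;">>;
escape_char($") -> <<"&quot;">>;
escape_char(C)  -> <<C>>.
```

Escaping before linking matters: done the other way round, the generated <a> tag itself would be escaped and shown as literal text.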

escape_props/1

escape_props(Props::list() | map()) -> list() | map()

Escape all properties used for an update statement. Only leaves the body property intact.

escape_props/2

escape_props(Props::list() | map(), Options::list()) -> list() | map()

escape_props_check/1

escape_props_check(Props::list() | map()) -> list() | map()

Checks if all properties are properly escaped.

escape_props_check/2

escape_props_check(Props::list() | map(), Options::list()) -> list() | map()

flatten_attr/1

flatten_attr(X1) -> any()

Flatten an attribute to a binary, filter URLs and CSS.

nl2br/1

nl2br(B::maybe_text()) -> maybe_text()

Translate any newlines to HTML <br> elements.
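The newline and <br> conversions (nl2br/1 here, br2nl/1 earlier) are mirror images of each other. A minimal sketch of both directions, assuming binary input after flattening and a "<br />" output form; the module name nlbr_sketch is hypothetical and the code is not taken from z_html:

```erlang
-module(nlbr_sketch).
-export([nl2br/1, br2nl/1]).

%% Hypothetical sketch, not the z_html source.

%% Replace every line break with a <br /> element.
-spec nl2br(undefined | iodata()) -> undefined | binary().
nl2br(undefined) ->
    undefined;
nl2br(Text) ->
    Bin = iolist_to_binary(Text),
    %% Normalise CRLF first so each line break yields exactly one <br />.
    Bin1 = binary:replace(Bin, <<"\r\n">>, <<"\n">>, [global]),
    binary:replace(Bin1, <<"\n">>, <<"<br />">>, [global]).

%% Replace <br>, <br/> and <br /> (in any letter case) with a newline.
-spec br2nl(undefined | iodata()) -> undefined | binary().
br2nl(undefined) ->
    undefined;
br2nl(Html) ->
    Bin = iolist_to_binary(Html),
    re:replace(Bin, <<"<br\\s*/?>">>, <<"\n">>,
               [global, caseless, {return, binary}]).
```

The two functions are not exact inverses: br2nl/1 in this sketch maps several spellings of the element to the same newline, so a round trip normalises the markup.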

noscript/1

noscript(Url) -> any()

Filter a URL, remove any "javascript:" and "data:" (as data can be text/html).
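A scheme filter of this kind has to look past letter case and embedded whitespace, since browsers tolerate both in a URL scheme. The following is a minimal sketch under stated assumptions: the module name noscript_sketch is hypothetical, and returning an empty binary for a rejected URL is a choice made for this example, not necessarily what z_html returns.

```erlang
-module(noscript_sketch).
-export([noscript/1]).

%% Hypothetical sketch, not the z_html source.
%% Returns <<>> for URLs with a "javascript:" or "data:" scheme and
%% the original URL (as a binary) otherwise.
-spec noscript(iodata()) -> binary().
noscript(Url) ->
    Bin = iolist_to_binary(Url),
    case normalize(Bin) of
        <<"javascript:", _/binary>> -> <<>>;
        <<"data:", _/binary>> -> <<>>;
        _ -> Bin
    end.

%% Drop spaces and control characters (bytes =< 32) and lowercase
%% ASCII letters, so " Java\tScript:" compares equal to "javascript:".
normalize(Bin) ->
    << <<(lower(C))>> || <<C>> <= Bin, C > 32 >>.

lower(C) when C >= $A, C =< $Z -> C + 32;
lower(C) -> C.
```

The normalised copy is only used for the comparison; a URL that passes the check is returned unchanged.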

sanitize/1

sanitize(Html::maybe_text()) -> maybe_text()

Sanitize a (X)HTML string. Remove elements and attributes that might be harmful.

sanitize/2

sanitize(Html::maybe_text(), Options::sanitize_options()) -> maybe_text()
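As a usage sketch for the options argument: the sanitize_options() type above allows elt_extra and attr_extra entries, each carrying a list of binaries. The example assumes, going by the option names, that these extend the set of elements and attributes that survive sanitization; the module name and the chosen element and attribute names are hypothetical, and no output is shown because it depends on the library.

```erlang
-module(sanitize_opts_example).
-export([options/0, sanitize_embed/1]).

%% Hypothetical usage sketch. Assumes z_html is on the code path.

%% Options shaped after the sanitize_options() type:
%% [{elt_extra, [binary()]} | {attr_extra, [binary()]}].
-spec options() -> [{elt_extra | attr_extra, [binary()]}].
options() ->
    [
        {elt_extra, [<<"iframe">>]},            % assumed: extra elements to keep
        {attr_extra, [<<"allowfullscreen">>]}   % assumed: extra attributes to keep
    ].

%% Sanitize user-supplied HTML with the extra names above.
sanitize_embed(Html) ->
    z_html:sanitize(Html, options()).
```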

sanitize/4

sanitize(ParseTree::z_html_parse:html_element(), ExtraElts::binary() | list(), ExtraAttrs::binary() | list(), Options::any()) -> z_html_parse:html_element()

Sanitize a mochiweb parse tree. Remove harmful elements and attributes.

sanitize_uri/1

sanitize_uri(Uri::maybe_iodata()) -> maybe_binary()

Ensure that a URI is (quite) harmless by removing any script reference.

scrape_link_elements/1

scrape_link_elements(Html::iodata()) -> [[z_html_parse:html_attr()]]

Given an HTML list, scrape all <link> elements and return their attributes. Attribute names are lowercased.

strip/1

strip(Html::maybe_text()) -> maybe_text()

Strip all HTML elements from the text. Simple parsing is applied to find the elements. Does not escape the end result.
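A rough idea of what "simple parsing" can mean is a single pattern that removes anything between < and >. This is a sketch with a hypothetical module name, not the z_html implementation, which may treat comments, scripts or unbalanced brackets differently.

```erlang
-module(strip_sketch).
-export([strip/1]).

%% Hypothetical sketch, not the z_html source.
%% Removes everything that looks like a tag. Entities such as &amp;
%% are left as they are and the result is not escaped.
-spec strip(undefined | iodata()) -> undefined | binary().
strip(undefined) ->
    undefined;
strip(Html) ->
    Bin = iolist_to_binary(Html),
    re:replace(Bin, <<"<[^>]*>">>, <<>>, [global, {return, binary}]).
```

With this sketch, strip_sketch:strip(<<"<p>Hello <b>world</b></p>">>) yields <<"Hello world">>. Because the result is not escaped, it should be escaped again before being placed back into HTML.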

truncate/2

truncate(Html::maybe_text(), Length::integer()) -> maybe_text()

Truncate a previously sanitized HTML string.

truncate/3

truncate(Html::maybe_text(), Length::integer(), Append::iodata()) -> maybe_text()

unescape/1

unescape(L::maybe_text()) -> maybe_text()

Unescape - reverses the effect of escape.


Generated by EDoc