strip_js v0.5.0 StripJs

StripJs is an Elixir module for stripping executable JavaScript from blocks of HTML. It removes <script> tags and deactivates javascript:... links and event handlers like onclick, as follows:

  • <script>...</script> and <script src="..."></script> tags are removed entirely.

  • <a href="javascript:..."> is converted to <a href="#" data-href-javascript="...">.

  • Event handler attributes such as onclick="..." are renamed with a data- prefix, e.g. data-onclick="...".
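The first two rules above can be tried out with a short script. This is a minimal sketch that assumes strip_js is already available as a dependency (for example inside `iex -S mix`); the sample strings are made up for illustration, and the exact serialization of the output is not shown because it depends on the underlying HTML parser.

```elixir
# Assumption: :strip_js is installed and compiled in the current project.
script_html = "<p>Hello</p><script>alert('pwnt')</script>"
link_html = "<a href=\"javascript:alert('pwnt')\">Click</a>"

# Per the rules above, the <script> element should be dropped entirely...
script_out = StripJs.strip_js(script_html)

# ...and the javascript: href should be moved to data-href-javascript.
link_out = StripJs.strip_js(link_html)

IO.puts(script_out)
IO.puts(link_out)
```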

Installation

Add strip_js to your application’s dependencies in mix.exs:

def deps do
  [{:strip_js, "~> 0.5.0"}]
end

Usage

strip_js/1 returns a copy of its input, with all JS removed.

iex> html = "<button onclick=\"alert('pwnt')\">Hi!</button>"
iex> StripJs.strip_js(html)
"<button data-onclick=\"alert('pwnt')\">Hi!</button>"

strip_js_with_status/1 does the same as strip_js/1, but returns a tuple whose second element is a boolean indicating whether any JS was removed from the input.

iex> html = "<button onclick=\"alert('pwnt')\">Hi!</button>"
iex> StripJs.strip_js_with_status(html)
{"<button data-onclick=\"alert('pwnt')\">Hi!</button>", true}

StripJs relies on the Floki HTML parser library. StripJs provides a strip_js_from_tree/1 function to strip JS from Floki HTML parse trees.
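Working on a parse tree might look like the sketch below. It assumes a Floki release contemporary with this version of strip_js, where `Floki.parse/1` returns the tree directly; newer Floki releases replace it with `Floki.parse_document/1` and `Floki.parse_fragment/1`, which return `{:ok, tree}`, so adjust accordingly. The sample HTML is made up for illustration.

```elixir
html = "<p>Hi</p><script>alert('pwnt')</script>"

# Assumption: an older Floki where parse/1 returns the tree itself.
# With newer Floki, use: {:ok, tree} = Floki.parse_fragment(html)
tree = Floki.parse(html)

# Strip JS from the tree without a round trip through a string.
clean_tree = StripJs.strip_js_from_tree(tree)

# Render the cleaned tree back to HTML when a string is needed.
clean_html = Floki.raw_html(clean_tree)
IO.puts(clean_html)
```

This is mainly useful when the HTML is already being processed with Floki, since it avoids parsing the same document twice.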

Similar packages

phoenix_html_sanitizer, based on html_sanitize_ex, provides similar functionality with its :full_html mode. However, it uses the Phoenix.HTML.Safe protocol (returning tuples like {:safe, string}), and it retains the contents of script tags, effectively pasting deactivated JS into the DOM. StripJs instead removes script tags together with their contents.

Summary

Functions

strip_js(html)
Returns a copy of the given HTML string with all JS removed

strip_js_from_tree(tree)
Returns a copy of the given Floki HTML tree with all JS removed

strip_js_with_status(html)
Returns a tuple containing a copy of the given HTML string with all JS removed, as well as a boolean that is true when there was JS present in the original HTML and false otherwise

Functions

strip_js(html)
strip_js(String.t) :: String.t

Returns a copy of the given HTML string with all JS removed.

The output may not match the input byte-for-byte, even if the input HTML contained no JS.

strip_js_from_tree(tree)
strip_js_from_tree(Floki.html_tree) :: Floki.html_tree

Returns a copy of the given Floki HTML tree with all JS removed.

strip_js_with_status(html)
strip_js_with_status(String.t) :: {String.t, boolean}

Returns a tuple containing a copy of the given HTML string with all JS removed, as well as a boolean that is true when there was JS present in the original HTML and false otherwise.

The output may not match the input byte-for-byte, even if the input HTML contained no JS.
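Because input and output can differ even when nothing was stripped, comparing the two strings is not a reliable way to detect JS; the returned boolean is. A minimal sketch follows, in which the `JsCheck` module and `contains_js?/1` are hypothetical names that are not part of StripJs.

```elixir
defmodule JsCheck do
  # Hypothetical helper, not part of StripJs: reports whether the
  # given HTML string contained any JS that StripJs would strip.
  def contains_js?(html) when is_binary(html) do
    # Rely on the status flag rather than on `html != clean`.
    {_clean, had_js} = StripJs.strip_js_with_status(html)
    had_js
  end
end

IO.inspect(JsCheck.contains_js?("<button onclick=\"alert('pwnt')\">Hi!</button>"))
```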