strip_js v0.5.0 StripJs
StripJs is an Elixir module for stripping executable JavaScript from
blocks of HTML. It removes <script> tags, javascript:... links, and
event handlers like onclick as follows:
- <script>...</script> and <script src="..."></script> tags are removed entirely.
- <a href="javascript:..."> is converted to <a href="#" data-href-javascript="...">.
- Event handler attributes such as onclick="..." are converted to, e.g., data-onclick="...".
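The attribute rules above can be sketched on plain {name, value} pairs. This is only an illustration, not StripJs's actual implementation: the AttrSketch module and its function names are hypothetical, it matches only a lowercase javascript: prefix, and it treats every attribute whose name starts with "on" as an event handler.

```elixir
defmodule AttrSketch do
  # Hypothetical helper, not part of StripJs.
  # Rewrites one {name, value} attribute pair into a list of inert pairs.

  # A javascript: link becomes an inert "#" link; the code after the
  # prefix is kept in a data attribute.
  def neutralize({"href", "javascript:" <> js}) do
    [{"href", "#"}, {"data-href-javascript", js}]
  end

  # Simplifying assumption: any attribute starting with "on" is an
  # event handler, so onclick becomes data-onclick, and so on.
  def neutralize({"on" <> event, js}) do
    [{"data-on" <> event, js}]
  end

  # Everything else passes through unchanged.
  def neutralize(attr), do: [attr]

  # Applies the rules to a whole attribute list, preserving order.
  def neutralize_all(attrs), do: Enum.flat_map(attrs, &neutralize/1)
end

AttrSketch.neutralize_all([
  {"href", "javascript:alert(1)"},
  {"onclick", "go()"},
  {"class", "btn"}
])
# => [{"href", "#"}, {"data-href-javascript", "alert(1)"},
#     {"data-onclick", "go()"}, {"class", "btn"}]
```

Keeping the original code in a data- attribute leaves it visible for inspection, while the browser no longer treats it as executable.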
Installation
Add strip_js to your application’s dependencies in mix.exs:
def deps do
  [{:strip_js, "~> 0.5.0"}]
end
Usage
strip_js/1 returns a copy of its input, with all JS removed.
iex> html = "<button onclick=\"alert('pwnt')\">Hi!</button>"
iex> StripJs.strip_js(html)
"<button data-onclick=\"alert('pwnt')\">Hi!</button>"
strip_js_with_status/1 does the same as strip_js/1, but returns a tuple
that also contains a boolean indicating whether any JS was removed from
the input.
iex> html = "<button onclick=\"alert('pwnt')\">Hi!</button>"
iex> StripJs.strip_js_with_status(html)
{"<button data-onclick=\"alert('pwnt')\">Hi!</button>", true}
StripJs relies on the Floki HTML parser library. StripJs provides a
strip_js_from_tree/1 function to strip JS from Floki HTML parse trees.
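To give a feel for what a tree-level pass involves, here is a minimal sketch that drops script elements from a hand-written tree. It assumes the Floki tree shape (a list of nodes, each a text binary, a {:comment, text} tuple, or a {tag, attributes, children} tuple). The TreeSketch module is hypothetical and is not how strip_js_from_tree/1 is implemented; it handles script removal only and does not rewrite attributes.

```elixir
defmodule TreeSketch do
  # Hypothetical illustration, not part of StripJs or Floki.
  # Removes every script element from a list of Floki-shaped nodes,
  # recursing into the children of the elements that remain.
  def drop_scripts(nodes) when is_list(nodes) do
    nodes
    |> Enum.reject(&script?/1)
    |> Enum.map(&walk/1)
  end

  defp script?({"script", _attrs, _children}), do: true
  defp script?(_other), do: false

  # Element nodes: keep the tag and attributes, clean the children.
  defp walk({tag, attrs, children}), do: {tag, attrs, drop_scripts(children)}
  # Text and comment nodes pass through unchanged.
  defp walk(other), do: other
end

tree = [
  {"div", [],
   [
     "Hello",
     {"script", [], ["alert('pwnt')"]},
     {"p", [], ["world"]}
   ]}
]

TreeSketch.drop_scripts(tree)
# => [{"div", [], ["Hello", {"p", [], ["world"]}]}]
```

Working on the parsed tree rather than the raw string means nested and oddly formatted script tags are handled by the parser rather than by pattern matching on text.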
Similar packages
phoenix_html_sanitizer, based on html_sanitize_ex, provides similar
functionality with its :full_html mode.
However, in addition to using the Phoenix.HTML.Safe protocol (returning
tuples like {:safe, string}), phoenix_html_sanitizer keeps the contents
of script tags, effectively pasting deactivated JS into the DOM.
StripJs improves on this behavior by removing script tags and their
contents entirely.
Summary
Functions
strip_js/1
Returns a copy of the given HTML string with all JS removed
strip_js_from_tree/1
Returns a copy of the given Floki HTML tree with all JS removed
strip_js_with_status/1
Returns a tuple containing a copy of the given HTML string with all JS
removed, as well as a boolean that is true when there was JS present in
the original HTML and false otherwise
Functions
strip_js/1
Returns a copy of the given HTML string with all JS removed.
Even if the input HTML contains no JS, the output may not match the input byte-for-byte.
strip_js_from_tree(Floki.html_tree) :: Floki.html_tree
Returns a copy of the given Floki HTML tree with all JS removed.
strip_js_with_status/1
Returns a tuple containing a copy of the given HTML string with all JS
removed, as well as a boolean that is true when there was JS present in
the original HTML and false otherwise.
Even if the input HTML contains no JS, the output may not match the input byte-for-byte.