Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

Unreleased

0.36.2 - 2024-04-26

Added

  • Implement the Inspect protocol for the Floki.HTMLTree struct. This struct is currently private. Thank you @vittoriabitton.

Fixed

0.36.1 - 2024-03-18

Fixed

  • Fix typespec of get_by_id/2.

0.36.0 - 2024-03-01

Added

  • Add Floki.get_by_id/2 that returns one element by ID or nil. Thanks @SteffenDE.
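
A minimal sketch of looking up an element by ID, assuming the function takes an already parsed tree plus the ID string. The HTML snippet and the Mix.install/1 setup (which fetches Floki from Hex) are illustrative assumptions, not part of this changelog.

```elixir
# Assumption: run as a standalone script; Mix.install/1 fetches Floki from Hex.
Mix.install([{:floki, "~> 0.36"}])

html = ~s(<div><p id="intro">Hello</p><p id="outro">Bye</p></div>)
{:ok, document} = Floki.parse_document(html)

# One element is returned when the ID exists, nil otherwise.
intro = Floki.get_by_id(document, "intro")
missing = Floki.get_by_id(document, "does-not-exist")

IO.inspect(intro, label: "intro")
IO.inspect(missing, label: "missing")
```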

Changed

  • Improve options validation with Keyword.validate!/2. This is not a change in APIs, but the error messages and opts validation should be standardized now. Thanks @vittoriabitton.

Removed

  • Drop support for Elixir v1.12.

0.35.4 - 2024-02-19

Besides the fix described below, this release also contains more performance improvements, thanks to @ypconstante.

Fixed

0.35.3 - 2024-01-25

This release has great performance improvements, thanks to the PRs from @ypconstante!

Most of the main functions, such as Floki.raw_html/2 and Floki.find/2, are faster and use less memory. For example, find/2 is roughly twice as fast and uses about half the memory.

Fixed

  • Add :leex to Mix compilers. Fixes the build when running with dev version of Elixir. Thanks @wojtekmach.

  • Fix Floki.raw_html/2 when a tree using attributes as maps is given. Thanks @SupaMic.

  • Add a guard to Floki.find/2 so people can have a better error message when an invalid input is given. Thanks @Hajto.

  • Fix parsers to consider IO data as inputs. This may change in the next version of Floki, as I plan to drop support for IO data. Thanks @ypconstante.

Removed

  • Remove outdated Gleam wrapper code. The external functions syntax in Gleam has changed. So now the wrapper is not needed anymore. Thanks @michallepicki.

0.35.2 - 2023-10-25

Fixed

  • Enable usage of IO data by removing a guard for binaries in the main parser module.

0.35.1 - 2023-10-16

Fixed

0.35.0 - 2023-10-13

Added

  • Add support for parsing attributes as maps.

    This makes parse_document/2 and parse_fragment/2 accept the option :attributes_as_maps, which changes the behaviour to return attributes as maps instead of lists of tuples. The only parser that does not support it yet is fast_html.
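
The difference between the two attribute shapes can be sketched as follows, assuming the default parser. The sample markup and the Mix.install/1 setup are illustrative assumptions.

```elixir
# Assumption: run as a standalone script; Mix.install/1 fetches Floki from Hex.
Mix.install([{:floki, "~> 0.36"}])

html = ~s(<a href="/home" class="nav">Home</a>)

# Default: attributes come back as a list of {name, value} tuples.
{:ok, [{"a", attrs_list, _children}]} = Floki.parse_fragment(html)

# With the option: attributes come back as a map.
{:ok, [{"a", attrs_map, _children}]} =
  Floki.parse_fragment(html, attributes_as_maps: true)

IO.inspect(attrs_list, label: "list of tuples")
IO.inspect(attrs_map, label: "map")
```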

Changed

  • Drop support for Elixir v1.11.

  • Change the log level of parsing logger calls from "info" to "debug". This will help to reduce the amount of noise in production apps.

0.34.3 - 2023-06-02

Added

  • Add boolean option :include_inputs to Floki.text/2 that changes the result of this function to include the values of inputs. So if there is any input with a "value" attribute, we now include that value if this option is set to true. Thanks @viniciusmuller.
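
A small sketch of the :include_inputs option, assuming a form with one text input. The markup and the Mix.install/1 setup are illustrative assumptions.

```elixir
# Assumption: run as a standalone script; Mix.install/1 fetches Floki from Hex.
Mix.install([{:floki, "~> 0.36"}])

html = ~s(<form><label>Name</label><input type="text" value="Ada"></form>)
{:ok, fragment} = Floki.parse_fragment(html)

# Without the option, only text nodes contribute to the result.
plain = Floki.text(fragment)

# With the option, the "value" attribute of inputs is included as well.
with_inputs = Floki.text(fragment, include_inputs: true)

IO.inspect(plain, label: "default")
IO.inspect(with_inputs, label: "include_inputs: true")
```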

Fixed

  • Fix find of elements by classes that contain colons. This is useful for when people are trying to find elements that contain Tailwind classes. Thanks @viniciusmuller.

  • Fix some typespecs that were using types from private modules. This is a fix to the documentation.

0.34.2 - 2023-02-24

Added

0.34.1 - 2023-02-11

Fixed

  • Fix pseudo-class ":not" selector parsing halting point. This is a fix for when a "pseudo-class" ":not" that contains an attribute selector is followed by another selector. This is an example: "a:not([class]), div".

  • Ignore decimal numeric char ref when number is negative.
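
The selector from the ":not" fix above can be exercised with a short sketch. The markup and the Mix.install/1 setup are illustrative assumptions.

```elixir
# Assumption: run as a standalone script; Mix.install/1 fetches Floki from Hex.
Mix.install([{:floki, "~> 0.36"}])

html = ~s(<div><a class="x" href="/1">one</a><a href="/2">two</a></div>)
{:ok, document} = Floki.parse_document(html)

# ":not([class])" followed by a second selector after the comma.
matches = Floki.find(document, "a:not([class]), div")
tags = matches |> Enum.map(fn {tag, _attrs, _children} -> tag end) |> Enum.sort()

# Only the link without a class attribute is matched by the first part.
hrefs = Floki.attribute(document, "a:not([class])", "href")

IO.inspect(tags, label: "matched tags")
IO.inspect(hrefs, label: "hrefs without class")
```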

0.34.0 - 2022-11-03

Added

  • User configurable "self-closing" tags. Now it's possible to define which tags are considered "self-closing". Thanks @inoas.

Fixed

  • Allow attribute values to not be escaped. This fixes Floki.raw_html/2 when used with the option encode: false. Thanks @juanazam.
  • Fix traverse_and_update/3 spec. Thanks @WLSF.

Changed

  • Drop support for Elixir 1.9 and 1.10.
  • Remove html_entities dependency. We now use an internal encoder/decoder for entities.
  • Change the main branch name to main.

0.33.1 - 2022-06-28

Fixed

  • Remove some warnings for unused code.

0.33.0 - 2022-06-28

Added

  • Add support for searching elements that contain text in a case-insensitive manner with fl-icontains - thanks @nuno84
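
A minimal sketch comparing fl-icontains with the case-sensitive fl-contains, assuming the pseudo-class form with a quoted argument. The markup and the Mix.install/1 setup are illustrative assumptions.

```elixir
# Assumption: run as a standalone script; Mix.install/1 fetches Floki from Hex.
Mix.install([{:floki, "~> 0.36"}])

html = ~s(<ul><li>Elixir rocks</li><li>Erlang too</li></ul>)
{:ok, document} = Floki.parse_document(html)

# Case-insensitive match: "ELIXIR" still finds "Elixir rocks".
insensitive = Floki.find(document, "li:fl-icontains('ELIXIR')")

# Case-sensitive match: the same argument finds nothing.
sensitive = Floki.find(document, "li:fl-contains('ELIXIR')")

IO.inspect(Floki.text(insensitive), label: "fl-icontains")
IO.inspect(sensitive, label: "fl-contains")
```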

Changed

  • Drop support for Elixir 1.8 and 1.9.
  • Fix and improve internal things - thanks @derek-zhou and @hissssst

0.32.1 - 2022-03-24

Fixed

  • Allow root nodes to be selected using pseudo-classes - thanks @rzane

0.32.0 - 2021-10-18

Added

  • Add an HTML tokenizer written in Elixir - this is still experimental and is not a stable API yet.
  • Add support for HTML IDs containing periods in the selectors - thanks @Hugo-Hache
  • Add support for case-insensitive CSS attribute selectors - thanks @fcapovilla
  • Add the :root pseudo-class selector - thanks @fcapovilla

0.31.0 - 2021-06-11

Changed

  • Treat style and title tags as plaintext in Mochiweb - thanks @SweetMNM

0.30.1 - 2021-03-29

Fixed

0.30.0 - 2021-02-06

Added

Changed

  • Remove support for Elixir 1.7 - thanks @carlosfrodrigues
  • Replace IO.warn by Logger.info for deprecation warnings - thanks @juulSme

Fixed

  • Fix typespecs for find, attr and attribute functions - thanks @mtarnovan
  • Documentation Improvements - thanks @kianmeng

0.29.0 - 2020-10-02

Added

  • Add Floki.find_and_update/3 that updates nodes inside a tree, like traverse_and_update but without allowing changes in the children nodes. Therefore the tree cannot grow in size, but can have nodes removed.
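
A minimal sketch of find_and_update/3, assuming the callback receives a {tag, attributes} tuple and may return :delete to remove a node. The markup and the Mix.install/1 setup are illustrative assumptions.

```elixir
# Assumption: run as a standalone script; Mix.install/1 fetches Floki from Hex.
Mix.install([{:floki, "~> 0.36"}])

html = ~s(<div><a href="http://example.com">link</a><span class="ad">x</span></div>)
{:ok, document} = Floki.parse_document(html)

# Rewrite the href of matching links; children are left untouched.
updated =
  Floki.find_and_update(document, "a", fn
    {"a", [{"href", href}]} ->
      {"a", [{"href", String.replace(href, "http://", "https://")}]}

    other ->
      other
  end)

# Returning :delete removes the matching node from the tree.
cleaned = Floki.find_and_update(updated, "span.ad", fn _node -> :delete end)

IO.puts(Floki.raw_html(cleaned))
```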

Changed

Fixed

  • Fix a bug when parsing HTML with XML inside using Mochiweb's parser

Improvements

  • Add more typespecs

0.28.0 - 2020-08-26

Added

  • Add support for :checked pseudo-class selector - thanks @wojtekmach

Changed

  • Drop support for Elixir 1.6
  • Update version of fast_html to 2.0 in docs and CI - thanks @rinpatch

Fixed

  • Fix docs by mentioning HTML nodes supported for traverse_and_update - thanks @hubertlepicki

0.27.0 - 2020-07-07

Added

Fixed

Improvements

0.26.0 - 2020-02-17

Added

  • Add support for the pseudo-class selectors :nth-last-child and :nth-last-of-type

Fixed

Changed

  • Update optional dependency fast_html to v1.0.3

0.25.0 - 2020-01-26

Added

Changed

  • Update the html_entities dependency from v0.5.0 to v0.5.1

0.24.0 - 2020-01-01

Added

  • Add support for fast_html, which is a "C Node" wrapping Lexborisov's myhtml - thanks @rinpatch
  • Add setup to run our test suite against all parsers on CI - thanks @rinpatch
  • Add Floki.parse_document/1 and Floki.parse_fragment/1 in order to correctly parse documents and fragments of documents - it also prevents the confusion and inconsistency of parse/1.
  • Configure dialyxir in order to run Dialyzer easily.
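
The two parsing entry points can be sketched as below: parse first, then query the parsed tree. The markup and the Mix.install/1 setup are illustrative assumptions.

```elixir
# Assumption: run as a standalone script; Mix.install/1 fetches Floki from Hex.
Mix.install([{:floki, "~> 0.36"}])

# A complete document, with an <html> root.
{:ok, document} = Floki.parse_document("<html><body><p>Hi</p></body></html>")

# A fragment: sibling nodes without a single root element.
{:ok, fragment} = Floki.parse_fragment("<p>Hi</p><p>There</p>")

# Query functions then work on the parsed trees rather than on raw strings.
document_text = document |> Floki.find("p") |> Floki.text()
fragment_count = fragment |> Floki.find("p") |> length()

IO.inspect(document_text, label: "document text")
IO.inspect(fragment_count, label: "paragraphs in fragment")
```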

Changed

  • Deprecate Floki.parse/1 and all the functions that use it underneath. This means that all the functions that accepted HTML as binary are deprecated as well. This includes find/2, attr/4, filter_out/2, text/2 and attribute/2. The recommendation is to use those functions with an already parsed document or fragment.
  • Remove support for Elixir 1.5.

0.23.1 - 2019-12-01

Fixed

  • Fix the Mochiweb parser when there is an invalid charref.

0.23.0 - 2019-09-11

Changed

  • Remove Mochiweb as a hex dependency. It brings the code from the original project to Floki's codebase - thanks @josevalim

0.22.0 - 2019-08-21

Added

Changed

  • Remove support for Elixir 1.4.

0.21.0 - 2019-04-17

Added

Fixed

Changed

  • Drop support for Elixir 1.3 and below - thanks @herbstrith

0.20.4 - 2018-09-24

Fixed

  • Fix Floki.raw_html to accept lists as attribute values - thanks @katehedgpeth

0.20.3 - 2018-06-22

Fixed

0.20.2 - 2018-05-09

Fixed

0.20.1 - 2018-04-05

Fixed

0.20.0 - 2018-02-06

Added

  • Configurable raw_html/2 to allow optional encode of HTML entities - thanks @davydog187
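
A minimal sketch of the :encode option of raw_html/2, using a hand-built tree. The tree contents and the Mix.install/1 setup are illustrative assumptions.

```elixir
# Assumption: run as a standalone script; Mix.install/1 fetches Floki from Hex.
Mix.install([{:floki, "~> 0.36"}])

tree = {"div", [], ["10 > 5"]}

# Default: HTML entities are encoded on output.
encoded = Floki.raw_html(tree)

# With encode: false the text is emitted as-is.
raw = Floki.raw_html(tree, encode: false)

IO.puts(encoded)
IO.puts(raw)
```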

Fixed

  • Fix serialization of the tree after updating attribute - thanks @francois2metz

0.19.3 - 2018-01-25

Fixed

  • Skip HTML entities encode for Floki.raw_html/1 for script or style tags
  • Add :html_entities app to the list of OTP applications. It fixes production releases.

0.19.2 - 2017-12-22

Fixed

0.19.1 - 2017-12-04

Fixed

0.19.0 - 2017-11-11

Added

  • Added support for nth-of-type, first-of-type, last-of-type and last-child pseudo-classes - thanks @saleem1337.
  • Added support for nth-child pseudo-class functional notation - thanks @nirev.
  • Added functional notation support for nth-of-type pseudo-class.
  • Added a Contributing guide.

Fixed

  • Format all files according to the Elixir 1.6 formatter - thanks @fcevado.
  • Fix Floki.raw_html to support raw text - thanks @craig-day.

0.18.1 - 2017-10-13

Added

Fixed

  • Fix XML tag when building HTML tree.
  • Return empty list when Floki.filter_out/2 result is empty.

0.18.0 - 2017-08-05

Added

  • Added Floki.attr/4 that receives a function enabling manipulation of attribute values - thanks @erikdsi.
  • Implement the String.Chars protocol for Floki.Selector.
  • Implement the Enumerable protocol for Floki.HTMLTree.
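
As a sketch of the Floki.attr/4 entry above: the function passed as the last argument receives the current attribute value and returns the new one. The tree and the Mix.install/1 setup are illustrative assumptions.

```elixir
# Assumption: run as a standalone script; Mix.install/1 fetches Floki from Hex.
Mix.install([{:floki, "~> 0.36"}])

tree = [{"a", [{"href", "http://example.com"}], ["Example"]}]

# Rewrite the "href" of every element matching the selector.
updated =
  Floki.attr(tree, "a", "href", fn href ->
    String.replace(href, "http://", "https://")
  end)

IO.inspect(updated)
```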

Changed

  • Changed Floki.transform/2 to Floki.map/2 and Floki.Finder.apply_transform/2 to Floki.Finder.map/2 - thanks @aphillipo.

Fixed

Removed

  • Removed support for Elixir 1.2.

0.17.2 - 2017-05-25

Fixed

0.17.1 - 2017-05-22

Fixed

  • Fix search when body has unencoded angles (< and >) - thanks @sergey-kintsel
  • Fix crash caused by XML declaration inside body - thanks @erikdsi
  • Fix issue when finding fails if HTML begins with XML tag - thanks @sergey-kintsel

0.17.0 - 2017-04-12

Added

  • Add support for multiple pseudo-selectors, like :not() and :nth-child() - thanks @jjcarstens
  • Add support for multiple selectors inside the :not() pseudo-class selector - thanks @jjcarstens

0.16.0 - 2017-04-05

Added

  • Add support for selectors that only include a pseudo-class selector - thanks @buhman
  • Add support for a new selector: fl-contains, which returns elements that contain a given text - thanks @buhman

Fixed

  • Fix :not() pseudo-class selector to accept simple pseudo-class selectors as well - thanks @mischov

0.15.0 - 2017-03-14

Added

  • Added support for the :not() pseudo-class selector.

Fixed

  • Fixed pseudo-class selectors that are used in conjunction with combinators - thanks @Eiji7
  • Fixed order of elements after search using descendant combinator - thanks @Eiji7

0.14.0 - 2017-02-07

Added

  • Added support for configuring html5ever as the HTML parser. Issue #83 - thanks @hansihe and @aphillipo!

0.13.2 - 2017-02-07

Fixed

  • Fixed bug that was causing Floki.text/1 and Floki.filter_out/2 to ignore "trees" with only text nodes. Issue #91 - thanks @boydm.

0.13.1 - 2017-01-22

Fixed

  • Fix ordering of duplicated descendant matches - thanks @mmmries
  • Fix ordering of Floki.text/1 when there are only root nodes - thanks @mmmries

0.13.0 - 2017-01-22

Added

  • Floki.filter_out/2 is now able to understand complex selectors to filter out from the tree.

0.12.1 - 2017-01-20

Fixed

  • Fix search for elements using descendant combinator - issue #84 - thanks @mmmries

0.12.0 - 2016-12-28

Added

  • Add basic support for nth-child pseudo-class selector. Closes issue #64.

Changed

  • Remove support for Elixir 1.1 and below.
  • Remove public documentation for internal code.

0.11.0 - 2016-10-12

Added

  • First attempt to transform nodes with Floki.transform/2. It is not able to update the tree yet, but works well with results from Floki.find/2 - thanks @bobjflong

Changed

  • Using Logger to notify unknown tokens in selector parser - thanks @teamon and @geonnave
  • Replace mochiweb_html with mochiweb package. This is needed to fix conflict with other packages that are using mochiweb. - thanks @aphillipo

0.10.1 - 2016-08-28

Fixed

  • Fix sibling search after immediate children - thanks @gmile.

0.10.0 - 2016-08-05

Changed

  • Change the search for namespaced elements using the correct CSS3 syntax.

Fixed

  • Fix the search for child elements when they are more than two elements deep - thanks @gmile

0.9.0 - 2016-06-16

Added

  • A separator between text when getting text from nodes - thanks @rochdi.
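
A minimal sketch of the separator, assuming it is passed as the :sep option of Floki.text/2. The markup and the Mix.install/1 setup are illustrative assumptions.

```elixir
# Assumption: run as a standalone script; Mix.install/1 fetches Floki from Hex.
Mix.install([{:floki, "~> 0.36"}])

{:ok, fragment} = Floki.parse_fragment("<p>one</p><p>two</p>")

# Without a separator, text nodes are concatenated directly.
joined = Floki.text(fragment)

# With :sep, the given string is placed between text nodes.
separated = Floki.text(fragment, sep: " ")

IO.inspect(joined, label: "default")
IO.inspect(separated, label: "sep: \" \"")
```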

0.8.1 - 2016-05-20

Added

Changed

  • Update Mochiweb HTML parser dependency to version 2.15.0.

0.8.0 - 2016-03-06

Added

  • Add possibility to search tags with namespaces.
  • Accept Floki.Selector as parameter of Floki.find/2 instead of only strings - thanks @hansihe.

Changed

  • Using a smaller package with only the Mochiweb HTML parser.

0.7.2 - 2016-02-23

Fixed

  • Replace <br> nodes by newline (\n) in DeepText - thanks @maxneuvians.
  • Allow FilterOut to filter special nodes, like comment.

0.7.1 - 2015-11-14

Fixed

  • Ignore PHP scripts when finding nodes.

0.7.0 - 2015-11-03

Added

  • Add support for excluding script nodes in Floki.text. By default, it will exclude those nodes, but including them can be enabled with the flag js: true - thanks @vikeri!
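
A small sketch of the js flag, using a hand-built tree that contains a script node. The tree contents and the Mix.install/1 setup are illustrative assumptions.

```elixir
# Assumption: run as a standalone script; Mix.install/1 fetches Floki from Hex.
Mix.install([{:floki, "~> 0.36"}])

tree = {"div", [], [{"script", [], ["alert(1)"]}, "visible"]}

# Default: the content of script nodes is left out.
without_js = Floki.text(tree)

# With js: true the script content is included.
with_js = Floki.text(tree, js: true)

IO.inspect(without_js, label: "default")
IO.inspect(with_js, label: "js: true")
```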

Fixed

  • Fix find for sibling nodes when the preceding selector matches an element at the end of the sibling list - fixes issue #39

0.6.1 - 2015-10-11

Fixed

0.6.0 - 2015-10-07

Added

0.5.0 - 2015-09-27

Added

0.4.1 - 2015-09-18

Fixed

  • Ignore files that are not lexer files (".xrl") under the src/ directory in the Hex package. This fixes a crash when compiling using OTP 17.5 on Mac OS X. Huge thanks to @henrik and @licyeus, who pointed out the issue!

0.4.0 - 2015-09-17

Added

  • A robust representation of selectors in order to enable queries using a mix of selector types, such as classes with attributes, attributes with types, classes with classes and so on. Here is a list with examples of what is possible now:
    • Floki.find(html, "a.foo")
    • Floki.find(html, "a.foo[data-action=post]")
    • Floki.find(html, ".foo.bar")
    • Floki.find(html, "a.foo[href$='.org']")

    Thanks to @licyeus for pointing out the issue!
  • Include mochiweb in the applications list at mix.exs - thanks @EricDykstra
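
The mixed selectors listed for this release can be tried with a sketch like the one below. It parses the HTML first, which matches the usage recommended by later releases. The markup and the Mix.install/1 setup are illustrative assumptions.

```elixir
# Assumption: run as a standalone script; Mix.install/1 fetches Floki from Hex.
Mix.install([{:floki, "~> 0.36"}])

html = """
<div>
  <a class="foo bar" href="http://elixir-lang.org" data-action="post">A</a>
  <a class="foo" href="http://example.com">B</a>
</div>
"""

{:ok, document} = Floki.parse_document(html)

# Count how many elements each mixed selector matches.
counts =
  for selector <- ["a.foo", "a.foo[data-action=post]", ".foo.bar", "a.foo[href$='.org']"] do
    {selector, document |> Floki.find(selector) |> length()}
  end

IO.inspect(counts)
```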

Changed

  • Floki.find/2 will now return a list instead of a tuple when searching only by IDs. From now on, Floki should always return the results inside a list, even if it's an ID match.

Removed

  • Floki.find/2 does not accept tuples as selectors anymore. With the robust selector representation, it is no longer necessary to query directly using tuples or other data structures instead of strings.

0.3.3 - 2015-08-23

Fixed

0.3.2 - 2015-06-27

Fixed

  • Fix Floki.DeepText when there is a comment inside nodes.

0.3.1 - 2015-06-21

Fixed

0.3.0 - 2015-06-07

Added

  • Add attribute equals selector. This feature enables the user to search using HTML attributes other than "class" or "id". E.g: Floki.find(html, "[data-model=user]") - @nelsonr

0.2.1 - 2015-06-04

Fixed

  • Fix parse/1 when parsing a part of HTML without a root node - @antonmi

0.2.0 - 2015-05-03

Added

  • Support HTML string when searching for attributes with Floki.attribute/2.
  • Option for Floki.text/2 to disable deep search and use flat search instead.

Changed

  • Change Floki.text/1 to perform a deep search of text nodes.
  • Consider doctests in the test suite.

0.1.1 - 2015-03-25

Added

Changed

  • Use MochiWeb as a hex dependency instead of embedded code. Closes issue #5

0.1.0 - 2015-02-15

Added

  • Descendant selectors, like ".class tag" to Floki.find/2.
  • Multiple selection, like ".class1, .class2" to Floki.find/2.

0.0.5 - 2014-12-21

Added

  • Floki.text/1, which returns all text in the same level of the parent element inside HTML.

Changed

  • Elixir version requirement from "~> 1.0.0" to ">= 1.0.0".