Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
Unreleased
0.36.3 - 2024-10-21
This release contains some performance improvements, thanks to @ypconstante.
Fixed
- Stop `Floki.get_by_id/2` traversal on first match. Thanks @ypconstante.
- Remove extra whitespace from nodes without attributes on `Floki.raw_html/1`. Thank you @ypconstante.
- Fix `Floki.raw_html/1` typespecs. Thanks @davydog187.
0.36.2 - 2024-04-26
Added
- Implement the `Inspect` protocol for the `Floki.HTMLTree` struct. This struct is currently private. Thank you @vittoriabitton.
Fixed
- Fix regression to respect config option `:encode` in `Floki.raw_html/2`. Thanks @Sgoettschkes.
- Make `Floki.raw_html/2` treat the contents of the `<title>` tag as plain text. The idea is to align with `parse_document/2`. Thank you @aymanosman.
0.36.1 - 2024-03-18
Fixed
- Fix typespec of `get_by_id/2`.
0.36.0 - 2024-03-01
Added
- Add `Floki.get_by_id/2` that returns one element by ID or `nil`. Thanks @SteffenDE.
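As a rough illustration of the lookup described above, the sketch below parses a small fragment and fetches one element by ID. The sample HTML and the `Mix.install/1` version constraint are assumptions made for this example only.

```elixir
Mix.install([{:floki, "~> 0.36"}])

# Parse a small fragment with the default parser.
{:ok, html} = Floki.parse_fragment(~s(<p><span class="hint" id="greeting">hello</span></p>))

# Returns the matching element as a {tag, attributes, children} tuple.
found = Floki.get_by_id(html, "greeting")
IO.inspect(found)

# Returns nil when no element has the given ID.
missing = Floki.get_by_id(html, "does-not-exist")
IO.inspect(missing)
```

Because the result is a single element or `nil`, it can be matched directly instead of unwrapping a list as with `Floki.find/2`.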
Changed
- Improve options validation with `Keyword.validate!/2`. This is not a change in APIs, but the error messages and opts validation should be standardized now. Thanks @vittoriabitton.
Removed
- Drop support for Elixir v1.12.
0.35.4 - 2024-02-19
Besides the fix described below, this release also contains more performance improvements, thanks to @ypconstante.
Fixed
- Fix order of results for `Floki.find/2`. This was a regression from the previous version - thanks @ypconstante.
0.35.3 - 2024-01-25
This release has great performance improvements, thanks to the PRs from @ypconstante!
Most of the main functions, such as `Floki.raw_html/2` and `Floki.find/2`, are faster and use less memory. For example, `find/2` is roughly twice as fast and uses about half the memory.
Fixed
- Add `:leex` to Mix compilers. Fixes the build when running with dev version of Elixir. Thanks @wojtekmach.
- Fix `Floki.raw_html/2` when a tree using attributes as maps is given. Thanks @SupaMic.
- Add a guard to `Floki.find/2` so people can have a better error message when an invalid input is given. Thanks @Hajto.
- Fix parsers to consider IO data as inputs. This may change in the next version of Floki, as I plan to drop support for IO data. Thanks @ypconstante.
Removed
- Remove outdated Gleam wrapper code. The external functions syntax in Gleam has changed. So now the wrapper is not needed anymore. Thanks @michallepicki.
0.35.2 - 2023-10-25
Fixed
- Enable usage of IO data by removing a guard for binaries in the main parser module.
0.35.1 - 2023-10-16
Fixed
- Fix a small regression in the Mochiweb parser that broke parsing when malformed HTML was given. For more details, see the original issue: https://github.com/philss/floki/issues/492
0.35.0 - 2023-10-13
Added
- Add support for parsing attributes as maps. This makes `parse_document/2` and `parse_fragment/2` accept the option `:attributes_as_maps` to change the behaviour and return attributes as maps instead of lists of tuples. The only parser that does not support it yet is `fast_html`.
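The difference between the two attribute representations can be sketched as follows. The sample markup and the `Mix.install/1` version constraint are assumptions for illustration, and the default Mochiweb parser is used.

```elixir
Mix.install([{:floki, "~> 0.36"}])

html = ~s(<p class="intro" id="first">hi</p>)

# Default: attributes come back as a list of {name, value} tuples.
{:ok, as_lists} = Floki.parse_fragment(html)

# With the option: attributes come back as a map.
{:ok, as_maps} = Floki.parse_fragment(html, attributes_as_maps: true)

IO.inspect(as_lists)
IO.inspect(as_maps)
```

Maps make single-attribute lookups convenient, while the list form preserves the order in which attributes appeared in the source.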
Changed
- Drop support for Elixir v1.11.
- Change the log level of parsing logger calls from "info" to "debug". This will help to reduce the amount of noise in production apps.
0.34.3 - 2023-06-02
Added
- Add boolean option `:include_inputs` to `Floki.text/2` that changes the result of this function to include the values of inputs. So if there is any input with a "value" attribute, we now include that value if this option is set to `true`. Thanks @viniciusmuller.
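A minimal sketch of the option, assuming a hand-built tree with a single input element (the tree and the `Mix.install/1` version constraint are illustrative only):

```elixir
Mix.install([{:floki, "~> 0.36"}])

# A tree containing only an input with a "value" attribute.
tree = [{"input", [{"type", "date"}, {"value", "2017-06-01"}], []}]

# Without the option, inputs contribute no text.
without_inputs = Floki.text(tree)

# With the option, the input's value is included in the result.
with_inputs = Floki.text(tree, include_inputs: true)

IO.inspect({without_inputs, with_inputs})
```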
Fixed
- Fix find of elements by classes that contain colons. This is useful for when people are trying to find elements that contain Tailwind classes. Thanks @viniciusmuller.
- Fix some typespecs that were using types from private modules. This is a fix to the documentation.
0.34.2 - 2023-02-24
Added
- Add option to pass down arguments to the parser in `Floki.parse_document/2` and `Floki.parse_fragment/2`. Thanks @Kuret.
- Add support for returning more elements from the `Floki.traverse_and_update/2` function callback. This enables the creation of more elements in the tree, but should be used with care, since the tree can grow a lot if the change is not controlled. Thanks @martosaur.
0.34.1 - 2023-02-11
Fixed
- Fix pseudo-class ":not" selector parsing halting point. This is a fix for when a "pseudo-class" ":not" that contains an attribute selector is followed by another selector. This is an example: "a:not([class]), div".
- Ignore decimal numeric char ref when number is negative.
0.34.0 - 2022-11-03
Added
- User configurable "self-closing" tags. Now it's possible to define which tags are considered "self-closing". Thanks @inoas.
Fixed
- Allow attribute values to not be escaped. This fixes `Floki.raw_html/2` when used with the option `encode: false`. Thanks @juanazam.
- Fix `traverse_and_update/3` spec. Thanks @WLSF.
Changed
- Drop support for Elixir 1.9 and 1.10.
- Remove `html_entities` dependency. We now use an internal encoder/decoder for entities.
- Change the main branch name to `main`.
0.33.1 - 2022-06-28
Fixed
- Remove some warnings for unused code.
0.33.0 - 2022-06-28
Added
- Add support for searching elements that contain text in a case-insensitive manner with `fl-icontains` - thanks @nuno84
Changed
- Drop support for Elixir 1.8 and 1.9.
- Fix and improve internal things - thanks @derek-zhou and @hissssst
0.32.1 - 2022-03-24
Fixed
- Allow root nodes to be selected using pseudo-classes - thanks @rzane
0.32.0 - 2021-10-18
Added
- Add an HTML tokenizer written in Elixir - this is still experimental and it's not a stable API yet.
- Add support for HTML IDs containing periods in the selectors - thanks @Hugo-Hache
- Add support for case-insensitive CSS attribute selectors - thanks @fcapovilla
- Add the `:root` pseudo-class selector - thanks @fcapovilla
0.31.0 - 2021-06-11
Changed
- Treat `style` and `title` tags as plaintext in Mochiweb - thanks @SweetMNM
0.30.1 - 2021-03-29
Fixed
- Fix typespecs of `Floki.traverse_and_update/2` to make clear that it does not accept text nodes directly.
0.30.0 - 2021-02-06
Added
- Add ":disabled" pseudo selector - thanks @vnegrisolo
- Add Gleam adapter - thanks @CrowdHailer
- Add pretty option to `Floki.raw_html/2` - thanks @evaldobratti
- Add `html_parser` option to `parse_` functions. This enables a more dynamic and functional configuration of the HTML parser in use.
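As a hedged sketch of the per-call parser configuration mentioned above, the example below passes the built-in Mochiweb parser module explicitly. The sample document and the `Mix.install/1` version constraint are assumptions for illustration.

```elixir
Mix.install([{:floki, "~> 0.36"}])

html = "<html><body><p>hi</p></body></html>"

# Choose the parser for this call only, instead of relying on application config.
{:ok, doc} = Floki.parse_document(html, html_parser: Floki.HTMLParser.Mochiweb)

paragraphs = Floki.find(doc, "p")
IO.inspect(paragraphs)
```

Passing the parser per call keeps the choice local, which can help when different inputs in the same application need different parsers.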
Changed
- Remove support for Elixir 1.7 - thanks @carlosfrodrigues
- Replace `IO.warn` by `Logger.info` for deprecation warnings - thanks @juulSme
Fixed
- Fix typespecs for `find`, `attr` and `attribute` functions - thanks @mtarnovan
- Documentation Improvements - thanks @kianmeng
0.29.0 - 2020-10-02
Added
- Add `Floki.find_and_update/3` that updates nodes inside a tree, like traverse and update but without allowing changes in the children nodes. Therefore the tree cannot grow in size, but can have nodes removed.
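The behaviour described above can be sketched like this: the callback receives only the tag and attributes of each matched node and returns an updated pair, or `:delete` to remove the node. The sample tree and the `Mix.install/1` version constraint are assumptions for this example.

```elixir
Mix.install([{:floki, "~> 0.36"}])

tree = [
  {"div", [],
   [
     {"a", [{"href", "http://elixir-lang.org"}], ["Elixir"]},
     {"a", [{"href", "#top"}], ["Top"]}
   ]}
]

updated =
  Floki.find_and_update(tree, "a", fn
    # Remove in-page anchors entirely.
    {"a", [{"href", "#" <> _}]} -> :delete
    # Upgrade plain HTTP links; children are left untouched.
    {"a", [{"href", href}]} -> {"a", [{"href", String.replace(href, "http://", "https://")}]}
    other -> other
  end)

IO.inspect(updated)
```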
Changed
- Deprecate `Floki.map/2` because we now have `Floki.find_and_update/3` and `Floki.traverse_and_update/2`, which are powerful APIs. `Floki.map/2` can be replaced by `Enum.map/2` as well - thanks @josevalim for the idea!
- Update optional dependency `fast_html` to `v2.0.4`
Fixed
- Fix a bug when parsing an HTML document with XML inside using Mochiweb's parser
Improvements
- Add more typespecs
0.28.0 - 2020-08-26
Added
- Add support for `:checked` pseudo-class selector - thanks @wojtekmach
Changed
- Drop support for Elixir 1.6
- Update version of `fast_html` to 2.0 in docs and CI - thanks @rinpatch
Fixed
- Fix docs by mentioning HTML nodes supported for `traverse_and_update` - thanks @hubertlepicki
0.27.0 - 2020-07-07
Added
- `Floki.filter_out/2` now can filter text nodes - thanks @ckruse
- Support more encoding entities in `Floki.raw_html/1` - thanks @ntenczar
Fixed
- Fix `Floki.attribute/2` when there are only text nodes in the document - thanks @ckruse
Improvements
- Performance improvements of `Floki.raw_html/1` function - thanks @josevalim
- Improvements in the docs and specs of `Floki.traverse_and_update/2` and `Floki.children/1` - thanks @josevalim
- Improvements in the spec of `Floki.traverse_and_update/2` - thanks @Dalgona
- Improve the CI setup to run the formatter correctly - thanks @Cleidiano
0.26.0 - 2020-02-17
Added
- Add support for the pseudo-class selectors `:nth-last-child` and `:nth-last-of-type`
Fixed
- Fix the typespecs of `Floki.traverse_and_update/3` - thanks @RichMorin
Changed
- Update optional dependency `fast_html` to `v1.0.3`
0.25.0 - 2020-01-26
Added
- Add `Floki.parse_fragment!/1` and `Floki.parse_document!/1` that have the same functionality as the functions without the bang, but return the document or fragment without the either tuple and raise an exception in case of errors - thanks @schneiderderek
- Add `Floki.traverse_and_update/3`, which accepts an accumulator that is useful for keeping state while traversing the HTML tree - thanks @Dalgona
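Both additions can be sketched together: the bang parser returns the tree directly, and the three-argument traversal threads an accumulator through the callback. The sample HTML, the renaming to `section`, and the `Mix.install/1` version constraint are assumptions for illustration.

```elixir
Mix.install([{:floki, "~> 0.36"}])

# The bang variant returns the tree itself rather than {:ok, tree}.
tree = Floki.parse_fragment!("<div><span>hello</span></div><div>world</div>")

# The callback receives each node plus the accumulator and
# returns {new_node, new_accumulator}.
{updated, div_count} =
  Floki.traverse_and_update(tree, 0, fn
    {"div", attrs, children}, acc -> {{"section", attrs, children}, acc + 1}
    other, acc -> {other, acc}
  end)

IO.inspect(updated)
IO.inspect(div_count)
```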
Changed
- Update the `html_entities` dependency from `v0.5.0` to `v0.5.1`
0.24.0 - 2020-01-01
Added
- Add support for `fast_html`, which is a "C Node" wrapping Lexborisov's myhtml - thanks @rinpatch
- Add setup to run our test suite against all parsers on CI - thanks @rinpatch
- Add `Floki.parse_document/1` and `Floki.parse_fragment/1` in order to correctly parse documents and fragments of documents - it also prevents the confusion and inconsistency of `parse/1`.
- Configure `dialyxir` in order to run Dialyzer easily.
Changed
- Deprecate `Floki.parse/1` and all the functions that use it underneath. This means that all the functions that accepted HTML as binary are deprecated as well. This includes `find/2`, `attr/4`, `filter_out/2`, `text/2` and `attribute/2`. The recommendation is to use those functions with an already parsed document or fragment.
- Remove support for Elixir 1.5.
0.23.1 - 2019-12-01
Fixed
- Fix the Mochiweb parser when there is an invalid charref.
0.23.0 - 2019-09-11
Changed
- Remove Mochiweb as a hex dependency. It brings the code from the original project to Floki's codebase - thanks @josevalim
0.22.0 - 2019-08-21
Added
- Add `Floki.traverse_and_update/2` that works in a similar way to `Floki.map/2` but traverses the tree and updates the children elements. The difference from "map" is that this function can create a tree with more or fewer nodes - thanks @ericlathrop
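A small sketch of the traversal described above, assuming a hand-built tree: the callback is applied to element nodes, and returning `nil` drops a node from the result. The tree contents and the `Mix.install/1` version constraint are illustrative assumptions.

```elixir
Mix.install([{:floki, "~> 0.36"}])

tree = [{"div", [], [{"b", [], ["bold"]}, {"script", [], ["alert(1)"]}, "text"]}]

updated =
  Floki.traverse_and_update(tree, fn
    # Rename <b> to <strong>, keeping attributes and children.
    {"b", attrs, children} -> {"strong", attrs, children}
    # Returning nil removes the node from the tree.
    {"script", _attrs, _children} -> nil
    # Everything else is returned unchanged.
    other -> other
  end)

IO.inspect(updated)
```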
Changed
- Remove support for Elixir 1.4.
0.21.0 - 2019-04-17
Added
- Add a possibility to filter `style` tags on `Floki.text/2` - thanks @Vict0rynox
Fixed
- Fix `Floki.text/2` to consider the previous filter of `js` when filtering `style` - thanks @Vict0rynox
- Fix typespecs for `Floki.filter_out/2` - thanks @myfreeweb
Changed
- Drop support for Elixir 1.3 and below - thanks @herbstrith
0.20.4 - 2018-09-24
Fixed
- Fix `Floki.raw_html` to accept lists as attribute values - thanks @katehedgpeth
0.20.3 - 2018-06-22
Fixed
- Fix style and script tags with comments - thanks @francois2metz
0.20.2 - 2018-05-09
Fixed
- Fix `Floki.raw_html/1` to correctly handle quotes and double quotes on attributes - thanks @grych
0.20.1 - 2018-04-05
Fixed
- Remove `Enumerable.slice/1` compile warning for `Floki.HTMLTree` - thanks @thecodeboss
- Fix `Floki.find/2` that was failing on HTML that consists entirely of a comment - thanks @ShaneWilton
0.20.0 - 2018-02-06
Added
- Make `Floki.raw_html/2` configurable to allow optional encoding of HTML entities - thanks @davydog187
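A minimal sketch of the optional encoding, using the `encode: false` option named elsewhere in this changelog. The sample tree and the `Mix.install/1` version constraint are assumptions for illustration, and the output shown by `IO.inspect/1` reflects recent releases.

```elixir
Mix.install([{:floki, "~> 0.36"}])

tree = {"p", [], ["Tom & Jerry"]}

# Default: special characters in text are encoded as HTML entities.
encoded = Floki.raw_html(tree)

# With encode: false the text is emitted as-is.
raw = Floki.raw_html(tree, encode: false)

IO.inspect({encoded, raw})
```

Disabling encoding is only safe when the text in the tree is already trusted or already escaped.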
Fixed
- Fix serialization of the tree after updating attribute - thanks @francois2metz
0.19.3 - 2018-01-25
Fixed
- Skip HTML entities encode for `Floki.raw_html/1` for `script` or `style` tags
- Add `:html_entities` app to the list of OTP applications. It fixes production releases.
0.19.2 - 2017-12-22
Fixed
- (BREAKING CHANGE) Re-encode HTML entities on `Floki.raw_html/1`.
0.19.1 - 2017-12-04
Fixed
- Fixed doctype serialization for `Floki.raw_html/1` - thanks @jhchen
0.19.0 - 2017-11-11
Added
- Added support for `nth-of-type`, `first-of-type`, `last-of-type` and `last-child` pseudo-classes - thanks @saleem1337.
- Added support for `nth-child` pseudo-class functional notation - thanks @nirev.
- Added functional notation support for `nth-of-type` pseudo-class.
- Added a Contributing guide.
Fixed
- Format all files according to the Elixir 1.6 formatter - thanks @fcevado.
- Fix `Floki.raw_html` to support raw text - thanks @craig-day.
0.18.1 - 2017-10-13
Added
- Added a Code of Conduct.
Fixed
- Fix XML tag when building HTML tree.
- Return empty list when `Floki.filter_out/2` result is empty.
0.18.0 - 2017-08-05
Added
- Added `Floki.attr/4` that receives a function enabling manipulation of attribute values - thanks @erikdsi.
- Implement the String.Chars protocol for Floki.Selector.
- Implement the Enumerable protocol for Floki.HTMLTree.
Changed
- Changed `Floki.transform/2` to `Floki.map/2` and `Floki.Finder.apply_transform/2` to `Floki.Finder.map/2` - thanks @aphillipo.
Fixed
- Fix `Floki.raw_html/1` to consider XML prefixes - thanks @sergey-kintsel.
- Fix `raw_html` for self closing tags with content - thanks @navinpeiris.
Removed
- Removed support for Elixir 1.2.
0.17.2 - 2017-05-25
Fixed
- Fix attribute selectors in :not() - thanks @jjcarstens and @Eiji7
- Fix selector parser to consider combinators across selectors separated by commas. For further details, please check the pull request - thanks @jjcarstens and @mischov
0.17.1 - 2017-05-22
Fixed
- Fix search when body has unencoded angles (`<` and `>`) - thanks @sergey-kintsel
- Fix crash caused by XML declaration inside body - thanks @erikdsi
- Fix issue when finding fails if HTML begins with XML tag - thanks @sergey-kintsel
0.17.0 - 2017-04-12
Added
- Add support for multiple pseudo-selectors, like :not() and :nth-child() - thanks @jjcarstens
- Add support for multiple selectors inside the :not() pseudo-class selector - thanks @jjcarstens
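As a rough sketch of chaining pseudo-class selectors, the example below filters list items with `:not()` alone and then combined with `:nth-child()`. The sample markup, the `skip` class name, and the `Mix.install/1` version constraint are assumptions for illustration.

```elixir
Mix.install([{:floki, "~> 0.36"}])

doc =
  Floki.parse_fragment!(
    ~s(<ul><li class="skip">a</li><li>b</li><li class="skip">c</li><li>d</li></ul>)
  )

# Every <li> that does not carry the "skip" class.
not_skipped = Floki.find(doc, "li:not(.skip)")

# Two pseudo-classes on one selector: not skipped AND the second child.
second_only = Floki.find(doc, "li:not(.skip):nth-child(2)")

IO.inspect(not_skipped)
IO.inspect(second_only)
```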
0.16.0 - 2017-04-05
Added
- Add support for selectors that only include a pseudo-class selector - thanks @buhman
- Add support for a new selector: `fl-contains`, which returns elements that contain a given text - thanks @buhman
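A minimal sketch of the text-matching selector, assuming two sample paragraphs (the markup and the `Mix.install/1` version constraint are illustrative only):

```elixir
Mix.install([{:floki, "~> 0.36"}])

doc = Floki.parse_fragment!("<p>Hello world</p><p>Goodbye</p>")

# Select only the paragraphs whose text contains "Hello".
matches = Floki.find(doc, "p:fl-contains('Hello')")

IO.inspect(matches)
```

Note that `fl-contains` is a Floki-specific extension rather than a standard CSS pseudo-class, so selectors using it are not portable to other tools.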
Fixed
- Fix `:not()` pseudo-class selector to accept simple pseudo-class selectors as well - thanks @mischov
0.15.0 - 2017-03-14
Added
- Added support for the `:not()` pseudo-class selector.
Fixed
- Fixed pseudo-class selectors that are used in conjunction with combinators - thanks @Eiji7
- Fixed order of elements after search using descendant combinator - thanks @Eiji7
0.14.0 - 2017-02-07
Added
- Added support for configuring `html5ever` as the HTML parser. Issue #83 - thanks @hansihe and @aphillipo!
0.13.2 - 2017-02-07
Fixed
- Fixed bug that was causing Floki.text/1 and Floki.filter_out/2 to ignore "trees" with only text nodes. Issue #91 - thanks @boydm.
0.13.1 - 2017-01-22
Fixed
- Fix ordering of duplicated descendant matches - thanks @mmmries
- Fix ordering of `Floki.text/1` when there are only root nodes - thanks @mmmries
0.13.0 - 2017-01-22
Added
- Floki.filter_out/2 is now able to understand complex selectors to filter out from the tree.
0.12.1 - 2017-01-20
Fixed
- Fix search for elements using descendant combinator - issue #84 - thanks @mmmries
0.12.0 - 2016-12-28
Added
- Add basic support for nth-child pseudo-class selector. Closes issue #64.
Changed
- Remove support for Elixir 1.1 and below.
- Remove public documentation for internal code.
0.11.0 - 2016-10-12
Added
- First attempt to transform nodes with `Floki.transform/2`. It is not able to update the tree yet, but works well with results from `Floki.find/2` - thanks @bobjflong
Changed
- Using Logger to notify unknown tokens in selector parser - thanks @teamon and @geonnave
- Replace `mochiweb_html` with `mochiweb` package. This is needed to fix conflict with other packages that are using `mochiweb` - thanks @aphillipo
0.10.1 - 2016-08-28
Fixed
- Fix sibling search after immediate children - thanks @gmile.
0.10.0 - 2016-08-05
Changed
- Change the search for namespaced elements using the correct CSS3 syntax.
Fixed
- Fix the search for child elements when the match is more than two elements deep - thanks @gmile
0.9.0 - 2016-06-16
Added
- A separator between text when getting text from nodes - thanks @rochdi.
0.8.1 - 2016-05-20
Added
- Support rendering boolean attributes on `Floki.raw_html/1` - thanks @iamvery.
Changed
- Update Mochiweb HTML parser dependency to version 2.15.0.
0.8.0 - 2016-03-06
Added
- Add possibility to search tags with namespaces.
- Accept `Floki.Selector` as parameter of `Floki.find/2` instead of only strings - thanks @hansihe.
Changed
- Using a smaller package with only the Mochiweb HTML parser.
0.7.2 - 2016-02-23
Fixed
- Replace `<br>` nodes by newline (`\n`) in `DeepText` - thanks @maxneuvians.
- Allow `FilterOut` to filter special nodes, like `comment`.
0.7.1 - 2015-11-14
Fixed
- Ignore PHP scripts when finding nodes.
0.7.0 - 2015-11-03
Added
- Add support for excluding script nodes in `Floki.text`. By default, it will exclude those nodes, but including them can be enabled with the flag `js: true` - thanks @vikeri!
Fixed
- Fix find for sibling nodes when the preceding selector matches an element at the end of the sibling list - fix issue #39
0.6.1 - 2015-10-11
Fixed
- Fix `Floki.raw_html/1` to build HTML comments properly.
0.6.0 - 2015-10-07
Added
- Add `Floki.raw_html/2`.
0.5.0 - 2015-09-27
Added
- Add the child combinator to `Floki.find/2`.
- Add the adjacent sibling combinator to `Floki.find/2`.
- Add the general sibling combinator to `Floki.find/2`.
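The three combinators can be sketched with one small fragment. The markup and the `Mix.install/1` version constraint are assumptions for illustration, and the fragment is parsed first because passing raw HTML strings to `Floki.find/2` was deprecated in later releases.

```elixir
Mix.install([{:floki, "~> 0.36"}])

doc = Floki.parse_fragment!(~s(<div><p class="a">one</p><span>two</span><p>three</p></div>))

# Child combinator: <span> elements that are direct children of a <div>.
child = Floki.find(doc, "div > span")

# Adjacent sibling combinator: the <span> immediately after p.a.
adjacent = Floki.find(doc, "p.a + span")

# General sibling combinator: any later <p> sibling of p.a.
general = Floki.find(doc, "p.a ~ p")

IO.inspect({child, adjacent, general})
```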
0.4.1 - 2015-09-18
Fixed
- Ignore files that are not lexer files (".xrl") under the `src/` directory in the Hex package. This fixes a crash when compiling using OTP 17.5 on Mac OS X. Huge thanks to @henrik and @licyeus, who pointed out the issue!
0.4.0 - 2015-09-17
Added
- A robust representation of selectors in order to enable queries using a mix of selector types, such as classes with attributes, attributes with types, classes with classes and so on. Here is a list with examples of what is possible now:
- Include `mochiweb` in the applications list at mix.exs - thanks @EricDykstra
Changed
- `Floki.find/2` will now return a list instead of a tuple when searching only by IDs. From now on, Floki should always return the results inside a list, even if it's an ID match.
Removed
- `Floki.find/2` does not accept tuples as selectors anymore. This is because with the robust selectors representation, it won't be necessary to query directly using tuples or other data structures rather than strings.
0.3.3 - 2015-08-23
Fixed
- Fix `Floki.find/2` when there is a non-HTML input. It closes the issue #17
0.3.2 - 2015-06-27
Fixed
- Fix `Floki.DeepText` when there is a comment inside nodes.
0.3.1 - 2015-06-21
Fixed
- Fix `Floki.find/2` to consider XML trees.
0.3.0 - 2015-06-07
Added
- Add attribute equals selector. This feature enables the user to search using HTML attributes other than "class" or "id". E.g.: `Floki.find(html, "[data-model=user]")` - @nelsonr
0.2.1 - 2015-06-04
Fixed
- Fix `parse/1` when parsing a part of HTML without a root node - @antonmi
0.2.0 - 2015-05-03
Added
- Support HTML string when searching for attributes with `Floki.attribute/2`.
- Option for `Floki.text/2` to disable deep search and use flat search instead.
Changed
- Change `Floki.text/1` to perform a deep search of text nodes.
- Consider doctests in the test suite.
0.1.1 - 2015-03-25
Added
- Add CHANGELOG.md following the Keep a changelog.
Changed
- Using MochiWeb as a hex dependency instead of embedded code. It closes the issue #5
0.1.0 - 2015-02-15
Added
- Descendant selectors, like ".class tag" to Floki.find/2.
- Multiple selection, like ".class1, .class2" to Floki.find/2.
0.0.5 - 2014-12-21
Added
- `Floki.text/1`, which returns all text in the same level of the parent element inside HTML.
Changed
- Elixir version requirement from "~> 1.0.0" to ">= 1.0.0".