Cleans the HTML tree to prepare it for candidate selection. It transforms misused tags and removes unlikely candidates.
Remove unlikely candidates from the HTML tree
Transform misused divs into p tags
html_tree :: tuple | list
remove_unlikely_tree(html_tree) :: html_tree
transform_misused_div_to_p(html_tree) :: html_tree
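
The two specs above only fix the input and output types, so the following is a minimal sketch of what such a cleaning pass might look like, not the module's actual implementation. It assumes each element node is a `{tag, attributes, children}` tuple (the shape produced by Elixir HTML parsers such as Floki) and that text nodes are plain strings. The module name `CleanerSketch`, the word list `@unlikely_words`, and the tag list `@block_tags` are hypothetical and chosen only for illustration.

```elixir
defmodule CleanerSketch do
  # Hypothetical heuristics; the real criteria are not stated in the docs above.
  @unlikely_words ~w(comment sidebar footer sponsor popup)
  @block_tags ~w(div p ul ol table pre blockquote img dl)

  # Drop every element in a node list whose class or id contains an
  # unlikely word, then recurse into the children of the survivors.
  def remove_unlikely_tree(nodes) when is_list(nodes) do
    nodes
    |> Enum.reject(&unlikely?/1)
    |> Enum.map(&remove_unlikely_tree/1)
  end

  def remove_unlikely_tree({tag, attrs, children}) do
    {tag, attrs, remove_unlikely_tree(children)}
  end

  # Text nodes and any other node shapes pass through unchanged.
  def remove_unlikely_tree(other), do: other

  # Rename a div to p when it holds no block-level child elements.
  def transform_misused_div_to_p(nodes) when is_list(nodes) do
    Enum.map(nodes, &transform_misused_div_to_p/1)
  end

  def transform_misused_div_to_p({tag, attrs, children}) do
    # Children are rewritten first, so the check sees the updated tags.
    children = transform_misused_div_to_p(children)

    if tag == "div" and not Enum.any?(children, &block?/1) do
      {"p", attrs, children}
    else
      {tag, attrs, children}
    end
  end

  def transform_misused_div_to_p(other), do: other

  defp unlikely?({_tag, attrs, _children}) do
    Enum.any?(attrs, fn {name, value} ->
      name in ["class", "id"] and
        String.contains?(String.downcase(value), @unlikely_words)
    end)
  end

  defp unlikely?(_other), do: false

  defp block?({tag, _attrs, _children}), do: tag in @block_tags
  defp block?(_other), do: false
end

tree = [
  {"div", [{"class", "sidebar"}], ["Related links"]},
  {"div", [{"class", "post"}], ["Body text ", {"b", [], ["bold"]}]}
]

result =
  tree
  |> CleanerSketch.remove_unlikely_tree()
  |> CleanerSketch.transform_misused_div_to_p()

# result == [{"p", [{"class", "post"}], ["Body text ", {"b", [], ["bold"]}]}]
```

Both functions accept either a single node tuple or a list of nodes, mirroring `html_tree :: tuple | list`. In this sketch a single root tuple passed to `remove_unlikely_tree` is never dropped itself; only its descendants are filtered.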