Crawler v0.2.0 Crawler.Parser
Parses pages and calls a link handler to handle the detected links.
Functions
mark_processed(arg1)
parse(page, link_handler \\ &(Dispatcher.dispatch(&1, &2)))
Examples
iex> Parser.parse(%{page: %Page{body: "Body"}, opts: []})
%Page{body: "Body"}
iex> Parser.parse(%{page: %Page{
...>   body: "<a href='http://parser/1'>Link</a>"
...> }, opts: []})
%Page{body: "<a href='http://parser/1'>Link</a>"}
iex> Parser.parse(%{page: %Page{
...>   body: "<a name='hello'>Link</a>"
...> }, opts: []})
%Page{body: "<a name='hello'>Link</a>"}
iex> Parser.parse(%{page: %Page{
...>   body: "<a href='http://parser/2' target='_blank'>Link</a>"
...> }, opts: []})
%Page{body: "<a href='http://parser/2' target='_blank'>Link</a>"}
iex> Parser.parse(%{page: %Page{
...>   body: "<a href='parser/2'>Link</a>"
...> }, opts: [referrer_url: "http://hello/"]})
%Page{body: "<a href='parser/2'>Link</a>"}
iex> Parser.parse(%{page: %Page{
...>   body: "<a href='../parser/2'>Link</a>"
...> }, opts: [referrer_url: "http://hello/"]})
%Page{body: "<a href='../parser/2'>Link</a>"}
parse_links(body, opts, link_handler)
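No description is given for parse_links/3, so the following is only a rough sketch of what a function with this signature might do, based on the examples for parse/2: find anchor tags that carry an href, resolve relative links against the :referrer_url option, and pass each link to the handler. ParserSketch, the regular expression, and the assumption that the handler receives (link, opts) are all hypothetical and not taken from Crawler's source.

```elixir
defmodule ParserSketch do
  # Hypothetical stand-in for illustration only; this is not Crawler.Parser.

  # Scans `body` for <a ... href="..."> tags, expands each link against
  # opts[:referrer_url] when present, and calls `link_handler` once per link.
  # Returns `body` unchanged, mirroring how parse/2 returns the page untouched.
  def parse_links(body, opts, link_handler) do
    referrer = Keyword.get(opts, :referrer_url)

    # Assumed pattern: an anchor tag whose href value is quoted.
    ~r/<a\s[^>]*href=['"]([^'"]+)['"]/i
    |> Regex.scan(body, capture: :all_but_first)
    |> List.flatten()
    |> Enum.map(&expand(&1, referrer))
    # Assumption: the handler takes the link and the options.
    |> Enum.each(&link_handler.(&1, opts))

    body
  end

  # Without a referrer the link is passed through as found.
  defp expand(link, nil), do: link

  # URI.merge/2 resolves a relative reference against an absolute base URL.
  defp expand(link, referrer) do
    referrer |> URI.merge(link) |> URI.to_string()
  end
end
```

With this sketch, an anchor such as <a name='hello'> yields no handler call because it has no href, while href='parser/2' with referrer_url: "http://hello/" is handed to the handler as "http://hello/parser/2".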