Summary

Types

map_key()

post_processing()

result()

selector_tuple()

Functions

crawl(link, clients \\ [])

Requests the link and returns the raw content.

parse(mapping, raw_content)

Parses the raw content with the given mapping and returns the extracted fields.

Types

map_key()

@type map_key() :: String.t() | atom()

post_processing()

@type post_processing() :: atom() | {module(), atom()} | (any() -> String.t())

result()

@type result() :: :ok | :error

selector_tuple()

@type selector_tuple() :: {String.t(), post_processing()}

Functions

crawl(link, clients \\ [])

@spec crawl(String.t(), [ExCrawlzy.BrowserClients.client()]) :: {result(), String.t()}

Requests the link and returns the raw content.

Examples

iex> ExCrawlzy.crawl("http://some.site")
{:ok, "<!doctype html><html>  <head>    <title>the title</title>  </head>  <body>    <div id=\"the_body\">      the body      <div id=\"inner_field\">        inner field      </div>      <div id=\"inner_second_field\">        inner second field        <div id=\"the_number\">          2023        </div>      </div>      <div id=\"exist\">        this field exist      </div>      <a class=\"link_class\" href=\"http://some_external.link\"></a>      <img class=\"img_class\" src=\"http://some_external.link/image_path.jpg\" alt=\"some image\">    </div>  </body></html>"}
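Because the spec above gives the return value as a `{result(), String.t()}` tuple, callers can pattern-match on its first element. The following is a minimal sketch, assuming ExCrawlzy is already a dependency of the project; the module `CrawlDemo`, the function `describe/1`, and the message wording are hypothetical and not part of the library.

```elixir
defmodule CrawlDemo do
  # Hypothetical helper (not part of ExCrawlzy): summarizes the tuple
  # that ExCrawlzy.crawl/2 is specified to return.
  def describe({:ok, content}) when is_binary(content) do
    "fetched #{byte_size(content)} bytes"
  end

  # inspect/1 is used so the message is built whatever shape the
  # second element actually has.
  def describe({:error, reason}) do
    "crawl failed: #{inspect(reason)}"
  end
end

# Usage (performs a real HTTP request, so the result depends on the site):
# "http://some.site" |> ExCrawlzy.crawl() |> CrawlDemo.describe()
```

Keeping the handling in a separate function that takes the tuple means it can be exercised without any network access.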

parse(mapping, raw_content)

@spec parse(
  %{required(map_key()) => selector_tuple()},
  String.t() | Floki.html_tree() | Floki.html_node()
) :: {result(), %{required(map_key()) => String.t()}}

Parses the raw content with the given mapping and returns the extracted fields.

Examples

iex> raw_content = "<html><head><title>the title</title></head><body><div id=\"the_body\">the body</div></body></html>"
iex> ExCrawlzy.parse(%{body: {"#the_body", :text}}, raw_content)
{:ok, %{body: "the body"}}
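The two functions can be chained: `crawl/2` fetches the page and `parse/2` extracts fields from it. The sketch below assumes ExCrawlzy is already a dependency of the project; the module `PageFields`, its functions, and the `#inner_field` selector are hypothetical choices for illustration, and only the `:text` post-processing shown in the example above is used.

```elixir
defmodule PageFields do
  # Hypothetical mapping: each key points to a {css_selector, post_processing}
  # tuple, matching the selector_tuple() type.
  @mapping %{
    body: {"#the_body", :text},
    inner: {"#inner_field", :text}
  }

  # Fetches `link`, then extracts the mapped fields.
  # If either step does not return an {:ok, _} tuple, `with` returns
  # that value unchanged to the caller.
  def fetch(link) do
    with {:ok, raw_content} <- ExCrawlzy.crawl(link),
         {:ok, fields} <- ExCrawlzy.parse(@mapping, raw_content) do
      {:ok, fields}
    end
  end

  # Parses content that was already fetched, so no request is made.
  def extract(raw_content), do: ExCrawlzy.parse(@mapping, raw_content)
end

# Usage (performs a real HTTP request):
# PageFields.fetch("http://some.site")
```

Splitting `extract/1` from `fetch/1` lets the mapping be checked against a saved HTML string before it is pointed at a live site.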

ExCrawlzy (ExCrawlzy v0.1.1)
