treewalker

A web crawler in Erlang that respects robots.txt.

Installation

This library is available on hex.pm.

Keep in mind that the library is not yet stable and its API may change.
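As a minimal sketch, a rebar3 project could pull the library from hex.pm by listing it in `rebar.config`. This assumes the hex package is published under the name `treewalker`. No version is shown because none is given here; since the API is not yet stable, pinning a specific release is probably wise.

```erlang
%% rebar.config
%% Assumption: the hex.pm package name is `treewalker`.
%% Replace the bare atom with {treewalker, "<version>"} to pin a release.
{deps, [
    treewalker
]}.
```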

Usage

%% This will add the specified crawler to the supervision tree
{ok, _} = treewalker:add_crawler(example, #{scraper => example_scraper,
                                            fetcher => example_fetcher,
                                            max_depth => 3,
                                            link_filter => example_link_filter,
                                            store => example_store}),
%% Starts crawling
ok = treewalker:start_crawler(example),
%% ...
%% Stops the crawler
%% Pending requests will be completed and then dropped
ok = treewalker:stop_crawler(example),

Options

The following settings are available via the sys.config configuration:

{treewalker, [
              %% The minimum delay to wait before retrying a failed request
              {min_retry_delay, pos_integer()},
              %% The maximum delay to wait before retrying a failed request
              {max_retry_delay, pos_integer()},
              %% The maximum number of retries of a failed request
              {max_retries, pos_integer()},
              %% The maximum amount of delay before starting a request (in seconds)
              {max_worker_delay, pos_integer()},
              %% The maximum number of concurrent workers making HTTP requests
              {max_concurrent_worker, pos_integer()},
              %% The user agent making the HTTP requests
              {user_agent, binary()}]},
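The listing above gives only the type of each setting. For illustration, a complete `sys.config` with concrete values might look like the sketch below. Every value is a made-up example, not a recommended or default setting, and the unit of the retry delays is not stated above, so check it before relying on these numbers.

```erlang
%% sys.config
%% All values below are illustrative assumptions, not library defaults.
[
 {treewalker, [
               %% Unit of the retry delays is not specified in this document
               {min_retry_delay, 1},
               {max_retry_delay, 10},
               {max_retries, 3},
               %% In seconds, per the option description
               {max_worker_delay, 5},
               {max_concurrent_worker, 10},
               %% Hypothetical user agent string
               {user_agent, <<"example-crawler/0.1">>}
              ]}
].
```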

Development

Running all the tests and linters

You can run all the tests and linters with the rebar3 alias:

rebar3 check
