Crawler v0.2.0 API Reference

Modules

A high-performance web crawler in Elixir
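
A minimal usage sketch, assuming the Crawler.crawl/2 entry point and the max_depths and save_to option names (these option names are assumptions; see Crawler.Options for what this version actually supports):

    # Crawl a site, following links up to 2 levels deep and saving pages to disk.
    # Option names here are assumptions; consult Crawler.Options.
    Crawler.crawl("http://elixir-lang.org", max_depths: 2, save_to: "/tmp/crawls")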

Dispatches requests to a queue for crawling

A worker that performs the crawling

Fetches pages and performs tasks on them

Checks a series of conditions to determine whether it is okay to continue, i.e. to allow Crawler.Fetcher.fetch/1 to begin its tasks
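For illustration only, a pre-fetch policy check of this kind can be sketched in plain Elixir; the names and conditions below are hypothetical, not this module's API:

    # Hypothetical policy check: only allow the fetch when the crawl depth
    # and content type are acceptable.
    defmodule PolicySketch do
      @max_depth 3
      @accepted_types ["text/html"]

      def ok_to_fetch?(opts) do
        opts[:depth] <= @max_depth and opts[:content_type] in @accepted_types
      end
    end

    PolicySketch.ok_to_fetch?(depth: 2, content_type: "text/html")
    # => true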

Records information about each crawl for internal use

Builds a path for a link (either a full URL or a relative link) based on the input string, a URL with or without its protocol

Expands the path by resolving any . and .. segments
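
Resolving . and .. segments can be illustrated in plain Elixir as below; this is a conceptual sketch, not the module's implementation:

    # Conceptual sketch: "a/b/../c/./d" expands to "a/c/d".
    defmodule DotSegmentsSketch do
      def expand(path) do
        path
        |> String.split("/")
        |> Enum.reduce([], fn
          ".", acc  -> acc
          "..", acc -> drop_last(acc)
          seg, acc  -> [seg | acc]
        end)
        |> Enum.reverse()
        |> Enum.join("/")
      end

      defp drop_last([]), do: []
      defp drop_last([_ | rest]), do: rest
    end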

Finds different components of a given URL, e.g. its domain name, directory path, or full path

Transforms a link to be storable and linkable offline

Returns prefixes ("../") according to the given URL's structure
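
As a hypothetical illustration of the idea (not the module's API), the number of "../" prefixes corresponds to how deep the page sits in the URL's directory structure:

    # A page at /dir1/dir2/page.html is two directories below the root,
    # so links back to the root need the prefix "../../".
    depth =
      URI.parse("http://example.com/dir1/dir2/page.html").path
      |> Path.dirname()
      |> Path.split()
      |> Enum.count(&(&1 != "/"))

    prefix = String.duplicate("../", depth)
    # => "../../"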

Options for the crawler

Parses pages and calls a link handler to handle the detected links

Handles the queueing of crawl requests

Stores crawled pages offline

Replaces links found in a page so they work offline
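
A conceptual sketch of the idea (not the library's implementation): absolute links are rewritten to relative, scheme-less paths so that a saved page can reference other saved pages on disk:

    # Rewrite absolute links so the saved copy can be browsed offline.
    html = ~s(<a href="http://example.com/docs/page.html">Docs</a>)

    offline_html =
      Regex.replace(~r{href="https?://([^"]+)"}, html, fn _match, target ->
        ~s(href="../#{target}")
      end)

    # => <a href="../example.com/docs/page.html">Docs</a>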

An internal data store for information related to each crawl

Starts the crawl tasks

A supervisor for dynamically starting workers