Crawler v0.2.0 API Reference
Modules
A high-performance web crawler in Elixir (a usage sketch appears at the end of this reference)
Dispatches requests to a queue for crawling
A worker that performs the crawling
Fetches pages and performs tasks on them
Checks a series of conditions to determine whether it is okay to continue, i.e. to allow Crawler.Fetcher.fetch/1 to begin its tasks
Records information about each crawl for internal use
Builds a path for a link (can be a URL itself or a relative link) based on the input string, which is a URL with or without its protocol
Expands the path by resolving any . and .. segments (see the expansion sketch at the end of this reference)
Finds different components of a given URL, e.g. its domain name, directory path, or full path
Transforms a link to be storable and linkable offline
Returns prefixes ("../") according to the given URL's structure (see the prefix sketch at the end of this reference)
Options for the crawler
Parses pages and calls a link handler to handle the detected links
Handles the queueing of crawl requests
Stores crawled pages offline
Replaces links found in a page so they work offline
An internal data store for information related to each crawl
Starts the crawl tasks
A supervisor for dynamically starting workers
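
To make the reference above concrete, here is a minimal usage sketch. It is not verified against v0.2.0: the Crawler.crawl/2 entry point and the :max_depths and :save_to options are assumptions about this version's API, so check the Crawler and Options modules for the calls and options that actually exist.

    # Minimal usage sketch: crawl/2 and both options are assumptions
    # about this version's API, not confirmed by this reference.
    Crawler.crawl("http://elixir-lang.org",
      max_depths: 2,          # assumed option: how many levels of links to follow
      save_to: "/tmp/crawls"  # assumed option: directory for offline snapshots
    )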
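
The next sketch illustrates the kind of . and .. expansion described for the path expander above. The PathExpansionExample module is hypothetical and self-contained; it is not the library's implementation, only a demonstration of the technique.

    defmodule PathExpansionExample do
      # Resolves "." and ".." segments, as the path expander above is
      # described as doing; illustrative only, not the library's code.
      def expand(path) do
        path
        |> String.split("/")
        |> Enum.reduce([], fn
          ".", acc -> acc                  # "." is the current directory, so drop it
          "..", [_ | rest] -> rest         # ".." climbs out of the most recent directory
          "..", [] -> []                   # nothing left to climb out of
          segment, acc -> [segment | acc]  # keep ordinary segments
        end)
        |> Enum.reverse()
        |> Enum.join("/")
      end
    end

    PathExpansionExample.expand("blog/../posts/./2017/index.html")
    # => "posts/2017/index.html"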
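
Finally, this sketch shows how "../" prefixes relate to a URL's directory depth, which is the idea behind the prefixer described above. The PathPrefixExample module and its counting rule are assumptions made for illustration; the library's prefixer may count segments differently, for example to account for the folder a snapshot is stored under.

    defmodule PathPrefixExample do
      # Returns one "../" per directory level between the page and the site
      # root; illustrative only, the library's own rule may differ.
      def prefix(url) do
        path = URI.parse(url).path || "/"

        depth =
          path
          |> Path.dirname()
          |> Path.split()
          |> Enum.count(&(&1 != "/"))

        String.duplicate("../", depth)
      end
    end

    PathPrefixExample.prefix("http://example.com/blog/2017/post.html")
    # => "../../"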