Crawly.RequestsStorage.Worker (Crawly v0.16.0)

Requests Storage is a module responsible for storing requests for a given spider.

Automatically filters out already seen requests (uses a fingerprint-based approach to detect already visited pages).
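The fingerprint idea can be illustrated with a minimal sketch. This is not Crawly's actual implementation: the module name `FingerprintFilterSketch`, the choice of an MD5 digest of the URL as the fingerprint, and the use of plain URL strings instead of `Crawly.Request` structs are all assumptions made for illustration.

```elixir
defmodule FingerprintFilterSketch do
  # Hypothetical helper, not part of Crawly.
  # Assumption: the fingerprint is a hex-encoded MD5 digest of the URL.
  def fingerprint(url) when is_binary(url) do
    url |> :erlang.md5() |> Base.encode16(case: :lower)
  end

  # Returns {unseen_urls, updated_seen_set}, preserving input order.
  # `seen` is a MapSet of fingerprints recorded so far.
  def filter_unseen(urls, seen \\ MapSet.new()) do
    {kept, seen} =
      Enum.reduce(urls, {[], seen}, fn url, {kept, seen} ->
        fp = fingerprint(url)

        if MapSet.member?(seen, fp) do
          # Already visited: drop the URL and keep the set unchanged.
          {kept, seen}
        else
          # First sighting: keep the URL and remember its fingerprint.
          {[url | kept], MapSet.put(seen, fp)}
        end
      end)

    {Enum.reverse(kept), seen}
  end
end
```

Keeping only fingerprints rather than whole requests keeps the "seen" set small, at the cost of treating any two requests with the same fingerprint as duplicates.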

Pipes all requests through a list of middlewares, which pre-process each request before it is stored.
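A rough sketch of what piping a request through a middleware list might look like follows. The module `MiddlewarePipeSketch`, the two example middlewares, and the convention that a middleware returns `{request, state}` to continue or `{false, state}` to drop the request are assumptions for illustration, not a description of Crawly's internals.

```elixir
defmodule MiddlewarePipeSketch do
  # Each middleware here is a function: (request, state) -> {request | false, state}.
  # Returning false drops the request and skips the remaining middlewares.
  def pipe(middlewares, request, state) do
    Enum.reduce_while(middlewares, {request, state}, fn middleware, {req, st} ->
      case middleware.(req, st) do
        {false, new_state} -> {:halt, {false, new_state}}
        {new_req, new_state} -> {:cont, {new_req, new_state}}
      end
    end)
  end
end

# Hypothetical middleware: sets a fixed User-Agent header on the request.
set_user_agent = fn req, state ->
  {Map.put(req, :headers, [{"User-Agent", "sketch-bot"}]), state}
end

# Hypothetical middleware: drops requests that leave the spider's base URL.
same_domain_only = fn req, state ->
  if String.starts_with?(req.url, state.base_url) do
    {req, state}
  else
    {false, state}
  end
end

state = %{base_url: "https://example.com"}
middlewares = [set_user_agent, same_domain_only]

# A request on the same domain passes through and gains the header.
{passed, _state} =
  MiddlewarePipeSketch.pipe(middlewares, %{url: "https://example.com/a"}, state)

# A request to another domain is dropped; the pipe yields false.
{dropped, _state} =
  MiddlewarePipeSketch.pipe(middlewares, %{url: "https://other.test/"}, state)
```

Halting on the first `false` means later middlewares never see a request that an earlier one rejected, so cheap filters are best placed first.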

Returns a specification to start this module under a supervisor.

See Supervisor.

Callback implementation for GenServer.init/1.


pop(pid()) :: Crawly.Request.t() | nil

Pops a request out of the requests storage.


requests(pid()) :: {:requests, [Crawly.Request.t()]}

Returns all scheduled requests (intended for previewing the queue).

start_link(spider_name, crawl_id)

stats(pid()) :: {:stored_requests, non_neg_integer()}

Gets statistics from the requests storage.


Stores an individual request or a list of requests.
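To make the return shapes of the functions above concrete, here is a toy GenServer that mimics them. It is a sketch only: `ToyRequestsStorage` is a hypothetical module, requests are plain maps rather than `Crawly.Request` structs, the first-in-first-out ordering and the `:ok` reply from `store/2` are assumptions, and fingerprint filtering and middlewares are left out entirely.

```elixir
defmodule ToyRequestsStorage do
  use GenServer

  # Client API (hypothetical; shaped after the specs documented above).
  def start_link(requests \\ []), do: GenServer.start_link(__MODULE__, requests)

  # Accepts either a list of requests or a single request.
  def store(pid, requests) when is_list(requests), do: GenServer.call(pid, {:store, requests})
  def store(pid, request), do: store(pid, [request])

  def pop(pid), do: GenServer.call(pid, :pop)
  def stats(pid), do: GenServer.call(pid, :stats)
  def requests(pid), do: GenServer.call(pid, :requests)

  # Server callbacks. The state is a plain list used as a FIFO queue.
  @impl true
  def init(requests), do: {:ok, requests}

  @impl true
  def handle_call({:store, new}, _from, queue), do: {:reply, :ok, queue ++ new}
  # An empty queue yields nil, matching the `| nil` in the pop spec.
  def handle_call(:pop, _from, []), do: {:reply, nil, []}
  def handle_call(:pop, _from, [head | tail]), do: {:reply, head, tail}
  def handle_call(:stats, _from, queue), do: {:reply, {:stored_requests, length(queue)}, queue}
  def handle_call(:requests, _from, queue), do: {:reply, {:requests, queue}, queue}
end

{:ok, pid} = ToyRequestsStorage.start_link()
:ok = ToyRequestsStorage.store(pid, %{url: "https://example.com/1"})
:ok = ToyRequestsStorage.store(pid, [%{url: "https://example.com/2"}, %{url: "https://example.com/3"}])
stats = ToyRequestsStorage.stats(pid)
first = ToyRequestsStorage.pop(pid)
remaining = ToyRequestsStorage.requests(pid)
```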