Crawly.RequestsStorage.Worker (Crawly v0.17.2)
Requests Storage is a module responsible for storing requests for a given spider.
It automatically filters out requests that have already been seen, using request fingerprints to detect already visited pages.
It pipes all requests through a list of middlewares, which pre-process each request before it is stored.
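The two behaviours described above (middleware pre-processing and fingerprint-based filtering) can be sketched in plain Elixir. This is a minimal illustration under stated assumptions, not Crawly's implementation: the module name `StorageSketch`, the use of `:erlang.phash2/1` on the URL as a fingerprint, and the convention that a middleware returns `false` to drop a request are all hypothetical.

```elixir
defmodule StorageSketch do
  # Hypothetical fingerprint: a hash of the request URL.
  # Crawly's real fingerprinting scheme may differ.
  def fingerprint(%{url: url}), do: :erlang.phash2(url)

  # Runs a request through each middleware in order. In this sketch a
  # middleware is a one-argument function that returns either a (possibly
  # modified) request or `false` to drop the request.
  def pipe(request, middlewares) do
    Enum.reduce_while(middlewares, request, fn middleware, acc ->
      case middleware.(acc) do
        false -> {:halt, false}
        updated -> {:cont, updated}
      end
    end)
  end

  # Stores a request only if it survives the middlewares and its fingerprint
  # has not been seen before. State is {seen_fingerprints, stored_requests}.
  def store({seen, stored}, request, middlewares) do
    case pipe(request, middlewares) do
      false ->
        {seen, stored}

      processed ->
        fp = fingerprint(processed)

        if MapSet.member?(seen, fp) do
          {seen, stored}
        else
          {MapSet.put(seen, fp), stored ++ [processed]}
        end
    end
  end
end

# Usage: the duplicate URL and the dropped .pdf URL are not stored.
add_header = fn req -> Map.put(req, :headers, [{"User-Agent", "sketch"}]) end
drop_pdf = fn req -> if String.ends_with?(req.url, ".pdf"), do: false, else: req end
middlewares = [add_header, drop_pdf]

state = {MapSet.new(), []}
state = StorageSketch.store(state, %{url: "https://example.com/a"}, middlewares)
state = StorageSketch.store(state, %{url: "https://example.com/a"}, middlewares)
state = StorageSketch.store(state, %{url: "https://example.com/b.pdf"}, middlewares)
```

Keeping the set of seen fingerprints separate from the list of stored requests means a request stays "seen" even after it has been popped, which is what prevents a page from being scheduled twice.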
Summary
Functions
child_spec/1
Returns a specification to start this module under a supervisor.
init/1
Callback implementation for GenServer.init/1.
pop/1
Pops a request out of the requests storage.
requests/1
Returns all scheduled requests (intended for previewing what is queued).
stats/1
Returns statistics from the requests storage.
store/2
Stores an individual request or multiple requests.
Functions
child_spec/1
Returns a specification to start this module under a supervisor.
See Supervisor.
Callback implementation for GenServer.init/1.
Specs
pop(pid()) :: Crawly.Request.t() | nil
Pops a request out of the requests storage. Per the spec, returns nil when no request is available.
Specs
requests(pid()) :: {:requests, [Crawly.Request.t()]}
Returns all scheduled requests (intended for previewing what is queued).
Specs
stats(pid()) :: {:stored_requests, non_neg_integer()}
Returns statistics from the requests storage, namely the number of stored requests.
Specs
store(Crawly.spider(), Crawly.Request.t() | [Crawly.Request.t()]) :: :ok
Stores an individual request or multiple requests.
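To make the return shapes in the specs above concrete, here is a toy GenServer that mimics them. It is only a sketch: the module `ToyRequestsStorage` is hypothetical, requests are plain maps rather than `Crawly.Request` structs, the first argument is always a pid, the pop order (first in, first out here) is an assumption, and no filtering or middleware processing is performed.

```elixir
defmodule ToyRequestsStorage do
  use GenServer

  # Client API, mirroring the return shapes listed in the specs.
  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, [])

  # Accepts either a single request or a list of requests; returns :ok.
  def store(pid, requests) when is_list(requests),
    do: GenServer.call(pid, {:store, requests})

  def store(pid, request), do: store(pid, [request])

  # Returns the next request, or nil when the storage is empty.
  def pop(pid), do: GenServer.call(pid, :pop)

  # Returns {:stored_requests, count}.
  def stats(pid), do: GenServer.call(pid, :stats)

  # Returns {:requests, list} without removing anything.
  def requests(pid), do: GenServer.call(pid, :requests)

  # Server callbacks. The state is just the list of stored requests.
  @impl true
  def init(_args), do: {:ok, []}

  @impl true
  def handle_call({:store, new}, _from, stored), do: {:reply, :ok, stored ++ new}
  def handle_call(:pop, _from, []), do: {:reply, nil, []}
  def handle_call(:pop, _from, [head | tail]), do: {:reply, head, tail}

  def handle_call(:stats, _from, stored),
    do: {:reply, {:stored_requests, length(stored)}, stored}

  def handle_call(:requests, _from, stored), do: {:reply, {:requests, stored}, stored}
end

# Usage against the toy module (not against Crawly itself).
{:ok, pid} = ToyRequestsStorage.start_link()
empty_pop = ToyRequestsStorage.pop(pid)
:ok = ToyRequestsStorage.store(pid, %{url: "https://example.com/1"})
:ok = ToyRequestsStorage.store(pid, [%{url: "https://example.com/2"}, %{url: "https://example.com/3"}])
stats = ToyRequestsStorage.stats(pid)
first = ToyRequestsStorage.pop(pid)
remaining = ToyRequestsStorage.requests(pid)
```

Because `pop` can return nil, callers should pattern match on the result instead of assuming a request is always available.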