Crawly.RequestsStorage.Worker (Crawly v0.13.0)
Requests storage is a module responsible for storing requests for a given spider.
It automatically filters out already seen requests, using a fingerprint approach to detect already visited pages.
It pipes all requests through a list of middlewares, which pre-process each request before it is stored.
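The two behaviours described above (fingerprint-based filtering and a middleware pipeline) can be sketched in plain Elixir. This is an illustrative sketch only, not Crawly's implementation: the module name `DedupSketch`, the use of an MD5 hash of the URL as the fingerprint, the `{:ok, request}` / `:drop` middleware contract, and the plain maps standing in for `Crawly.Request.t()` are all assumptions made for the example.

```elixir
defmodule DedupSketch do
  # Hypothetical helper, NOT part of Crawly. It illustrates filtering by
  # fingerprint, then running surviving requests through middlewares.

  # Assumption: the fingerprint is an MD5 hash of the request URL.
  def fingerprint(url) do
    url |> :erlang.md5() |> Base.encode16(case: :lower)
  end

  # Assumption: a middleware is a one-argument function that returns
  # {:ok, request} to keep the request or :drop to discard it.
  def run_middlewares(request, middlewares) do
    Enum.reduce_while(middlewares, {:ok, request}, fn middleware, {:ok, req} ->
      case middleware.(req) do
        {:ok, new_req} -> {:cont, {:ok, new_req}}
        :drop -> {:halt, :drop}
      end
    end)
  end

  # Returns {stored_requests, updated_seen_set}. Requests whose fingerprint
  # is already in `seen`, or that a middleware drops, are not stored.
  def store(requests, seen, middlewares) do
    {stored, seen} =
      Enum.reduce(requests, {[], seen}, fn request, {stored, seen} ->
        fp = fingerprint(request.url)

        with false <- MapSet.member?(seen, fp),
             {:ok, processed} <- run_middlewares(request, middlewares) do
          {[processed | stored], MapSet.put(seen, fp)}
        else
          _ -> {stored, seen}
        end
      end)

    {Enum.reverse(stored), seen}
  end
end
```

In this sketch a request dropped by a middleware is not marked as seen, so it could be offered again later; whether a real storage should behave that way is a design choice the source does not specify.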
Summary
Functions
child_spec/1: Returns a specification to start this module under a supervisor.
init/1: Callback implementation for GenServer.init/1.
pop/1: Pop a request out of requests storage.
stats/1: Get statistics from the requests storage.
store/2: Store an individual request or a list of requests.
Functions
Returns a specification to start this module under a supervisor.
See Supervisor.
Callback implementation for GenServer.init/1.
Specs
pop(pid()) :: Crawly.Request.t() | nil
Pop a request out of requests storage
Specs
stats(pid()) :: {:stored_requests, non_neg_integer()}
Get statistics from the requests storage
Specs
store(spider_name, requests) :: :ok when spider_name: atom(), requests: [Crawly.Request.t()]
store(spider_name, request) :: :ok when spider_name: atom(), request: Crawly.Request.t()
Store an individual request or a list of requests.
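To make the return shapes in the specs above concrete, here is a minimal GenServer sketch exposing `store`, `pop`, and `stats` with the same result types (`:ok`, a request or `nil`, and `{:stored_requests, count}`). It is not Crawly's worker: the module name `StorageSketch`, the FIFO ordering, the plain maps used in place of `Crawly.Request.t()`, and addressing the process by pid in all three calls are assumptions made for illustration. The sketch also omits the filtering and middleware steps.

```elixir
defmodule StorageSketch do
  # Hypothetical stand-in for a requests storage worker, NOT Crawly's code.
  use GenServer

  def start_link(spider_name), do: GenServer.start_link(__MODULE__, spider_name)

  # A single request is wrapped in a list and stored the same way.
  def store(pid, request) when is_map(request), do: store(pid, [request])
  def store(pid, requests) when is_list(requests), do: GenServer.call(pid, {:store, requests})

  def pop(pid), do: GenServer.call(pid, :pop)
  def stats(pid), do: GenServer.call(pid, :stats)

  @impl true
  def init(spider_name) do
    {:ok, %{spider_name: spider_name, queue: :queue.new(), count: 0}}
  end

  @impl true
  def handle_call({:store, requests}, _from, state) do
    # Assumption: requests are handed back in the order they were stored.
    queue = Enum.reduce(requests, state.queue, fn r, q -> :queue.in(r, q) end)
    {:reply, :ok, %{state | queue: queue, count: state.count + length(requests)}}
  end

  def handle_call(:pop, _from, state) do
    case :queue.out(state.queue) do
      {{:value, request}, rest} ->
        {:reply, request, %{state | queue: rest, count: state.count - 1}}

      {:empty, _} ->
        # An empty storage replies with nil, matching the pop spec.
        {:reply, nil, state}
    end
  end

  def handle_call(:stats, _from, state) do
    {:reply, {:stored_requests, state.count}, state}
  end
end
```

A caller would typically pattern match on the results, for example `{:stored_requests, count} = StorageSketch.stats(pid)`, and treat a `nil` from `pop` as "nothing left to crawl right now".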