Crawly.RequestsStorage (Crawly v0.12.0)
Request storage: a module responsible for storing the requests (URLs) to be crawled.
               ┌──────────────────┐
               │                  │             ┌------------------┐
               │ RequestsStorage  <─────────────┤ From crawlers1,2 │
               │                  │             └------------------┘
               └─────────┬────────┘
                         │
                         │
                         │
                         │
            ┌────────────▼────────────┐
            │                         │
            │                         │
            │                         │
┌───────────▼──────────┐  ┌───────────▼──────────┐
│RequestsStorageWorker1│  │RequestsStorageWorker2│
│      (Crawler1)      │  │      (Crawler2)      │
└──────────────────────┘  └──────────────────────┘
All requests go through a single RequestsStorage process, which quickly looks up the worker for the given spider; that worker then stores the request.
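The routing described above can be sketched with two plain GenServers. This is an illustration only, not Crawly's implementation: the module names SketchStorage and SketchWorker, the map of spider names to pids, and the list-based (last in, first out) queue are all assumptions made for the example.

```elixir
defmodule SketchWorker do
  # Hypothetical per-spider worker: keeps requests in a plain list (LIFO).
  use GenServer

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, [])

  @impl true
  def init(requests), do: {:ok, requests}

  @impl true
  def handle_call({:store, request}, _from, requests),
    do: {:reply, :ok, [request | requests]}

  def handle_call(:pop, _from, []), do: {:reply, nil, []}
  def handle_call(:pop, _from, [request | rest]), do: {:reply, request, rest}
end

defmodule SketchStorage do
  # Hypothetical router: a single process mapping spider names to worker pids.
  use GenServer

  def start_link(_opts \\ []),
    do: GenServer.start_link(__MODULE__, %{}, name: __MODULE__)

  def start_worker(spider), do: GenServer.call(__MODULE__, {:start_worker, spider})
  def store(spider, request), do: forward(spider, {:store, request})
  def pop(spider), do: forward(spider, :pop)

  defp forward(spider, message),
    do: GenServer.call(__MODULE__, {:forward, spider, message})

  @impl true
  def init(workers), do: {:ok, workers}

  @impl true
  def handle_call({:start_worker, spider}, _from, workers) do
    if Map.has_key?(workers, spider) do
      {:reply, {:error, :already_started}, workers}
    else
      {:ok, pid} = SketchWorker.start_link()
      {:reply, {:ok, pid}, Map.put(workers, spider, pid)}
    end
  end

  def handle_call({:forward, spider, message}, _from, workers) do
    # Find the worker for this spider, then hand the message over to it.
    reply =
      case Map.fetch(workers, spider) do
        {:ok, pid} -> GenServer.call(pid, message)
        :error -> {:error, :storage_worker_not_running}
      end

    {:reply, reply, workers}
  end
end
```

In this sketch the router only does a map lookup before delegating, so each spider's requests live in its own worker process, while callers address a single, well-known process.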
Summary
Functions
child_spec/1: Returns a specification to start this module under a supervisor.
init/1: Callback implementation for GenServer.init/1.
pop/1: Pop a request out of requests storage
start_worker/1: Starts a worker for a given spider
stats/1: Get statistics from the requests storage
store/2: Store request in related child worker
Functions
Returns a specification to start this module under a supervisor.
See Supervisor.
Callback implementation for GenServer.init/1.
Specs
pop(spider_name) :: result when spider_name: atom(), result: nil | Crawly.Request.t() | {:error, :storage_worker_not_running}
Pop a request out of requests storage
Specs
start_worker(spider_name) :: result when spider_name: atom(), result: {:ok, pid()} | {:error, :already_started}
Starts a worker for a given spider
Specs
stats(spider_name) :: result when spider_name: atom(), result: {:stored_requests, non_neg_integer()} | {:error, :storage_worker_not_running}
Get statistics from the requests storage
Specs
store(spider_name, requests) :: result when spider_name: atom(), requests: [Crawly.Request.t()], result: :ok | {:error, :storage_worker_not_running}
store(spider_name, request) :: :ok when spider_name: atom(), request: Crawly.Request.t()
Store a request, or a list of requests, in the related child worker
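A minimal sketch of how the return values listed in the specs above might be handled by calling code. StorageDemo and its two functions are hypothetical helpers, not part of Crawly. The storage argument defaults to Crawly.RequestsStorage, which assumes the Crawly application is running and a worker has already been started for the spider; any module exporting store/2, stats/1 and pop/1 with the same return shapes can be passed instead.

```elixir
defmodule StorageDemo do
  # Hypothetical helper module, written against the return shapes in the specs.

  # Stores a list of requests, then reports how many requests are stored.
  # Returns {:ok, count} or passes {:error, :storage_worker_not_running} through.
  def enqueue_and_report(spider, requests, storage \\ Crawly.RequestsStorage) do
    with :ok <- storage.store(spider, requests),
         {:stored_requests, count} <- storage.stats(spider) do
      {:ok, count}
    end
  end

  # Pops requests until the storage returns nil, collecting them in pop order.
  def drain(spider, storage \\ Crawly.RequestsStorage, acc \\ []) do
    case storage.pop(spider) do
      nil -> {:ok, Enum.reverse(acc)}
      {:error, :storage_worker_not_running} = error -> error
      request -> drain(spider, storage, [request | acc])
    end
  end
end
```

Both helpers return the {:error, :storage_worker_not_running} tuple unchanged, so the caller can decide whether to call start_worker/1 and retry.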
