Crawly.Spider behaviour (Crawly v0.17.2)
A behavior module for implementing a Crawly Spider
A Spider is a module which is responsible for defining:
init/0: function, which must return a keyword list with start_urls/start_requests list
init/1: same as init/0, but also takes a list of options sent from the Engine
base_url/0: function responsible for filtering out requests not related to a given website
override_settings/0: function that is called each time a setting is referenced internally. Allows overriding of Crawly configuration at the spider level.
parse_item/1: function which is responsible for parsing the downloaded request and converting it into items which can be stored and new requests which can be scheduled
custom_settings/0: an optional callback which can be used in order to provide custom spider-specific settings. Should define a list with custom settings and their values. These values will take precedence over the global settings defined in the config.
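As a rough illustration of how these callbacks fit together, here is a minimal sketch of a spider. The module name ExampleSpider and the example.com URLs are placeholders, not part of Crawly, and the sketch assumes :crawly is already listed as a dependency of the Mix project.

```elixir
defmodule ExampleSpider do
  # Hypothetical module name; any module that implements the
  # behaviour's callbacks can act as a spider.
  use Crawly.Spider

  # Requests whose URLs do not belong to this site are meant to be
  # filtered out (placeholder domain).
  @impl Crawly.Spider
  def base_url(), do: "https://example.com"

  # Keyword list with the URLs the crawl starts from.
  @impl Crawly.Spider
  def init() do
    [start_urls: ["https://example.com/"]]
  end

  # Placeholder parser: extracts nothing and schedules nothing.
  @impl Crawly.Spider
  def parse_item(_response) do
    %Crawly.ParsedItem{items: [], requests: []}
  end
end
```

With the Crawly application running, a spider like this would typically be started with Crawly.Engine.start_spider(ExampleSpider).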
Callbacks
Specs
base_url() :: binary()
Specs
override_settings() :: Crawly.Settings.t()
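A spider-level override might look like the sketch below. The module name ThrottledSpider, the domain, and the chosen values are illustrative assumptions; concurrent_requests_per_domain and closespider_itemcount are used here as examples of Crawly setting keys, and :crawly is assumed to be a project dependency.

```elixir
defmodule ThrottledSpider do
  # Hypothetical spider that tightens two settings for itself only.
  use Crawly.Spider

  @impl Crawly.Spider
  def base_url(), do: "https://example.com"

  @impl Crawly.Spider
  def init(), do: [start_urls: ["https://example.com/"]]

  @impl Crawly.Spider
  def parse_item(_response) do
    %Crawly.ParsedItem{items: [], requests: []}
  end

  # Settings returned here are intended to take precedence over the
  # global Crawly configuration for this spider (illustrative values).
  @impl Crawly.Spider
  def override_settings() do
    [
      concurrent_requests_per_domain: 2,
      closespider_itemcount: 100
    ]
  end
end
```

Keeping the overrides in the spider module leaves the global config untouched, so other spiders in the same project are not affected.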
Specs
parse_item(response :: HTTPoison.Response.t()) :: Crawly.ParsedItem.t()
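To make the shape of the return value concrete, the following sketch parses a page into one item and a list of follow-up requests. It assumes Floki has been added to the project as an HTML parser (Crawly does not require a particular one), and the module name, the item fields, and the example.com URLs are all hypothetical.

```elixir
defmodule LinkFollowingSpider do
  use Crawly.Spider

  @impl Crawly.Spider
  def base_url(), do: "https://example.com"

  @impl Crawly.Spider
  def init(), do: [start_urls: ["https://example.com/"]]

  @impl Crawly.Spider
  def parse_item(response) do
    # Assumption: Floki is available as a dependency for HTML parsing.
    {:ok, document} = Floki.parse_document(response.body)

    # One item per page: the page URL and the text of its <title>.
    item = %{
      url: response.request_url,
      title: document |> Floki.find("title") |> Floki.text()
    }

    # Resolve every href against the page URL, then wrap each
    # absolute URL in a Crawly request so it can be scheduled.
    requests =
      document
      |> Floki.find("a")
      |> Floki.attribute("href")
      |> Enum.map(&Crawly.Utils.build_absolute_url(&1, response.request_url))
      |> Enum.map(&Crawly.Utils.request_from_url/1)

    %Crawly.ParsedItem{items: [item], requests: requests}
  end
end
```

Because parse_item/1 only needs a response struct, it can be exercised without a network call by passing a hand-built %HTTPoison.Response{} with a body and request_url.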