Crawly.Middlewares.SameDomainFilter (Crawly v0.17.2)
Filters out requests that point outside the crawled domain.
The domain compared against each request URL is taken from the previous response, so in practice it is the domain of the spider's start_url. The spider's base_url is not evaluated.
Does not accept any options. Tuple-based configuration options will be ignored.
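The comparison described above can be illustrated with a small, self-contained sketch. This is not Crawly's implementation: the module name `DomainCheck` and the function `same_host?/2` are hypothetical, and exact host equality is an assumption made for the example (the middleware's actual matching rule may differ, for instance around subdomains).

```elixir
defmodule DomainCheck do
  # Hypothetical helper for illustration only; not part of Crawly.
  # Returns true when `request_url` has exactly the same host as `reference_url`.
  def same_host?(request_url, reference_url)
      when is_binary(request_url) and is_binary(reference_url) do
    # URI.parse/1 is part of Elixir's standard library; `host` is nil
    # when the string carries no host component.
    ref_host = URI.parse(reference_url).host
    is_binary(ref_host) and URI.parse(request_url).host == ref_host
  end

  # Anything that is not a pair of strings is treated as "outside the domain".
  def same_host?(_request_url, _reference_url), do: false
end

# The reference URL plays the role of the previously fetched page.
IO.inspect(DomainCheck.same_host?("https://example.com/page/2", "https://example.com/"))
IO.inspect(DomainCheck.same_host?("https://other.org/", "https://example.com/"))
```

Under these assumptions the first call keeps the request and the second rejects it.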
Example Declaration
middlewares: [
  Crawly.Middlewares.SameDomainFilter
]
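For context, the declaration above would typically live in the application's Crawly configuration. The fragment below is a sketch assuming a standard `config/config.exs`; the accompanying `Crawly.Middlewares.UniqueRequest` entry is only there to show the filter sitting in a list with other middlewares and is not required.

```elixir
# config/config.exs (assumed location)
import Config

config :crawly,
  middlewares: [
    # Listed as a bare module: this middleware takes no options,
    # and a {module, opts} tuple would have its options ignored.
    Crawly.Middlewares.SameDomainFilter,
    Crawly.Middlewares.UniqueRequest
  ]
```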
Summary
Functions
Callback implementation for Crawly.Pipeline.run/3.
Functions
Callback implementation for Crawly.Pipeline.run/3.
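As a rough illustration of the callback shape, a Crawly.Pipeline.run/3 implementation receives the item (here a request), the current state, and options, and returns either `{item, state}` to pass the item on or `{false, state}` to drop it. The sketch below mimics that contract with plain maps. It is not the source of this middleware: the module name `SketchFilter` and the `:allowed_host` state key are assumptions introduced for the example.

```elixir
defmodule SketchFilter do
  # Hypothetical stand-in that follows the run/3 return convention.
  # `state.allowed_host` is an assumed key, not something Crawly provides.
  def run(%{url: url} = request, %{allowed_host: allowed} = state, _opts \\ []) do
    if is_binary(url) and URI.parse(url).host == allowed do
      # Same host: pass the request along unchanged.
      {request, state}
    else
      # Different host (or no URL): drop the request.
      {false, state}
    end
  end
end

state = %{allowed_host: "example.com"}
IO.inspect(SketchFilter.run(%{url: "https://example.com/a"}, state))
IO.inspect(SketchFilter.run(%{url: "https://other.org/b"}, state))
```

The options argument is accepted but unused, mirroring the note that this middleware ignores any configuration passed to it.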