Crawly.Middlewares.SameDomainFilter (Crawly v0.17.0)

Filters out requests that point outside the crawled domain.

The domain compared against the request URL is taken from the previous response, so in practice it is the domain of the spider's start_url. The spider's base_url is not evaluated.

Does not accept any options. Tuple-based configuration options will be ignored.
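
The comparison described above can be illustrated with a minimal sketch. This is not Crawly's implementation: the module name `SameDomainSketch`, the function `filter/3`, and passing the previous response's URL as a plain string are all assumptions made for illustration. It also assumes the usual pipeline convention of returning `{request, state}` to keep a request and `{false, state}` to drop it.

```elixir
defmodule SameDomainSketch do
  # Hypothetical helper, not part of Crawly.
  # `request` is any map with a :url key; `previous_url` stands in for the
  # URL of the response the request was extracted from.
  def filter(request, previous_url, state) do
    previous_host = URI.parse(previous_url).host
    request_host = URI.parse(request.url).host

    if request_host == previous_host do
      # Same host: keep the request.
      {request, state}
    else
      # Different host: drop the request.
      {false, state}
    end
  end
end
```

Note that this sketch compares hosts exactly, so `www.example.com` and `example.com` count as different domains. Whether the real middleware behaves the same way is not stated here.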

Example Declaration

middlewares: [
  Crawly.Middlewares.SameDomainFilter
]
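
For context, the declaration above would typically sit inside the application's Crawly configuration. The fragment below is a sketch that assumes a standard `config/config.exs` file and a project that configures Crawly globally; the surrounding settings of a real project will differ.

```elixir
# config/config.exs (assumed location)
import Config

config :crawly,
  middlewares: [
    # Declared as a bare module: the middleware takes no options.
    Crawly.Middlewares.SameDomainFilter
  ]
```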

Functions

run(request, state, opts \\ [])

Callback implementation for Crawly.Pipeline.run/3.