Crawly.Middlewares.RobotsTxt (Crawly v0.17.0)

Obey robots.txt

A robots.txt file tells search engine crawlers which pages or files they can or can't request from a site. It is used mainly to avoid overloading a site with requests.
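
For example, a site's robots.txt could contain rules like the following (an illustrative sample, not taken from any real site):

User-agent: *
Disallow: /admin/
Allow: /

With rules like these in place, the middleware would drop requests to paths under /admin/ while letting other requests through.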

No options are required for this middleware. Any tuple-based configuration options passed to it will be ignored.

Example Declaration

middlewares: [
  Crawly.Middlewares.RobotsTxt
]
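
In a real project the middleware is usually enabled alongside the other middlewares in config/config.exs. A minimal sketch, assuming a typical middleware list (the surrounding modules and the user agent string are illustrative, not required):

config :crawly,
  middlewares: [
    Crawly.Middlewares.DomainFilter,
    Crawly.Middlewares.UniqueRequest,
    Crawly.Middlewares.RobotsTxt,
    {Crawly.Middlewares.UserAgent, user_agents: ["My Crawler (example@example.com)"]}
  ]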

Functions

run(request, state, opts \\ [])

Callback implementation for Crawly.Pipeline.run/3.
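
Like every Crawly pipeline module, run/3 takes the current request and the crawler state and returns {request, state} to let the request continue down the pipeline, or {false, state} to drop it. Below is a minimal sketch of exercising the middleware directly; the URL, spider name, state shape, and the assumption that /admin/ is disallowed by the site's robots.txt are all illustrative, and Crawly.Request.new/1 is used here only as the usual way to build a request:

# A request to a path the site's robots.txt (hypothetically) disallows.
request = Crawly.Request.new("https://example.com/admin/secret")
state = %{spider_name: MySpider}

case Crawly.Middlewares.RobotsTxt.run(request, state) do
  {false, _state} ->
    # Disallowed by robots.txt: the request is dropped from the pipeline.
    :dropped

  {request, _state} ->
    # Allowed: the request continues to the next middleware.
    {:ok, request}
end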