Crawly.Fetchers.Splash (Crawly v0.17.2)

Implements the Crawly.Fetchers.Fetcher behaviour for Splash JavaScript rendering.

Splash is a lightweight, Qt-based JavaScript rendering engine. See: https://splash.readthedocs.io/

Splash exposes the render.html endpoint, which renders the page whose address is passed in the url GET parameter.

This fetcher converts every request made by Crawly into a Splash request and cleans up the final response by removing the Splash-specific parts from it.
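To illustrate the request conversion described above, here is a minimal sketch of how a target URL could be wrapped into a render.html call. The module SplashUrlSketch and its render_url/3 function are hypothetical names introduced for illustration only. They are not part of Crawly, and the fetcher's actual implementation may differ.

```elixir
defmodule SplashUrlSketch do
  # Hypothetical helper (not part of Crawly): builds the URL of a Splash
  # render.html call for the given target page.
  def render_url(base_url, target_url, extra_params \\ []) do
    # URI.encode_query/1 percent-encodes the target URL so that it can be
    # carried safely inside the `url` query parameter.
    query = URI.encode_query([{"url", target_url} | extra_params])
    base_url <> "?" <> query
  end
end

SplashUrlSketch.render_url(
  "http://localhost:8050/render.html",
  "https://example.com/page?id=1"
)
# => "http://localhost:8050/render.html?url=https%3A%2F%2Fexample.com%2Fpage%3Fid%3D1"
```

Percent-encoding the target URL keeps its own query string (here ?id=1) from being mistaken for parameters of the Splash endpoint.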

The Splash server can be started in any of the documented ways. One option is to run it locally with Docker: docker run -it -p 8050:8050 scrapinghub/splash

In this case, configure the fetcher as follows: fetcher: {Crawly.Fetchers.Splash, [base_url: "http://localhost:8050/render.html"]}
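For context, the fetcher setting above would usually sit inside the :crawly application configuration. The fragment below is a sketch that assumes the project keeps its settings in config/config.exs and that Splash is listening on localhost:8050, as in the Docker example.

```elixir
# config/config.exs (assumed location of the project's configuration)
import Config

config :crawly,
  # Route all Crawly requests through the local Splash instance.
  fetcher:
    {Crawly.Fetchers.Splash,
     [base_url: "http://localhost:8050/render.html"]}
```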

Summary

Functions

fetch(request, client_options)

Callback implementation for Crawly.Fetchers.Fetcher.fetch/2.

Functions

fetch(request, client_options)

Specs

fetch(request, client_options) :: response
when request: Crawly.Request.t(),
     client_options: [binary()],
     response: Crawly.Response.t()

Callback implementation for Crawly.Fetchers.Fetcher.fetch/2.