Chrobot Extra

⛭ Typed browser automation for the BEAM ⛭

About

Chrobot Extra provides a set of typed bindings to the stable version of the Chrome DevTools Protocol, based on its published JSON specification.

The typed interface is generated from the parsed JSON specification file: the generator emits Gleam code for the type definitions as well as the encoder / decoder functions.
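To give a rough idea of what "typed bindings" means here, the sketch below hand-writes a type and an encoder for the parameters of the protocol's Page.navigate command (a required url and an optional referrer). It is an illustration only and is not copied from the package's generated code: the names NavigateParams and encode_navigate_params are hypothetical, and the sketch assumes the gleam_json and gleam_stdlib packages are available.

```gleam
import gleam/io
import gleam/json
import gleam/option.{type Option, None, Some}

// Hypothetical type mirroring the parameters of `Page.navigate`.
// The real generated bindings may use different names and shapes.
pub type NavigateParams {
  NavigateParams(url: String, referrer: Option(String))
}

// Hypothetical encoder: required fields are always emitted,
// optional fields only when a value is present.
pub fn encode_navigate_params(params: NavigateParams) -> json.Json {
  let required = [#("url", json.string(params.url))]
  let fields = case params.referrer {
    Some(referrer) -> [#("referrer", json.string(referrer)), ..required]
    None -> required
  }
  json.object(fields)
}

pub fn main() {
  // Build the params as a typed value, then serialise them to a JSON string
  NavigateParams(url: "https://gleam.run", referrer: None)
  |> encode_navigate_params
  |> json.to_string
  |> io.println
}
```

The point of the design is that a misspelled field or a value of the wrong type becomes a compile error rather than a malformed protocol message at runtime.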

Chrobot Extra also exposes some high-level abstractions for browser automation. It manages a browser instance via an Erlang port and handles the communication with it for you.

You could use it for:

Generating PDFs from HTML
Web scraping
Web archiving
Browser integration tests

🦝 The generated protocol bindings are largely untested, and I would consider this package experimental. Use at your own peril!

Setup

Package

Install as a Gleam package

gleam add chrobot_extra@4

Install as an Elixir dependency with mix

# in your mix.exs
defp deps do
  [
    {:chrobot_extra, "~> 4.0.0", app: false, manager: :rebar3}
  ]
end

Browser

System Installation

Chrobot Extra can use an existing system installation of Google Chrome or Chromium, if you already have one.

Browser Install Tool

Chrobot Extra comes with a simple utility to install a version of Google Chrome for Testing directly inside your project. Chrobot Extra will automatically pick up this local installation when started via the launch function, and will prioritise it over a system installation of Google Chrome.

You can run the browser installer tool from Gleam like so:

gleam run -m chrobot_extra/install

Or when using Elixir with Mix:

mix run -e :chrobot_extra@install.main

Please check the install docs for more information – this installation method will not work everywhere and comes with some caveats!

GitHub Actions

If you want to use chrobot_extra inside a GitHub Actions workflow, for example to run integration tests, you can use the setup-chrome action to get a Chrome installation, like so:

# -- snip --
- uses: browser-actions/setup-chrome@v1
  id: setup-chrome
- run: gleam deps download
- run: gleam test
  env:
    CHROBOT_BROWSER_PATH: ${{ steps.setup-chrome.outputs.chrome-path }}

If you are using launch to start chrobot_extra, it should pick up the Chrome executable from CHROBOT_BROWSER_PATH.

Examples

Take a screenshot of a website

import chrobot_extra

pub fn main() {
  // Open the browser and navigate to the Gleam homepage
  let assert Ok(browser) = chrobot_extra.launch()
  let assert Ok(page) =
    browser
    |> chrobot_extra.open("https://gleam.run", 30_000)
  let assert Ok(_) = chrobot_extra.await_selector(page, "body")
  
  // Take a screenshot and save it as 'hi_lucy.png'
  let assert Ok(screenshot) = chrobot_extra.screenshot(page)
  let assert Ok(_) = chrobot_extra.to_file(screenshot, "hi_lucy")
  let assert Ok(_) = chrobot_extra.quit(browser)
}

Generate a PDF document with lustre

import chrobot_extra
import lustre/element.{text}
import lustre/element/html

fn build_page() {
  html.body([], [
    html.h1([], [text("Spanakorizo")]),
    html.h2([], [text("Ingredients")]),
    html.ul([], [
      html.li([], [text("1 onion")]),
      html.li([], [text("1 clove(s) of garlic")]),
      html.li([], [text("70 g olive oil")]),
      html.li([], [text("salt")]),
      html.li([], [text("pepper")]),
      html.li([], [text("2 spring onions")]),
      html.li([], [text("1/2 bunch dill")]),
      html.li([], [text("250 g round grain rice")]),
      html.li([], [text("150 g white wine")]),
      html.li([], [text("1 liter vegetable stock")]),
      html.li([], [text("1 kilo spinach")]),
      html.li([], [text("lemon zest, of 2 lemons")]),
      html.li([], [text("lemon juice, of 2 lemons")]),
    ]),
    html.h2([], [text("To serve")]),
    html.ul([], [
      html.li([], [text("1 lemon")]),
      html.li([], [text("feta cheese")]),
      html.li([], [text("olive oil")]),
      html.li([], [text("pepper")]),
      html.li([], [text("oregano")]),
    ]),
  ])
  |> element.to_document_string()
}

pub fn main() {
  let assert Ok(browser) = chrobot_extra.launch()
  let assert Ok(page) =
    browser
    |> chrobot_extra.create_page(build_page(), 10_000)

  // Store as 'recipe.pdf'
  let assert Ok(doc) = chrobot_extra.pdf(page)
  let assert Ok(_) = chrobot_extra.to_file(doc, "recipe")
  let assert Ok(_) = chrobot_extra.quit(browser)
}

Scrape a website

🍄‍🟫 Just a quick reminder:
Please be mindful of the load you are putting on other people’s web services when you are scraping them programmatically!

import chrobot_extra
import gleam/io
import gleam/list
import gleam/result

pub fn main() {
  let assert Ok(browser) = chrobot_extra.launch()
  let assert Ok(page) =
    browser
    |> chrobot_extra.open("https://books.toscrape.com/", 30_000)

  let assert Ok(_) = chrobot_extra.await_selector(page, "body")
  let assert Ok(page_items) = chrobot_extra.select_all(page, ".product_pod h3 a")
  let assert Ok(title_results) =
    list.map(page_items, fn(i) { chrobot_extra.get_attribute(page, i, "title") })
    |> result.all()
  io.debug(title_results)
  let assert Ok(_) = chrobot_extra.quit(browser)
}

Write an integration test for a web app

import chrobot_extra
import gleam/dynamic
import gleeunit/should

pub fn package_search_test() {
  let assert Ok(browser) = chrobot_extra.launch()
  use <- chrobot_extra.defer_quit(browser)
  let assert Ok(page) = chrobot_extra.open(browser, "https://hexdocs.pm/", 10_000)
  let assert Ok(input_field) = chrobot_extra.await_selector(page, "input#search")
  let assert Ok(Nil) = chrobot_extra.focus(page, input_field)
  let assert Ok(Nil) = chrobot_extra.type_text(page, "chrobot_extra")
  let assert Ok(Nil) = chrobot_extra.press_key(page, "Enter")
  let assert Ok(result_link) = chrobot_extra.await_selector(page, "#search-results a")
  let assert Ok(package_href) =
    chrobot_extra.get_property(page, result_link, "href", dynamic.string)
  package_href
  |> should.equal("https://hexdocs.pm/chrobot_extra/")
}

Use from Elixir

# ( output / logging removed for brevity )
iex(1)> {:ok, browser} = :chrobot_extra.launch()
iex(2)> {:ok, page} = :chrobot_extra.open(browser, "https://example.com", 10_000)
iex(3)> {:ok, object} = :chrobot_extra.select(page, "h1")
iex(4)> {:ok, text} = :chrobot_extra.get_text(page, object)
iex(5)> text
"Example Domain"

Documentation & Guide

The full documentation can be found at https://hexdocs.pm/chrobot_extra.

🗼 To learn about the high-level abstractions, look at the chrobot_extra module documentation.

📠 To learn how to use the protocol bindings directly, look at the protocol module documentation.