chrobot

Welcome to Chrobot! 🤖 This module exposes high-level functions for browser automation.

Some basic concepts:

The functions in this module just make calls to protocol/ modules. If you would like to customize the behaviour, take a look at those modules to see how to make direct protocol calls and pass different defaults.

Something to consider:
Many of the functions in this module interpolate their parameters into
JavaScript expressions that are evaluated in the page context.
No attempt is made to escape the passed parameters or to prevent script injection through them. Do not pass arbitrary strings to the functions in this module if you want to treat the pages you are operating on as a secure context.
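For expressions you build yourself and pass to eval, one way to reduce the injection risk is to encode untrusted strings as JSON string literals before interpolating them. The following is a minimal sketch, assuming the gleam_json package is available; js_string_literal and query_expression are hypothetical helpers and not part of this module.

```gleam
import gleam/json

/// Hypothetical helper, not part of chrobot: encode a Gleam string as a
/// JSON string literal, which is also a valid JavaScript string literal.
pub fn js_string_literal(value: String) -> String {
  value
  |> json.string
  |> json.to_string
}

/// Hypothetical helper: build a querySelector expression in which the
/// selector is quoted and escaped instead of being pasted in verbatim.
pub fn query_expression(selector: String) -> String {
  "document.querySelector(" <> js_string_literal(selector) <> ")"
}
```

The resulting string could then be passed to eval. This only helps for expressions you assemble yourself; it does not change how the other functions in this module handle their parameters.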

Types

pub type CallArgument {
  StringArg(value: String)
  IntArg(value: Int)
  FloatArg(value: Float)
  BoolArg(value: Bool)
  ArrayArg(value: List(CallArgument))
}

Constructors

  • StringArg(value: String)
  • IntArg(value: Int)
  • FloatArg(value: Float)
  • BoolArg(value: Bool)
  • ArrayArg(value: List(CallArgument))

pub type EncodedFile {
  EncodedFile(data: String, extension: String)
}

Constructors

  • EncodedFile(data: String, extension: String)

Holds information about the current page, as well as the desired timeout in milliseconds to use when waiting for browser responses.

pub type Page {
  Page(
    browser: Subject(chrome.Message),
    time_out: Int,
    target_id: target.TargetID,
    session_id: target.SessionID,
  )
}

Constructors

  • Page(
      browser: Subject(chrome.Message),
      time_out: Int,
      target_id: target.TargetID,
      session_id: target.SessionID,
    )

Functions

pub fn as_value(
  result: Result(RemoteObject, RequestError),
  decoder: fn(Dynamic) -> Result(a, b),
) -> Result(a, RequestError)

Cast a RemoteObject into a value by passing a dynamic decoder.
This is a convenience for when you know a RemoteObject is returned by value and not ID, and you want to extract the value from it.
You can chain this to eval or eval_async like so:

eval(page, "window.location.href")
  |> as_value(dynamic.string)

pub fn await_load_event(
  browser: Subject(Message),
  page: Page,
) -> Result(Dynamic, RequestError)

Block until the page load event has fired. Note that with local pages, the load event can often fire before the handler is attached.
It’s best to use await_selector instead of this function.

pub fn await_selector(
  on page: Page,
  select selector: String,
) -> Result(RemoteObjectId, RequestError)

Continuously attempt to run a selector until it succeeds.
You can use this after opening a page to wait until it has initialized sufficiently for you to run your automation on it.
The final result will be a single remote object ID.
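A typical flow that waits for a page with await_selector might look like the sketch below. It only uses functions documented in this module; the URL, the selector, and the 10 second timeout are placeholder assumptions.

```gleam
import chrobot
import gleam/io

pub fn main() {
  // Launch a browser and make sure it is quit once the body has run
  let assert Ok(browser) = chrobot.launch()
  use <- chrobot.defer_quit(browser)

  // The URL and the timeout (in milliseconds) are placeholder values
  let assert Ok(page) = chrobot.open(browser, "https://example.com", 10_000)

  // Wait until an <h1> can be selected, then read its text
  let assert Ok(heading) = chrobot.await_selector(page, "h1")
  let assert Ok(text) = chrobot.get_text(page, heading)
  io.println(text)
}
```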

pub fn call_custom_function_on(
  callback: fn(String, Option(Json)) ->
    Result(Dynamic, RequestError),
  function_declaration function_declaration: String,
  object_id object_id: RemoteObjectId,
  args arguments: List(CallArgument),
  value_decoder value_decoder: fn(Dynamic) -> Result(a, b),
) -> Result(a, RequestError)

This is a version of runtime.call_function_on that allows passing in arguments and always returns the result as a value, decoded by the decoder you pass in.

You would use it with a JavaScript function declaration like this:

function my_function(my_arg) {
  // You can access the passed RemoteObject with `this`
  const wibble = this.getAttribute('href')
  // You have access to the arguments you passed in
  const wobble = 'hello ' + my_arg
  // You receive this return value, you should pass in a string decoder
  // in this case
  return wibble + wobble;
}
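From Gleam, the declaration above could be passed along roughly as follows. This is a sketch based on the signature shown for this function; href_with_greeting and the "world" argument are hypothetical, and dynamic.string is assumed to be the decoder style used elsewhere in this documentation.

```gleam
import chrobot
import gleam/dynamic

// The JavaScript declaration from above, stored as a Gleam string
const declaration = "function my_function(my_arg) {
  const wibble = this.getAttribute('href')
  const wobble = 'hello ' + my_arg
  return wibble + wobble;
}"

/// Hypothetical wrapper: `link` is the RemoteObjectId of an element,
/// for example one returned by select or await_selector.
pub fn href_with_greeting(page: chrobot.Page, link) {
  chrobot.call_custom_function_on(
    // page_caller turns the Page into the callback this function expects
    chrobot.page_caller(page),
    function_declaration: declaration,
    object_id: link,
    args: [chrobot.StringArg("world")],
    // The JavaScript function returns a string, so decode it as one
    value_decoder: dynamic.string,
  )
}
```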

pub fn close(page: Page) -> Result(Dynamic, RequestError)

Close the passed page.

pub fn create_page(
  with browser: Subject(Message),
  from html: String,
  time_out time_out: Int,
) -> Result(Page, RequestError)

Similar to open, but creates a new page from HTML that you pass to it. The page will be created under the about:blank URL.

pub fn defer_quit(
  browser: Subject(Message),
  body: fn() -> a,
) -> Result(Nil, CallError(Nil))

Convenience function that lets you defer quitting the browser until you are done with it. It is meant for a use expression like this:

let assert Ok(browser_subject) = chrobot.launch()
use <- chrobot.defer_quit(browser_subject)
// do stuff with the browser

pub fn eval(
  on page: Page,
  js expression: String,
) -> Result(RemoteObject, RequestError)

Evaluate some JavaScript on the page and return the result, which will be a RemoteObject reference.
Check the protocol/runtime module for more info.

pub fn eval_async(
  on page: Page,
  js expression: String,
) -> Result(RemoteObject, RequestError)

Like eval, but awaits the result of the evaluation and returns once the promise has been resolved.
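As an illustration, a promise that resolves to a number could be evaluated and decoded like this. This is a sketch: delayed_answer is a hypothetical helper, and it assumes the resolved number is returned by value, which is what as_value requires.

```gleam
import chrobot
import gleam/dynamic

/// Hypothetical helper: resolve a promise in the page and decode the result.
pub fn delayed_answer(page: chrobot.Page) {
  // The promise resolves to 42 after roughly 100 milliseconds
  let expression =
    "new Promise((resolve) => setTimeout(() => resolve(42), 100))"

  chrobot.eval_async(page, expression)
  |> chrobot.as_value(dynamic.int)
}
```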

pub fn eval_to_value(
  on page: Page,
  js expression: String,
) -> Result(RemoteObject, RequestError)

pub fn get_all_html(
  on page: Page,
) -> Result(String, RequestError)

pub fn get_attribute(
  on page: Page,
  from item: RemoteObjectId,
  name attribute_name: String,
) -> Result(String, RequestError)

Assuming the passed remote object reference is an Element, return an attribute of that element.
Attributes are always returned as a string.

pub fn get_inner_html(
  on page: Page,
  from item: RemoteObjectId,
) -> Result(String, RequestError)

pub fn get_outer_html(
  on page: Page,
  from item: RemoteObjectId,
) -> Result(String, RequestError)

pub fn get_property(
  on page: Page,
  from item: RemoteObjectId,
  name property_name: String,
  property_decoder property_decoder: fn(Dynamic) -> Result(a, b),
) -> Result(a, RequestError)

Get a property of a remote object and decode it with the provided decoder.
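Unlike attributes, properties are not necessarily strings, which is why a decoder is needed. A minimal sketch, assuming the selected element is a checkbox; is_checked is a hypothetical helper and dynamic.bool is assumed to match the decoder style used elsewhere in this documentation.

```gleam
import chrobot
import gleam/dynamic
import gleam/result

/// Hypothetical helper: read the boolean `checked` property of a checkbox.
pub fn is_checked(page: chrobot.Page, selector: String) {
  // Stop early with the RequestError if the selector fails
  use checkbox <- result.try(chrobot.select(page, selector))
  chrobot.get_property(page, checkbox, "checked", dynamic.bool)
}
```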

pub fn get_text(
  on page: Page,
  from item: RemoteObjectId,
) -> Result(String, RequestError)

Note: Accesses the innerText property, not textContent

pub fn launch() -> Result(Subject(Message), LaunchError)

Cleverly try to find a chrome installation and launch it with reasonable defaults.

  1. If CHROBOT_BROWSER_PATH is set, use that
  2. If a local chrome installation is found, use that
  3. If a system chrome installation is found, use that
  4. If none of the above, return an error

If you want to always use a specific chrome installation, take a look at launch_with_config or launch_with_env to set the path explicitly.

This function will validate that the browser launched successfully, and the protocol version matches the one supported by this library.
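Because launch returns a Result, a failed launch can be handled without crashing. The sketch below does not inspect the LaunchError value, since its constructors are not listed here; the printed messages are illustrative only.

```gleam
import chrobot
import gleam/io

pub fn main() {
  case chrobot.launch() {
    Ok(browser) -> {
      // Quit right away; a real program would open pages first
      let _ = chrobot.quit(browser)
      io.println("Browser launched and quit again")
    }
    Error(_) ->
      // The error could be a missing installation or a failed validation
      io.println("Launch failed, consider setting CHROBOT_BROWSER_PATH")
  }
}
```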

pub fn launch_with_config(
  config: BrowserConfig,
) -> Result(Subject(Message), LaunchError)

Launch a browser with the given configuration. To populate the arguments, use chrome.get_default_chrome_args. This function will validate that the browser launched successfully, and the protocol version matches the one supported by this library.

Example

let config =
  chrome.BrowserConfig(
    path: "chrome/linux-116.0.5793.0/chrome-linux64/chrome",
    args: chrome.get_default_chrome_args(),
    start_timeout: 5000,
  )
let assert Ok(browser_subject) = launch_with_config(config)

pub fn launch_with_env() -> Result(Subject(Message), LaunchError)

Launch a browser and read the configuration from environment variables. The browser path variable must be set; all others will fall back to a default.

This function will validate that the browser launched successfully, and the protocol version matches the one supported by this library.

Configuration variables:

  • CHROBOT_BROWSER_PATH - The path to the browser executable
  • CHROBOT_BROWSER_ARGS - The arguments to pass to the browser, separated by spaces
  • CHROBOT_BROWSER_TIMEOUT - The timeout in milliseconds to wait for the browser to start, must be an integer
  • CHROBOT_LOG_LEVEL - The log level to use, one of silent, warnings, info, debug

pub fn open(
  with browser_subject: Subject(Message),
  to url: String,
  time_out time_out: Int,
) -> Result(Page, RequestError)

Open a new page in the browser. Returns a response when the protocol call succeeds; use await_selector to determine when the page is ready.
The timeout passed to this function will be attached to the returned Page type to be reused by other functions in this module.
You can always adjust it using with_timeout.

pub fn page_caller(
  page: Page,
) -> fn(String, Option(Json)) -> Result(Dynamic, RequestError)

Create a callback from a Page to pass to protocol commands.

pub fn pdf(page: Page) -> Result(EncodedFile, RequestError)

Export the current page as PDF and return it as a base64 encoded string.
Transferring the encoded file from the browser to the chrome agent can take a pretty long time, depending on the document size.
Consider setting a larger timeout; you can use with_timeout on your existing Page to do this. The Ok(result) of this function can be passed to to_file.

If you want to customize the settings of the output document, use print_to_pdf from protocol/page directly
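Putting the timeout advice together with to_file could look like the following sketch. save_pdf is a hypothetical helper, the 30 second timeout is arbitrary, and "report" is a placeholder path; how to_file combines the path with the extension field of EncodedFile is not stated here, so check that before relying on the resulting file name.

```gleam
import chrobot

/// Hypothetical helper: export a page as PDF and write it to disk.
pub fn save_pdf(page: chrobot.Page) -> Nil {
  // Use a larger timeout for this call only; the original Page is unchanged
  let assert Ok(document) =
    page
    |> chrobot.with_timeout(30_000)
    |> chrobot.pdf

  // "report" is a placeholder path
  let assert Ok(Nil) = chrobot.to_file(document, "report")
  Nil
}
```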

pub fn quit(
  browser: Subject(Message),
) -> Result(Nil, CallError(Nil))

Quit the browser (alias for chrome.quit)

pub fn screenshot(
  page: Page,
) -> Result(EncodedFile, RequestError)

Capture a screenshot of the current page and return it as a base64 encoded string. The Ok(result) of this function can be passed to to_file.

If you want to customize the settings of the output image, use capture_screenshot from protocol/page directly

pub fn select(
  on page: Page,
  matching selector: String,
) -> Result(RemoteObjectId, RequestError)

pub fn select_all(
  on page: Page,
  matching selector: String,
) -> Result(List(RemoteObjectId), RequestError)

Run querySelectorAll on the page and return a list of remote object IDs.
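The returned IDs can be fed into the other functions of this module one by one. A minimal sketch, where all_links is a hypothetical helper and the selector is a placeholder:

```gleam
import chrobot
import gleam/list
import gleam/result

/// Hypothetical helper: collect the href attribute of every link on a page.
pub fn all_links(page: chrobot.Page) {
  // Stop early with the RequestError if the selection itself fails
  use links <- result.try(chrobot.select_all(page, "a[href]"))

  // try_map returns the first error, or the list of all attribute values
  list.try_map(links, fn(link) { chrobot.get_attribute(page, link, "href") })
}
```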

pub fn to_file(
  input input: EncodedFile,
  path path: String,
) -> Result(Nil, FileError)

pub fn to_value(
  on page: Page,
  from remote_object_id: RemoteObjectId,
  to decoder: fn(Dynamic) -> Result(a, b),
) -> Result(a, RequestError)

Evaluate a remote object to a value, passing in the appropriate decoder function.

pub fn with_timeout(page: Page, time_out: Int) -> Page

Return an updated Page with the desired timeout to apply, in milliseconds.
