Mechanize v0.1.0 Mechanize.Page

The HTML Page.

This module defines Mechanize.Page and the main functions for working with Pages.

The Page is created as a result of a successful HTTP request.

alias Mechanize.{Browser, Page}

browser = Browser.new()
page = Browser.get!(browser, "https://www.example.com")

Summary

Types

fragment()

A fragment of a page. In most cases it is a list of Mechanize.Page.Element structs, but it can hold any struct that implements the Mechanize.Page.Elementable protocol.

t()

The HTML Page struct.

Functions

Clicks on a link that matches query.

Searches for elements on a given page or fragment using both a CSS selector and a query.

Returns all elements not matching the selector.

Returns the first form in a given page or fragment, or nil if the page or fragment has no form.

Returns the first form that matches the query for the given page or fragment.

Returns a list containing all forms of a given page or fragment.

Returns a list containing all forms matching query for the given page or fragment.

Returns the browser that fetched the page.

Returns the page content.

Returns the response headers of a page.

Returns the response of a page.

Returns the page URL.

Returns the first link matched by query.

Returns the first link matched by query.

Returns a list containing all links from a page or fragment of a page, or an empty list if it has no links.

Returns all links matched by query.

Returns all links matched by query.

Extracts meta-refresh data from a page.

Searches for elements on a given page or fragment using a CSS selector.

Types

Specs

fragment() :: [any()]

A fragment of a page. In most cases it is a list of Mechanize.Page.Element structs, but it can hold any struct that implements the Mechanize.Page.Elementable protocol.

Specs

t() :: %Mechanize.Page{
  browser: Browser.t(),
  content: String.t(),
  parser: module(),
  response_chain: [Mechanize.Response.t()],
  status_code: integer(),
  url: String.t()
}

The HTML Page struct.

Functions

click_link!(page_or_fragment, query)

Specs

click_link!(t() | fragment(), Mechanize.Query.t()) :: t()

Clicks on a link that matches query.

Links are all elements defined by the a and area HTML tags. If more than one link matches the query, Mechanize clicks on the first matched link.

Raises Mechanize.Page.ClickError if the matched link has no href attribute.

Raises Mechanize.Page.BadQueryError if no link matches the given query.

Raises additional exceptions from Mechanize.Browser.request!/5.

See the Mechanize.Query module documentation for the full set of query capabilities.

Examples

Click on the first link whose text is equal to "Back":

  Page.click_link!(page, "Back")

Click on the first link by its "href" attribute:

  Page.click_link!(page, href: "sun.html")
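
The selection rules above (first match wins, a match without href is an error, no match is an error) can be sketched without the library. This is only an illustration under stated assumptions: ClickSketch, pick_href/2 and the plain maps standing in for links are hypothetical names, and the sketch returns tagged tuples where click_link!/2 raises exceptions.

```elixir
defmodule ClickSketch do
  # Hypothetical model, not Mechanize code: a link is a map with :text
  # and an optional :href.
  #
  # Returns {:ok, href} for the first link whose text matches,
  # {:error, :no_match} when no link matches, and
  # {:error, :no_href} when the first matched link has no href.
  def pick_href(links, text) do
    case Enum.find(links, &(&1.text == text)) do
      nil -> {:error, :no_match}
      %{href: href} when is_binary(href) -> {:ok, href}
      _link -> {:error, :no_href}
    end
  end
end

links = [
  %{text: "Back", href: "index.html"},
  %{text: "Back", href: "other.html"},
  %{text: "Top"}
]

IO.inspect(ClickSketch.pick_href(links, "Back"))
IO.inspect(ClickSketch.pick_href(links, "Top"))
IO.inspect(ClickSketch.pick_href(links, "Next"))
```

Only the first "Back" link is considered, which mirrors the first-match rule described above.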

elements_with(page_or_fragment, selector, query \\ [])

Specs

elements_with(t() | fragment(), String.t(), Mechanize.Query.t()) :: [
  Mechanize.Page.Element.t()
]

Searches for elements on a given page or fragment using both a CSS selector and a query.

This function is similar to Mechanize.Page.search/2, but it also applies a query. It first matches the page or fragment against the CSS selector, then matches the resulting elements against the query. A list of Mechanize.Page.Element is returned. If no element matches both the selector and the query, an empty list is returned instead.

See the Mechanize.Query module documentation for the full set of query capabilities.

Example

Print to the console the items of a todo HTML unordered list that start with "A":

page
|> Page.elements_with("ul.todo > li", text: ~r/^A/i)
|> Enum.map(&Element.text/1)
|> Enum.each(&IO.puts/1)
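
The two-stage matching described above can be sketched in plain Elixir. This is a minimal illustration, not the library's implementation: TwoStageSketch is a hypothetical module, elements are modeled as plain maps, and a predicate function stands in for the CSS selector.

```elixir
defmodule TwoStageSketch do
  # Hypothetical stand-in, not Mechanize code.
  # Stage 1 keeps the elements accepted by a selector-like predicate.
  # Stage 2 keeps the elements whose fields satisfy every query criterion.
  def elements_with(elements, selector_fun, query) do
    elements
    |> Enum.filter(selector_fun)
    |> Enum.filter(&matches_query?(&1, query))
  end

  # An empty query matches everything, because Enum.all?/2 is true for [].
  defp matches_query?(element, query) do
    Enum.all?(query, fn {key, expected} ->
      match_value?(Map.get(element, key), expected)
    end)
  end

  # A regex criterion is tested against string values.
  defp match_value?(actual, %Regex{} = regex) when is_binary(actual),
    do: Regex.match?(regex, actual)

  # Any other criterion is compared for equality.
  defp match_value?(actual, expected), do: actual == expected
end

items = [
  %{tag: "li", text: "Answer email"},
  %{tag: "li", text: "Buy milk"},
  %{tag: "p", text: "A note"}
]

items
|> TwoStageSketch.elements_with(&(&1.tag == "li"), text: ~r/^A/i)
|> Enum.each(&IO.puts(&1.text))
# prints "Answer email"
```

The paragraph "A note" also starts with "A", but it is dropped in the first stage, so the query never sees it.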

filter_out(page, selector)

Specs

filter_out(t() | fragment(), String.t()) :: [Mechanize.Page.Element.t()]

Returns all elements not matching the selector.

A list of Mechanize.Page.Element not matching the selector is returned. If all elements match the selector, an empty list is returned instead.

Example

Remove the items of an unordered list with the "todo" class from the content of a page:

Page.filter_out(page, "ul.todo > li")

Specs

form(t() | fragment()) :: Mechanize.Form.t() | nil

Returns the first form in a given page or fragment, or nil if the page or fragment has no form.

form_with(page_or_fragment, query \\ [])

Specs

form_with(t() | fragment(), Mechanize.Query.t()) :: Mechanize.Form.t() | nil

Returns the first form that matches the query for the given page or fragment.

If no form matches, returns nil instead.

See the Mechanize.Query module documentation for the full set of query capabilities.

Examples

Fetch the first form whose name is equal to "login":

%Form{} = Page.form_with(page, name: "login")

Specs

forms(t() | fragment()) :: [Mechanize.Form.t()]

Returns a list containing all forms of a given page or fragment.

If the page or fragment has no form, returns an empty list.

forms_with(page_or_fragment, query \\ [])

Specs

forms_with(t() | fragment(), Mechanize.Query.t()) :: [Mechanize.Form.t()]

Returns a list containing all forms matching query for the given page or fragment.

If no form matches, returns an empty list instead.

See the Mechanize.Query module documentation for the full set of query capabilities.

Examples

Fetch all forms whose name is equal to "login":

list = Page.forms_with(page, name: "login")

Specs

get_browser(t()) :: Browser.t()

Returns the browser that fetched the page.

Specs

get_content(t()) :: String.t()

Returns the page content.

Specs

get_headers(t()) :: Header.headers()

Returns the response headers of a page.

If the Mechanize Browser followed one or more redirects when the page was fetched, the returned headers are those of the last response.

Specs

get_response(t()) :: Mechanize.Response.t()

Returns the response of a page.

If the Mechanize Browser followed one or more redirects when the page was fetched, the returned response is the last response.

Specs

get_url(t()) :: String.t()

Returns the page URL.

Specs

links(t() | fragment()) :: [Mechanize.Page.Link.t()]

Returns a list containing all links from a page or fragment of a page, or an empty list if it has no links.

Specs

meta_refresh(t()) :: {integer(), String.t()}

Extracts meta-refresh data from a page.

If a <meta http-equiv="refresh" ...> is found, returns a two-element tuple with an integer representing the delay in the first position and a string representing the URL in the second position. Otherwise, returns nil.

Raises Mechanize.Page.InvalidMetaRefreshError if Mechanize cannot parse the content attribute of the meta-refresh.

Example

# <meta http-equiv="refresh" content="10; url=https://www.example.com">
{delay, url} = Page.meta_refresh(page)

delay # => 10
url # => "https://www.example.com"
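
To make the shape of the content attribute concrete, here is a minimal sketch of how a value such as "10; url=https://www.example.com" can be split into a delay and a URL. It is not the parser Mechanize uses: MetaRefreshSketch and parse/1 are hypothetical names, and the sketch returns :error where meta_refresh/1 raises Mechanize.Page.InvalidMetaRefreshError.

```elixir
defmodule MetaRefreshSketch do
  # Hypothetical helper, not part of Mechanize: parses the value of the
  # content attribute of a <meta http-equiv="refresh"> tag.
  @spec parse(String.t()) :: {integer(), String.t()} | :error
  def parse(content) do
    # parts: 2 keeps any later ";" or "=" characters inside the URL intact.
    with [delay_part, url_part] <- String.split(content, ";", parts: 2),
         {delay, ""} <- Integer.parse(String.trim(delay_part)),
         [key, url] <- String.split(String.trim(url_part), "=", parts: 2),
         "url" <- String.downcase(String.trim(key)) do
      {delay, String.trim(url)}
    else
      _ -> :error
    end
  end
end

MetaRefreshSketch.parse("10; url=https://www.example.com")
# => {10, "https://www.example.com"}

MetaRefreshSketch.parse("soon")
# => :error
```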

Specs

search(t() | fragment(), String.t()) :: [Mechanize.Page.Element.t()]

Searches for elements on a given page or fragment using a CSS selector.

A list of Mechanize.Page.Element matching the selector is returned. If no element matches the selector, an empty list is returned instead.

See also Mechanize.Page.elements_with/3.

Example

Print to the console the items of a todo HTML unordered list:

page
|> Page.search("ul.todo > li")
|> Enum.map(&Element.text/1)
|> Enum.each(&IO.puts/1)