ReqBigQuery (ReqBigQuery v0.1.4)

Req plugin for Google BigQuery.

ReqBigQuery makes it easy to run BigQuery queries. It uses Goth for authentication. Query results are decoded into the ReqBigQuery.Result struct, which implements the Table.Reader protocol and can therefore be efficiently traversed by rows or by columns.

Summary

Functions

Attaches this plugin to a Req request.

Functions


attach(request, options \\ [])

@spec attach(
  Req.Request.t(),
  keyword()
) :: Req.Request.t()

Attaches this plugin to a Req request.

Request Options

  • :goth - Required. The goth server name.

  • :project_id - Required. The GCP project id.

  • :bigquery - Required. The query to execute. It can be a plain SQL string or a {query, params} tuple, where query can contain ? placeholders and params is a list of corresponding values.

  • :default_dataset_id - Optional. If set, the dataset to assume for any unqualified table names in the query. If not set, all table names in the query string must be qualified in the format 'datasetId.tableId'.

  • :max_results - Optional. Number of rows to be returned by BigQuery in each request (paging). The rows Stream can make multiple requests if num_rows returned is greater than :max_results. Defaults to 10000.

  • :use_legacy_sql - Optional. Specifies whether to use BigQuery's legacy SQL dialect for this query. If set to false, the query will use BigQuery's GoogleSQL dialect (https://cloud.google.com/bigquery/sql-reference/). The default value is false.

  • :timeout_ms - Optional. How long to wait for the query to complete, in milliseconds, before the request times out and returns. Note: The call is not guaranteed to wait for the specified timeout. If the query takes longer to run than the timeout value, the call returns without any results and with the 'jobComplete' flag set to false. The default value is 10000 milliseconds (10 seconds).

  • :dry_run - Optional. Specifies whether to run the given query in dry run mode.

If you want to set any of these options when attaching the plugin, pass them as the second argument.
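For instance, shared options can be set once when attaching and overridden per request. This is a minimal sketch; MyGoth, the project id, and the dataset and table names below are placeholders:

```elixir
# Options passed to attach/2 become defaults for every request built
# from this Req struct.
req =
  Req.new()
  |> ReqBigQuery.attach(
    goth: MyGoth,
    project_id: "my-project",
    default_dataset_id: "my_dataset",
    max_results: 500
  )

# Options passed to the request itself override the attach-time defaults:
Req.post!(req, bigquery: "SELECT name FROM users LIMIT 10", max_results: 10)
```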

Examples

With a plain query string:

iex> credentials = File.read!("credentials.json") |> Jason.decode!()
iex> source = {:service_account, credentials, []}
iex> {:ok, _} = Goth.start_link(name: MyGoth, source: source, http_client: &Req.request/1)
iex> project_id = System.fetch_env!("PROJECT_ID")
iex> query = """
...> SELECT title, SUM(views) AS views
...>   FROM `bigquery-public-data.wikipedia.table_bands`
...>  WHERE EXTRACT(YEAR FROM datehour) <= 2021
...>  GROUP BY title
...>  ORDER BY views DESC
...>  LIMIT 10
...> """
iex> req = Req.new() |> ReqBigQuery.attach(goth: MyGoth, project_id: project_id)
iex> res = Req.post!(req, bigquery: query).body
iex> res
%ReqBigQuery.Result{
  columns: ["title", "views"],
  job_id: "job_JDDZKquJWkY7x0LlDcmZ4nMQqshb",
  num_rows: 10,
  total_bytes_processed: 18161868216,
  rows: %Stream{}
}
iex> Enum.to_list(res.rows)
[
  ["The_Beatles", 13758950],
  ["Queen_(band)", 12019563],
  ["Pink_Floyd", 9522503],
  ["AC/DC", 8972364],
  ["Led_Zeppelin", 8294994],
  ["Linkin_Park", 8242802],
  ["The_Rolling_Stones", 7825952],
  ["Red_Hot_Chili_Peppers", 7302904],
  ["Fleetwood_Mac", 7199563],
  ["Twenty_One_Pilots", 6970692]
]
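Because res.rows is a lazy Stream, rows can also be processed without materializing the whole result set; when num_rows exceeds :max_results, the paging requests happen as the stream is consumed. A sketch, continuing from the example above:

```elixir
# Transform rows lazily and take only what is needed; only the pages
# required to produce these rows are fetched from BigQuery.
res.rows
|> Stream.map(fn [title, views] -> {title, views} end)
|> Enum.take(3)
```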

With a parameterized query:

iex> credentials = File.read!("credentials.json") |> Jason.decode!()
iex> source = {:service_account, credentials, []}
iex> {:ok, _} = Goth.start_link(name: MyGoth, source: source, http_client: &Req.request/1)
iex> project_id = System.fetch_env!("PROJECT_ID")
iex> query = """
...> SELECT EXTRACT(YEAR FROM datehour) AS year, SUM(views) AS views
...>   FROM `bigquery-public-data.wikipedia.table_bands`
...>  WHERE EXTRACT(YEAR FROM datehour) <= 2021
...>    AND title = ?
...>  GROUP BY 1
...>  ORDER BY views DESC
...> """
iex> req = Req.new() |> ReqBigQuery.attach(goth: MyGoth, project_id: project_id)
iex> res = Req.post!(req, bigquery: {query, ["Linkin_Park"]}).body
%ReqBigQuery.Result{
  columns: ["year", "views"],
  job_id: "job_GXiJvALNsTAoAOJ39Eg3Mw94XMUQ",
  num_rows: 7,
  total_bytes_processed: 15686357820,
  rows: %Stream{}
}
iex> Enum.to_list(res.rows)
[[2017, 2895889], [2016, 1173359], [2018, 1133770], [2020, 906538], [2015, 860899], [2019, 790747], [2021, 481600]]
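Since ReqBigQuery.Result implements the Table.Reader protocol, the result can also be traversed by columns with the table package (https://hex.pm/packages/table). A sketch, assuming the table dependency is available and res is a result like the ones above:

```elixir
# Table.to_columns/1 returns a map of column name to a list of values,
# using the Table.Reader implementation of ReqBigQuery.Result.
columns = Table.to_columns(res)
columns["views"]
```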