View Source ReqAthena (ReqAthena v0.3.0)
Req
plugin for AWS Athena.
ReqAthena makes it easy to make Athena queries and save the results into S3 buckets.
By default, ReqAthena
will query results and use the default output format,
which is CSV. To change that, you can use the :format
option documented bellow.
Summary
Functions
@spec new(keyword()) :: Req.Request.t()
Builds a new %Req.Request{}
for Athena requests.
Request Options
:access_key_id
- Required. The Access Key ID from AWS credentials.:secret_access_key
- Required. The Secret Access Key from AWS credentials.:token
- Optional. The Session Token from AWS credentials.:region
- Required. The AWS region where AWS Athena is installed.:database
- Required. The AWS Athena database name.:output_location
- Optional. The S3 URL location to output AWS Athena query results. Results will be saved as Parquet and loaded with Explorer only if this option is given.:workgroup
- Conditional. The AWS Athena workgroup.:cache_query
- Optional. Forces a non-cached result from AWS Athena.:format
- Optional. The output format. Can be one of::none
(default) - return decoded API response from Athena.:csv
- return contents of the CSV file.:json
- return contents of the JSON file.Note: Req by default automatically decodes JSON response body (
Req.Steps.decode_body/1
step) and to prevent it from doing so, setdecode_body: false
.:explorer
- return contents in parquet format, lazy loaded into Explorer data frame.
There are some limitations when using the
:json
and:explorer
format. First, you need to install Explorer in order to use the:explorer
format. Second, when using these format, you always need to provide a different output location. See theUNLOAD
command docs for more details.:output_compression
- Optional. Sets the Parquet compression format and level for the output when using the Explorer output format. This can be a string, like"gzip"
, or a tuple with{format, level}
, like:{"ZSTD", 4}
. By default this isnil
, which means that for Parquet (the format that Explorer uses) this is going to be"gzip"
.
There is a limitation of Athena that requires the :output_location
to be present
for every query that outputs to a format other than "CSV". So we append "results"
to the :output_location
to make the partition files be saved there.
Conditional fields must always be defined, and can be one of the fields or both.
@spec query(Req.Request.t(), binary(), list(), Keyword.t()) :: {:ok, Req.Response.t()} | {:error, Exception.t()}
Performs a query against the Athena API.
The SQL query can container ?
placeholders and sql_query_params
is a list of corresponding values.
This function accepts the same options as new/1
.
Examples
With plain query:
iex> opts = [
...> access_key_id: System.fetch_env!("AWS_ACCESS_KEY_ID"),
...> secret_access_key: System.fetch_env!("AWS_SECRET_ACCESS_KEY"),
...> region: System.fetch_env!("AWS_REGION"),
...> database: "default",
...> output_location: System.fetch_env!("AWS_ATHENA_OUTPUT_LOCATION")
...> ]
iex> req = ReqAthena.new(opts)
iex> query = "SELECT id, type, tags, members, timestamp, visible FROM planet WHERE id = 470454 and type = 'relation'"
iex> ReqAthena.query!(req, query, [], format: :json).body
%{
"id" => 470454,
"members" => [
%{"ref" => 670007839, "role" => "", "type" => "node"},
%{"ref" => 670007840, "role" => "", "type" => "node"}
],
"tags" => %{
"name" => "Mérignac A",
"network" => "NTF-5",
"ref" => "17229A",
"site" => "geodesic",
"source" => "©IGN 2010 dans le cadre de la cartographie réglementaire",
"type" => "site",
"url" => "http://geodesie.ign.fr/fiches/index.php?module=e&action=fichepdf&source=carte&sit_no=17229A"
},
"timestamp" => "2017-01-21 12:51:34",
"type" => "relation",
"visible" => true
}
With parameterized query:
iex> opts = [
...> access_key_id: System.fetch_env!("AWS_ACCESS_KEY_ID"),
...> secret_access_key: System.fetch_env!("AWS_SECRET_ACCESS_KEY"),
...> region: System.fetch_env!("AWS_REGION"),
...> database: "default",
...> output_location: System.fetch_env!("AWS_ATHENA_OUTPUT_LOCATION")
...> ]
iex> req = ReqAthena.new(opts)
iex> query = "SELECT id, type FROM planet WHERE id = ? and type = ?"
iex> ReqAthena.query!(req, query, [239_970_142, "node"], format: :json).body
[%{"id" => 239970142, "type" => "node"}]
@spec query!(Req.Request.t(), binary(), list(), Keyword.t()) :: Req.Response.t()
Same as query/4
, but raises in case of error.