Vessel v0.8.0
Vessel
The main interface for interacting with Vessel from within application code.
This module contains many utilities related to interacting with the Vessel Job context, as well as convenience functions for logging and for writing values through to the next Job step.
Every function in this module should take the Job context as its first parameter, in order to future-proof against new configuration options being added.
Summary
Functions
Creates a new Vessel context using the provided pairs
Retrieves a value from the Job configuration
Retrieves a meta key and value from the context
Retrieves a private key and value from the context
Inspects a value and outputs to the Hadoop logs
Outputs a message to the Hadoop logs
Modifies a top level field in the Vessel context
Sets a variable in the Job configuration
Stores a meta key and value inside the context
Stores a private key and value inside the context
Updates a Hadoop Job counter
Updates the status of the Hadoop Job
Writes a key/value Tuple to the Job context
Writes a value to the Job context for a given key
Functions
Creates a new Vessel context using the provided pairs.
The pairs provided overwrite the defaults. Context must be created this way, as defaults can’t be provided at compile time (because things like :conf use runtime values).
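For illustration, a minimal sketch of creating a context - assuming the constructor summarized above is exposed as Vessel.context/1, with an illustrative :conf override:

```elixir
# Hypothetical sketch: Vessel.context/1 is an assumed name, and the
# :conf contents here are illustrative only.
ctx = Vessel.context(conf: %{"job_name" => "word_count"})
```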
Retrieves a value from the Job configuration.
Configuration values are treated as environment variables to conform to Hadoop Streaming. We clone the environment into the context on startup, so that configuration reads and writes go through the Job variables rather than the process environment.
We only allow lowercase variables to enter the Job configuration, as this is the model used by Hadoop Streaming. This also filters out a lot of noise from default shell variables polluting the configuration (e.g. $HOME).
Using environment variables means that there’s a slight chance that you’ll receive a value from the env which isn’t actually a configuration variable, so please validate appropriately.
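As a sketch - assuming the accessor is exposed as Vessel.get_conf/3 with an optional default - reading one of the variables Hadoop Streaming provides might look like:

```elixir
# Assumed Vessel.get_conf/3; Hadoop Streaming exposes Job settings as
# lowercased env variables, e.g. mapreduce_map_input_file.
input = Vessel.get_conf(ctx, "mapreduce_map_input_file", "unknown")
```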
Retrieves a meta key and value from the context.
This should not be used outside of the library modules.
Retrieves a private key and value from the context.
An optional default value can be provided to be returned if the key does not exist in the private context. If not provided, nil will be used.
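A minimal sketch, assuming the accessor is exposed as Vessel.get_private/3:

```elixir
# Assumed Vessel.get_private/3; returns the stored value, or the given
# default (nil when omitted) if the key is absent.
count = Vessel.get_private(ctx, :count, 0)
```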
Inspects a value and outputs to the Hadoop logs.
You can pass your value as either the first or second argument, as long as the other one is a Vessel context - this is to make it easier to chain, in the same way you would with IO.inspect/2.
This function uses :stderr, as Hadoop is listening to all :stdio output as the results of your mapper - so going via :stdio would corrupt the Job values.
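A pipeline sketch, assuming the function is exposed as Vessel.inspect/2 and returns the inspected value as IO.inspect/2 does:

```elixir
# Assumed Vessel.inspect/2; the context can be either argument, so it
# drops into a pipeline. process/1 is a hypothetical next step.
value
|> Vessel.inspect(ctx)
|> process()
```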
Outputs a message to the Hadoop logs.
This function uses :stderr, as Hadoop is listening to all :stdio output as the results of your mapper - so going via :stdio would corrupt the Job values.
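A one-line sketch, assuming the function is exposed as Vessel.log/2:

```elixir
# Assumed Vessel.log/2; the message lands in the Hadoop logs via :stderr,
# keeping :stdio clean for mapper/reducer output.
Vessel.log(ctx, "mapper starting up")
```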
Modifies a top level field in the Vessel context.
This should not be used outside of the library itself, as it can error when used incorrectly (for example, with invalid keys).
Sets a variable in the Job configuration.
This operates in a similar way to put_private/3, except that it should only be used for Job configuration values (as a semantic difference).
This does not set the variable in the environment, as we clone the environment into the Job configuration on startup to avoid polluting the environment.
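A sketch, assuming the function is exposed as Vessel.put_conf/3 by analogy with put_private/3:

```elixir
# Assumed Vessel.put_conf/3; updates only the configuration cloned into
# the context, never the real process environment.
ctx = Vessel.put_conf(ctx, "my_setting", "enabled")
```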
Stores a meta key and value inside the context.
This should not be used outside of the library modules.
Stores a private key and value inside the context.
This is where you can persist values between steps in the Job; you can think of it as the Job state. You should only change things in this Map, rather than placing values at the top level of the Job context.
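A sketch using the put_private/3 arity referenced on this page, with an illustrative key:

```elixir
# put_private/3 as referenced above; :count is an illustrative key used
# here to carry state between Job steps.
ctx = Vessel.put_private(ctx, :count, count + 1)
```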
update_counter(Vessel.t, binary, binary, number) :: :ok
Updates a Hadoop Job counter.
This is a utility function to emit a Job counter in Hadoop Streaming. You may provide a custom amount to increment by, which defaults to 1 if not provided.
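Following the update_counter(Vessel.t, binary, binary, number) :: :ok spec above, a call might look like:

```elixir
# Emits a counter line in the form Hadoop Streaming expects,
# i.e. "reporter:counter:<group>,<counter>,<amount>".
:ok = Vessel.update_counter(ctx, "my_app", "records_read", 1)
```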
Updates the status of the Hadoop Job.
This is a utility function to emit status in Hadoop Streaming.
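A sketch, assuming the function is exposed as Vessel.update_status/2:

```elixir
# Assumed Vessel.update_status/2; Hadoop Streaming recognizes status
# lines of the form "reporter:status:<message>".
Vessel.update_status(ctx, "finished parsing input")
```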
Writes a key/value Tuple to the Job context.
To stay compatible with Hadoop Streaming, this will emit to :stdio in the required format.
Writes a value to the Job context for a given key.
To stay compatible with Hadoop Streaming, this will emit to :stdio in the required format. The separator can be customized by setting custom separators inside the :meta map, and is modified as such by the mapper/reducer phases.
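A sketch of both write forms summarized above - the Tuple form and the key/value form are assumed to be write/2 and write/3 respectively:

```elixir
# Assumed Vessel.write/2 and Vessel.write/3; each emits
# "key<separator>value" on :stdio for Hadoop Streaming to consume.
Vessel.write(ctx, {"word", 1})
Vessel.write(ctx, "word", 1)
```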