erlarg - v1.0.0
An Erlang library that parses a list of arguments into structured data.
Useful for handling the options and parameters of an escript.
Installation
Add erlarg to the deps of your rebar.config:
{deps, [{erlarg, "1.0.0"}]}.
% or
{deps, [{erlarg, {git, "https://github.com/Eptwalabha/erlarg.git", {tag, "v1.0.0"}}}]}.
If you're building an escript, add erlarg to the list of apps to include in the binary:
{escript_incl_apps, [erlarg, …]}.
Fetch and compile the dependencies of your project:
rebar3 compile --deps_only
That's it, you're good to go.
How does it work?
Imagine this command:
./my-script --limit=20 -m 0.25 --format "%s%t" -o output.tsv -
The main/1 function of my-script will receive this list of arguments:
["--limit=20", "-m", "0.25", "--format", "%s%t", "-o", "output.tsv", "-"]
The function erlarg:parse will help you convert them into structured data:
main(Args) ->
    Syntax = {any, [erlarg:opt({"-l", "--limit"}, limit, int),
                    erlarg:opt({"-f", "--format"}, format, binary),
                    erlarg:opt("-o", file, string),
                    erlarg:opt("-", stdin),
                    erlarg:opt({"-m", "--max"}, max, float)
                   ]},
    {ok, {Result, RemainingArgs}} = erlarg:parse(Args, Syntax),
    ...
For this example, parse will return this proplist:
% Result
[{limit, 20},
{max, 0.25},
{format, <<"%s%t">>},
{file, "output.tsv"},
 stdin].
The functions erlarg:parse/2 and erlarg:parse/3 transform a list of arguments into structured data.
- Args: a list of arguments (generally what's given to main/1)
- Syntax: the syntax (or specification) that describes how the arguments should be parsed
- Aliases: [optional] the list of options, types or sub-syntaxes that are referenced in Syntax
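Because the result is a proplist, the standard proplists module can read it. The following is a minimal sketch that assumes a Result shaped like the one shown above; the module name result_demo and the default value 10 are illustrative assumptions, not part of erlarg.

```erlang
-module(result_demo).
-export([summary/1]).

%% Reads values out of a parse result with the standard proplists module.
%% Result is assumed to have the shape shown in the example above.
summary(Result) ->
    Limit = proplists:get_value(limit, Result, 10), % 10 is a hypothetical default
    File  = proplists:get_value(file, Result),      % undefined when -o is absent
    Stdin = proplists:get_bool(stdin, Result),      % true when the bare atom stdin is present
    {Limit, File, Stdin}.
```

With the Result from the example, result_demo:summary(Result) returns {20, "output.tsv", true}. Options without parameters show up as bare atoms, which is why proplists:get_bool/2 is convenient for them.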
Syntax
The syntax describes to the parser how to handle each argument (Args).
The parser consumes the arguments one by one while building the structured data.
A syntax can be any of the following:
- a type
- a named type
- a custom type
- an option
- an alias
- a sub-syntax (which is a syntax itself)
- a syntax operator
- a list of all the above
It can be pretty complex, but for now, let's keep it simple.
Imagine a fictional script print_n_time that takes a string and an integer as arguments:
# this will print the string "hello" 3 times
$ print_n_time hello 3
Here's the simplest spec needed to handle the arguments:
Syntax = [string, int].
erlarg:parse(Args, Syntax). % here Args = ["hello", "3"]
{ok, {["hello", 3], []}} % erlarg:parse/2 result
We explicitly asked the parser to handle two arguments: the first <u>must</u> be a string, the second <u>must</u> be an int.
If the parsing is successful, it will return the following tuple:
{ok, {Data, RemainingArgs}}.
Where Data is the structured data generated by the parser (["hello", 3]) and RemainingArgs is the list of arguments not consumed by the parser ([]).
Parsing failure
If the parser encounters a problem with an argument, it will fail and return the nature of the problem:
> erlarg:parse(["world"], [int]).
{error, {not_int, "world"}} % it failed to convert the word "world" into an int
or
> erlarg:parse(["one"], [string, string]). % expects two strings but only got one
{error, {missing, arg}}
[!TIP] These errors can be used to explain to the user what's wrong with the command they typed
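One possible way to turn these errors into user-facing messages is sketched below. Only the two error shapes shown above are matched explicitly; any other reason falls through to a generic clause. The module name cli_errors, the function names and the message wording are assumptions made for illustration.

```erlang
-module(cli_errors).
-export([explain/1, run/1]).

%% Turns a parser error reason into a message for the user.
%% Only the two reasons documented above are matched explicitly.
explain({not_int, Arg}) ->
    lists:flatten(io_lib:format("'~ts' is not an integer", [Arg]));
explain({missing, arg}) ->
    "an argument is missing";
explain(Other) ->
    %% Generic fallback for any reason not listed above.
    lists:flatten(io_lib:format("invalid arguments: ~p", [Other])).

%% Hypothetical entry point: parse, or print the problem and exit with code 1.
run(Args) ->
    case erlarg:parse(Args, [string, int]) of
        {ok, {Data, _RemainingArgs}} ->
            Data;
        {error, Reason} ->
            io:format(standard_error, "error: ~ts~n", [explain(Reason)]),
            halt(1)
    end.
```

Keeping explain/1 separate from run/1 means the messages can be checked without invoking the parser or stopping the VM.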
Remaining Args
Remaining args are the arguments not consumed by the parser when it terminates successfully.
If we add some extra arguments at the end of our command:
$ print_n_time hello 3 some extra arguments
This time, calling erlarg:parse/2 with the same syntax as before will give this result:
Syntax = [string, int].
{ok, {_, RemainingArgs}} = erlarg:parse(Args, Syntax).
["some", "extra", "arguments"] % RemainingArgs
The parser will consume the first two arguments; the remaining arguments will be returned in RemainingArgs.
[!NOTE] Having unconsumed arguments does not generate an error
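If a script should refuse extra arguments instead, the return value of the parser can be post-processed. The helper below is a small sketch and not part of erlarg; the module name strict_args and the error tag unexpected_args are hypothetical.

```erlang
-module(strict_args).
-export([reject_leftovers/1]).

%% Takes the return value of erlarg:parse/2 or erlarg:parse/3 and turns
%% leftover arguments into an error. unexpected_args is a hypothetical tag.
reject_leftovers({ok, {Data, []}}) ->
    {ok, Data};
reject_leftovers({ok, {_Data, Extra}}) ->
    {error, {unexpected_args, Extra}};
reject_leftovers({error, _Reason} = Error) ->
    %% Parser errors are passed through untouched.
    Error.
```

For the command above, strict_args:reject_leftovers(erlarg:parse(Args, [string, int])) would give {error, {unexpected_args, ["some", "extra", "arguments"]}}.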
Types
The parser can convert the argument to more types than just string and int.
Here are all the types currently available:
- int: cast the argument into an int
- float: cast the argument into a float (will cast int into float)
- number: cast the argument into an int; if that fails, cast it into a float
- string: return the given argument unchanged
- binary: cast the argument into a binary
- atom: cast the argument into an atom
- bool: return the boolean value of the argument
| syntax | arg | result | note |
|---|---|---|---|
| int | "1" | 1 | - |
| int | "1.2" | error | not an int |
| float | "1.2" | 1.2 | - |
| float | "1" | 1.0 | cast int into float |
| float | "1.234e2" | 123.4 | - |
| number | "1" | 1 | - |
| number | "1.2" | 1.2 | - |
| string | "abc" | "abc" | - |
| binary | "äbc" | <<"äbc"/utf8>> | use unicode:characters_to_binary |
| atom | "super-top" | 'super-top' | - |
The bool conversion:
| arg | bool | note |
|---|---|---|
| "true" | true | case insensitive |
| "yes" | true | |
| "abcd" | true | any non-empty string |
| "1" | true | |
| "0.00001" | true | |
| "false" | false | case insensitive |
| "no" | false | |
| "" | false | empty-string |
| "0" | false | |
| "0.0" | false |
[!TIP] Converting an argument into string, binary, bool or atom will always succeed.
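The rules in the bool table can be summarised as: an argument is false when it is empty, equal to "false" or "no" (in any case), or a number equal to zero; everything else is true. The sketch below re-implements that reading of the table for illustration only. It is not erlarg's own code, and the module name bool_sketch is an assumption.

```erlang
-module(bool_sketch).
-export([to_bool/1]).

%% Re-implements the bool table above for illustration only;
%% this is not the library's implementation.
to_bool(Arg) ->
    case string:lowercase(Arg) of
        ""      -> false;
        "false" -> false;
        "no"    -> false;
        Lower   -> not is_zero(Lower)
    end.

%% True when the string is a number equal to zero ("0", "0.0", ...).
is_zero(Str) ->
    case string:to_float(Str) of
        {Float, []} when is_float(Float) ->
            Float == 0;
        _ ->
            case string:to_integer(Str) of
                {Int, []} when is_integer(Int) -> Int =:= 0;
                _ -> false
            end
    end.
```

Every row of the table can be checked against it, for example bool_sketch:to_bool("0.00001") is true and bool_sketch:to_bool("0.0") is false.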
If you need a more complicated type, see the chapter on Custom types.
Naming parameters
Converting an argument into a specific type is important, but it doesn't really help us understand what these values are for:
> Syntax = [string, int].
> {ok, {Result, _}} = erlarg:parse(["hello", "3"], Syntax).
["hello", 3]. % Result
To avoid this issue, you can give a name to the parsed parameters with the following syntax:
{Name :: atom(), Type :: base_type()}
If we rewrite the syntax as such:
Syntax = [{text, string}, {nbr, int}].
{ok, {Result, _}} = erlarg:parse(["hello", "3"], Syntax).
[{text, "hello"}, {nbr, 3}] % Result
You can even name a list of parameters if you want:
Syntax = [{a, [string, {a2, float}]}, {b, binary}],
{ok, {Result, _}} = erlarg:parse(["abc", "2.3", "bin"], Syntax).
[{a, ["abc", {a2, 2.3}]}, {b, <<"bin">>}] % Result
Options
Naming and casting parameters into types is neat, but most programs use options. An option is an argument that usually (not always…) starts with a dash and has zero or more parameters.
$ date -d --utc --date=STRING
An option can have several formats: a short one (a dash followed by a letter, e.g. -v) and/or a long one (a double dash followed by a word, e.g. --version).
This table summarizes the formats handled/recognized by the parser:
| format | note |
|---|---|
| -s | |
| -s <u>VALUE</u> | |
| -s<u>VALUE</u> | same as -s VALUE |
| -abc <u>VALUE</u> | same as -a -b -c VALUE |
| -abc<u>VALUE</u> | same as -a -b -c VALUE |
| --long | |
| --long <u>VALUE</u> | |
| --long=<u>VALUE</u> | |
In this chapter, we'll see how to tell the parser to recognise three kinds of options:
- option without parameter
- option with parameters
- option with sub options
option without parameter
$ grep -v "bad"
$ grep --invert-match "bad"
We can define this option with erlarg:opt like so:
> Syntax = [erlarg:opt({"-v", "--invert-match"}, invert_match)].
> {ok, {Result, _}} = erlarg:parse(["-v"], Syntax).
[invert_match] % Result
The first parameter of erlarg:opt is the option:
{"-s", "--long"} % short and long options
"-s" % only short option
{"-s", undefined} % same as above
{undefined, "--long"} % only long option
The second parameter is the name of the option, in this case invert_match.
option with parameter(s)
Options can have parameters:
$ date --date 'now -3 days'
$ date --date='now -3 days'
$ date -d'now -3 days'
> Syntax = [erlarg:opt({"-d", "--date"}, date, string)].
> {ok, {Result, _}} = erlarg:parse(["--date", "now -3 days"], Syntax).
[{date, "now -3 days"}] % Result
The third parameter is the syntax of the parameters expected by the option. In this case, after matching the argument --date, the option expects a string ("now -3 days").
Maybe one of the options of your program expects two parameters? No problem:
erlarg:opt({"-d", "--dimension"}, dimension, [int, string]).
[{dimension, [3, "inch"]}] % Result for "-d 3 inch"
You can even use names:
erlarg:opt({"-d", "--dimension"}, dimension, [{size, int}, {unit, string}]).
[{dimension, [{size, 3}, {unit, "inch"}]}] % Result for "-d 3 inch"
option with sub-option(s):
Because the third parameter is a syntax, and because an option is itself a syntax, you can put options inside an option:
$ my-script --opt1 -a "param of a" -b "param of opt1" --opt2 …
In this fictional program, the option --opt1 has two sub-options (-a, which expects a parameter, and -b, which doesn't). We can define opt1 this way:
Opt1 = erlarg:opt({"-o", "--opt1"},             % option
                  opt1,                         % option's name
                  [erlarg:opt("-a", a, string), % sub-option 1
                   erlarg:opt("-b", b),         % sub-option 2
                   {value, string}              % the param under the name 'value'
                  ]).
{ok, {Result, _}} = erlarg:parse(["--opt1", "-a", "abc", "-b", "def"], Opt1).
[{opt1, [{a, "abc"}, b, {value, "def"}]}] % Result
Well… that's quite unreadable… Fortunately, you can use Aliases to avoid this mess.
Aliases
Aliases let you define all your options, sub-syntaxes and custom types in a map. This helps keep the Syntax clear and readable.
Aliases = #{
    option1 => erlarg:opt({"-o", "--opt1"}, opt1, [opt_a, opt_b, {value, string}]),
    option2 => erlarg:opt({undefined, "--opt2"}, opt2),
    opt_a => erlarg:opt("-a", a, string),
    opt_b => erlarg:opt("-b", b)
},
Syntax = [option1, option2],
{ok, {Result, _}} = erlarg:parse(["--opt1", "-a", "abc", "-b", "def", "--opt2"],
                                 Syntax, Aliases).
[{opt1, [{a, "abc"}, b, {value, "def"}]}, opt2] % Result
Here Syntax is a list of two aliases, option1 and option2.
Syntax operators
Operators tell the parser how to handle a list of syntaxes.
sequence operator
Take the following syntax:
[erlarg:opt({"-d", "--date"}, date, string), erlarg:opt({"-u", "--utc"}, utc)]
It would parse this command without problem:
$ date -d "now -3 days" --utc # yay!
But it will fail with this one:
$ date --utc --date="now -3 days" # boom!
Why? Aren't these two commands identical?
That's because the parser treats a list of syntaxes as a sequence operator:
[syntax1, syntax2, …]
A sequence expects the arguments to match in the same order as the elements of the list (the first argument must match syntax1, the second syntax2, …). If any element fails, the whole sequence fails.
All elements of the list must succeed in order for the operator to succeed.
| syntax | args | result | note |
|---|---|---|---|
| [int, string] | ["1", "a"] | [1, "a"] | |
| [int] | ["1", "a"] | [1] | remaining: ["a"] |
| [int, int] | ["1", "a"] | error | "a" isn't an int |
| [int, string, int] | ["1", "a"] | error | missing a third argument |
So how do we parse arguments when we're not sure of their order? Moreover, some options are… optional! What do we do?
That's where the any operator comes into play.
any operator
format:
{any, [syntax1, syntax2, …]}
The parser will keep consuming arguments as long as one of the syntaxes matches. If an element of the syntax fails, the operator fails.
| syntax | args | result | note |
|---|---|---|---|
| {any, [int]} | ["1", "2", "abc"] | [1, 2] | remaining: ["abc"] |
| {any, [{key, int}]} | ["1", "2"] | [{key, 1}, {key, 2}] | |
| {any, [int, {s, string}]} | ["1", "2", "abc", "3"] | [1, 2, {s, "abc"}, 3] | |
| {any, [string]} | ["1", "-o", "abc", "3"] | ["1", "-o", "abc", "3"] | even if "-o" is an option |
No matter the number of matching elements, any will always succeed. If nothing matches, no arguments will be consumed.
[!NOTE] Keep in mind that if the list given to any contains types like string or binary, it will consume all the remaining arguments. With {any, [string, custom_type]}, custom_type will never be executed because the type string will always consume the argument.
first operator
format:
{first, [syntax1, syntax2, …]}
The parser will return the first element of the syntax to succeed.
It'll fail if no element matches.
The following table uses Args = ["1", "2", "a", "3", "b"] (the any rows are included for comparison):
| syntax | result | remaining |
|---|---|---|
| {first, [int]} | [1] | ["2", "a", "3", "b"] |
| {first, [{opt, int}]} | [{opt, 1}] | ["2", "a", "3", "b"] |
| {any, [int, {b, binary}]} | [1, 2, {b, <<"a">>}, 3, {b, <<"b">>}] | [] |
| {any, [string]} | ["1", "2", "a", "3", "b"] | [] |
Custom types
Sometimes, you need to perform some operations on an argument or do more complex verifications. This is what custom types are for.
A custom type is a function that takes a list of arguments and returns the formatted / checked value to the parser:
-spec custom_type(Args) -> {ok, Value, RemainingArgs} | Failure when
      Args :: args(),
      Value :: any(),
      RemainingArgs :: args(),
      Failure :: any().
- Args: the list of arguments not yet consumed by the parser
- Value: the value you want to return to the parser
- RemainingArgs: the list of arguments your function didn't consume
- Failure: some explanation of why the function didn't accept the argument
Example 1:
Let's say your script has an option -f FILE where FILE must be an existing file. In this case the type string won't be enough. You could write your own function to perform this check:
existing_file([File | RemainingArgs]) ->
    case filelib:is_regular(File) of
        true -> {ok, File, RemainingArgs};
        _ -> {not_a_file, File}
    end.
To use your custom type:
Spec = #{
    syntax => {any, [file]},
    definitions => #{
        file => erlarg:opt({"-f", "--file"}, existing_file),
        existing_file => fun existing_file/1
    }
}.
or directly as a syntax:
Spec = {any, [{file, erlarg:opt({"-f", "--file"}, fun existing_file/1)}]}.
Example 2:
In this case, your script needs to fetch the information of a particular user from a config file with the option --consult USERS_FILE USER_ID, where USERS_FILE is the file containing the users' data and USER_ID is the id of the user:
get_user_config([DatabaseFile, UserID | RemainingArgs]) ->
    case file:consult(DatabaseFile) of
        {ok, Users} ->
            case proplists:get_value(UserID, Users, not_found) of
                not_found -> {user_not_found, UserID};
                UserData -> {ok, UserData, RemainingArgs}
            end;
        Error -> {cannot_consult, DatabaseFile, Error}
    end;
get_user_config(_) ->
    {badarg, missing_arguments}.
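Since a custom type is an ordinary function, it can be written and checked on its own before being handed to the parser. As a further sketch of the same contract, here is a hypothetical port type that accepts integers between 1 and 65535. The module name custom_types_demo and the failure terms not_a_port and missing_argument are assumptions chosen for this example.

```erlang
-module(custom_types_demo).
-export([port/1]).

%% Custom type following the contract described above:
%% {ok, Value, RemainingArgs} on success, any other term on failure.
%% not_a_port and missing_argument are hypothetical failure terms.
port([Arg | RemainingArgs]) ->
    case string:to_integer(Arg) of
        {Port, []} when is_integer(Port), Port >= 1, Port =< 65535 ->
            %% The whole argument is an integer in the valid range.
            {ok, Port, RemainingArgs};
        _ ->
            {not_a_port, Arg}
    end;
port([]) ->
    missing_argument.
```

Calling custom_types_demo:port(["8080", "rest"]) returns {ok, 8080, ["rest"]}, while custom_types_demo:port(["http"]) returns {not_a_port, "http"}. Such a function could then be referenced in a syntax the same way as existing_file above.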