plymio_enum v0.1.0 Plymio.Enum.Transform
Building, Composing and Applying Transform Functions for Enumerables.
A transform function normally takes one argument — usually an
enumerable — and applies a pipeline of discrete transforms, returning (again usually)
another enumerable.
Each discrete transform is usually the name of a Stream or Enum function (e.g. :map, :filter, :group_by, etc).
A transform function tries to be as lazy as possible, preferring Stream over Enum and, where possible, returning a lazy enumerable.
A macro is provided (defenumtransform/1) to define a named function from a pipeline of discrete transforms.
The companion module Plymio.Enum.Transform.Dictionary supports a map-like dictionary of named transforms. It also supports the composition of higher-level transforms from transforms in the dictionary, standalone transforms, and/or new pipelines. Composed transforms can be saved in the dictionary.
Building a Transform Function
build/1 builds a transform function from a pipeline of discrete transforms.
Each discrete transform is (usually) the name of a function
supported by Stream and/or Enum (e.g. :filter, :map, :reject,
:group_by, etc), together with the arguments taken by the function.
Each discrete transform in the pipeline results in a call to
Stream (or Enum when the transform is Enum-only e.g.
Enum.group_by/2). The calls to Stream / Enum are then
composed into a single function.
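The composition step can be pictured with a small sketch. It uses only Stream and Enum and is not the library's actual implementation; the names steps and composed are hypothetical. Each discrete transform is treated as an arity 1 function, and the list of them is folded into a single function:

```elixir
# Hypothetical sketch: each step takes an enumerable and returns an enumerable.
steps = [
  fn enum -> Stream.filter(enum, &is_number/1) end,
  fn enum -> Stream.map(enum, &(&1 * &1)) end
]

# Fold the steps, left to right, into one arity 1 function.
composed = fn enum ->
  Enum.reduce(steps, enum, fn step, acc -> step.(acc) end)
end

# Every step is a Stream call, so nothing runs until the result is realised.
result = [1, :atom, 2, 3] |> composed.() |> Enum.to_list()
# result is [1, 4, 9]
```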
In this example, all the discrete transforms can be lazily applied
(i.e. are supported by Stream) so a Stream is returned. (The
stream can be realised using Enum.to_list/1):
iex> fun = [filter: fn v -> is_number(v) end,
...> filter: fn v -> v > 0 end,
...> map: fn v -> v * v end,
...> map: fn v -> v + 42 end,
...> reject: fn v -> v < 45 end,
...> reject: fn v -> v > 50 end]
...> |> build
...> stream = [-1, make_ref(), 1, :atom, 2, "string", 3, &(&1)] |> fun.()
...> stream |> Enum.to_list
[46]
In this example, the last transformation is Enum.group_by/2 which
always returns a Map.
iex> fun = [filter: fn {_k,v} -> is_number(v) end,
...> map: fn {k,v} -> {k,v*v} end,
...> group_by: fn {k,_v} -> k |> to_string end]
...> |> build
...> [a: 1, b: 2, c: 3, d: :atom] |> fun.()
%{"a" => [a: 1], "b" => [b: 4], "c" => [c: 9]}
Arguments to each discrete transform must be given in the expected
order. This example includes a final Enum.reduce/3 with zero as
the initial value of the accumulator.
iex> fun = [filter: fn {_k,v} -> is_number(v) end,
...> map: fn {k,v} -> {k,v*v} end,
...> group_by: fn {k,_v} -> k |> to_string end,
...> reduce: [0, fn {_k,v},s -> (Keyword.values(v) |> Enum.sum) + s end]]
...> |> build
...> [a: 1, b: 2, c: 3, d: :atom] |> fun.()
14
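For comparison, the same result can be reached by writing the calls out by hand. This sketch assumes only the standard Stream and Enum functions and makes the argument order explicit: the initial accumulator 0 comes before the reducer, matching the order used in the reduce entry of the pipeline.

```elixir
result =
  [a: 1, b: 2, c: 3, d: :atom]
  |> Stream.filter(fn {_k, v} -> is_number(v) end)
  |> Stream.map(fn {k, v} -> {k, v * v} end)
  # group_by is Enum-only, so the stream is realised into a Map here
  |> Enum.group_by(fn {k, _v} -> to_string(k) end)
  # accumulator first, then the reducer, as Enum.reduce/3 expects
  |> Enum.reduce(0, fn {_k, v}, s -> Enum.sum(Keyword.values(v)) + s end)

# result is 14
```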
Composing Prebuilt Transform Functions
Prebuilt transform functions can be composed just by including them in the pipeline of discrete transforms passed to build/1.
In this example a new transform function is composed from three separate, prebuilt transform functions and a final subpipeline ([map: fn v -> v - 4 end]), which is built recursively:
iex> filter_fun = [filter: [fn v -> is_number(v) end, fn v -> v > 0 end]]
...> |> build
...> mapper_fun = [map: [fn v -> v * v end, fn v -> v + 42 end]]
...> |> build
...> reject_fun = [reject: [fn v -> v < 45 end, fn v -> v > 50 end]]
...> |> build
...> fun = [filter_fun, mapper_fun, reject_fun, [map: fn v -> v - 4 end]] |> build
...> stream = [-1, make_ref(), 1, :atom, 2, "string", 3, &(&1)] |> fun.()
...> stream |> Enum.to_list
[42]
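One way to picture the recursive handling of subpipelines is the sketch below. ComposeSketch and compose/1 are hypothetical names, and the steps are plain arity 1 functions rather than the library's discrete transforms, so this only illustrates the idea of building nested lists recursively.

```elixir
defmodule ComposeSketch do
  # A prebuilt function is used as-is.
  def compose(fun) when is_function(fun, 1), do: fun

  # A list (possibly nested) is built recursively, then folded into one function.
  def compose(funs) when is_list(funs) do
    built = Enum.map(funs, &compose/1)
    fn enum -> Enum.reduce(built, enum, fn f, acc -> f.(acc) end) end
  end
end

filter_fun = fn enum -> Stream.filter(enum, &(is_number(&1) and &1 > 0)) end
mapper_fun = fn enum -> Stream.map(enum, &(&1 * &1 + 42)) end
reject_fun = fn enum -> Stream.reject(enum, &(&1 < 45 or &1 > 50)) end

# The final element is a nested list, standing in for a subpipeline.
fun = ComposeSketch.compose([filter_fun, mapper_fun, reject_fun, [fn e -> Stream.map(e, &(&1 - 4)) end]])

result = [-1, 1, :atom, 2, "string", 3] |> fun.() |> Enum.to_list()
# result is [42]
```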
Using Multiple Functions in Discrete Transforms
Usually multiple functions can be used in each discrete transform.
For example the first example above can be rewritten with a list of functions for each discrete transform:
iex> fun = [filter: [fn v -> is_number(v) end, fn v -> v > 0 end],
...> map: [fn v -> v * v end, fn v -> v + 42 end],
...> reject: [fn v -> v < 45 end, fn v -> v > 50 end]]
...> |> build
...> stream = [-1, make_ref(), 1, :atom, 2, "string", 3, &(&1)] |> fun.()
...> stream |> Enum.to_list
[46]
Discrete transforms with multiple arguments (e.g. Stream.map_every/3) can also use multiple functions. In this example every other element in the enumerable is mapped. Note the two functions in a list.
Note: Stream.map_every/3 always maps the zeroth element of the enumerable.
iex> fun = [map_every: [2, [fn v -> v * v end, fn v -> v + 42 end]]]
...> |> build
...> stream = [1, 2, 3, 4, 5] |> fun.()
...> stream |> Enum.to_list
[43, 2, 51, 4, 67]
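The same result can be reproduced by hand, which shows where the extra argument sits. In this sketch (standard library only; combined is a hypothetical name) the two functions are first folded into one mapper, which is then passed to Stream.map_every/3 after the nth argument:

```elixir
# Apply the functions in order, feeding each result to the next.
combined = fn value ->
  Enum.reduce([&(&1 * &1), &(&1 + 42)], value, fn f, v -> f.(v) end)
end

# nth = 2 maps elements 0, 2 and 4; the others pass through untouched.
result = [1, 2, 3, 4, 5] |> Stream.map_every(2, combined) |> Enum.to_list()
# result is [43, 2, 51, 4, 67]
```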
When multiple functions are given, they have to be “combined” according to their purpose (e.g. filter):
Combining Multiple Functions: filter
Multiple filter-type functions AND together the results of applying each one to the value being tested (using Enum.all?/2) e.g.
iex> fun = fn value ->
...> [fn v -> is_number(v) end, fn v -> v > 0 end]
...> |> Enum.all?(fn f -> f.(value) end)
...> end
...> [-1, make_ref(), 1, :atom, 2, "string", 3, &(&1)] |> Enum.filter(fun)
[1, 2, 3]
Combining Multiple Functions: reject
Multiple reject-type functions are OR-ed together using Enum.any?/2 e.g.
iex> fun = fn value ->
...> [fn v -> v < 45 end, fn v -> v > 50 end]
...> |> Enum.any?(fn f -> f.(value) end)
...> end
...> [43, 46, 51] |> Enum.reject(fun)
[46]
Combining Multiple Functions: map
Multiple map-type functions are combined using Enum.reduce/3, with each function passed the result of the previous one, e.g.
iex> fun = fn value ->
...> [fn v -> v * v end, fn v -> v + 42 end]
...> |> Enum.reduce(value, fn f,v -> f.(v) end)
...> end
...> [1, 2, 3] |> Enum.map(fun)
[43, 46, 51]
Combining Multiple Functions: reduce
reduce functions are normally arity 2, taking the current value from the enumerable together with the accumulator.
This constraint is relaxed when multiple functions are used: each function can be arity 1 or 2. An arity 1 function is passed just the result of the previous function, with no accumulator (just like a map). The code to combine multiple functions looks something like this.
Note that, for each value of the enumerable, each function is passed the same accumulator.
iex> fun1 = fn v, s -> v + s end
...> fun2 = fn v -> v - 42 end
...> fun3 = fn v, s -> v * s end
...> fun = fn value, acc ->
...> [fun1, fun2, fun3]
...> |> Enum.reduce(value, fn
...> f,v when is_function(f, 2) -> f.(v,acc)
...> f,v when is_function(f, 1) -> f.(v)
...> end)
...> end
...> [1, 2, 3] |> Enum.reduce(7, fun)
4375094500
Discrete Transform Forms
In the examples above the pipeline of discrete transforms was a Keyword where the keys were Stream and/or Enum functions, and the values their additional arguments (without the enumerable).
More generally the definition of each discrete transformation can have a number of forms.
It's worth stressing that the discrete transform pipeline is always a List but not always a Keyword.
Discrete Transform Forms: {name,args} when is_atom(name)
This is the form used so far. The name (an Atom) must be a function of Stream or Enum.
iex> fun = [filter: fn {_k,v} -> is_number(v) end]
...> |> build
...> [a: 1, b: 2, c: 3, d: :atom] |> fun.() |> Enum.to_list
[a: 1, b: 2, c: 3]
When the discrete transform takes no arguments other than the enumerable, the args in the 2tuple can be nil or an empty list.
iex> fun = [count: nil]
...> |> build
...> [a: 1, b: 2, c: 3, d: :atom] |> fun.()
4
Discrete Transform Forms: name when is_atom(name)
When name is an Atom, it must be a function of Stream or Enum that only takes an enumerable; no other arguments.
iex> fun = [:count]
...> |> build
...> [a: 1, b: 2, c: 3, d: :atom] |> fun.()
4
Using this form means the other discrete transforms must use the explicit tuple syntax (e.g. {name,args}), otherwise the Elixir compiler will complain since the pipeline is no longer a Keyword:
iex> fun = [{:map, fn {_k,v} -> v*v end}, :sum]
...> |> build
...> [a: 1, b: 2, c: 3] |> fun.()
14
Discrete Transform Forms: {mod,fun_name,args}
The general purpose MFA (module,function,arguments) form used with Kernel.apply/3 is supported. The enumerable is prepended to the arguments ([enum | arguments]).
This example uses the MFA form of [map: &(&1)]
iex> fun = [{Stream, :map, [&(&1)]}] |> build
iex> [a: 1, b: 2, c: 3, d: :atom] |> fun.() |> Enum.to_list
[a: 1, b: 2, c: 3, d: :atom]
However, an MFA can call any module and function, not just Stream or Enum ones. For example List.duplicate/2 is used to create an enumerable to feed the map squaring each value, with a final :sum to add up all the values.
iex> fun = [{List, :duplicate, [3]}, {:map, fn v -> v*v end}, :sum]
...> |> build
...> 42 |> fun.()
5292
Here is another example combining Stream / Enum 2tuples with an
MFA. Note though, the result of the filter, map and reject
discrete transforms will be a Stream. List functions require a
list as input, hence the :to_list in the transform pipeline just
before the insert_at.
Since the pipeline definition is no longer a
Keyword, it must use the explicit 2tuple syntax.
iex> fun = [{:filter, [fn {_k,v} -> is_number(v) end, fn {_k,v} -> v > 0 end]},
...> {:map, [fn {k,v} -> {k, v * v} end, fn {k,v} -> {k, v + 42} end]},
...> {:reject, [fn {_k,v} -> v < 45 end, fn {_k,v} -> v > 50 end]},
...> :to_list,
...> {List, :insert_at, [2, {:e, "five"}]}]
...> |> build
...> [a: 1, b: 2, c: 3, d: :atom] |> fun.() |> Enum.to_list
[b: 46, e: "five"]
Discrete Transform Forms: fun when is_function(fun)
The transform can also be a function and is passed the result of the previous transforms:
iex> fun = [{:filter, [fn {_k,v} -> is_number(v) end, fn {_k,v} -> v > 0 end]},
...> {:map, [fn {k,v} -> {k, v * v} end, fn {k,v} -> {k, v + 42} end]},
...> # a transform function
...> fn enum -> enum |> Stream.map(fn {k,v} -> {k |> to_string, v} end) end,
...> {:into, %{}}]
...> |> build
...> [a: 1, b: 2, c: 3, d: :atom] |> fun.()
%{"a" => 43, "b" => 46, "c" => 51}
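To summarise the four forms, here is a minimal sketch of how each one might be resolved to an arity 1 function. FormsSketch, resolve/1 and its build/1 are hypothetical and are not the library's code. The sketch assumes a simple "Stream if it exports the function, else Enum" rule and ignores both the multiple-function support and any per-function preferences the real module applies.

```elixir
defmodule FormsSketch do
  # fun when is_function(fun): used directly.
  def resolve(fun) when is_function(fun, 1), do: fun

  # name when is_atom(name): same as {name, []}.
  def resolve(name) when is_atom(name), do: resolve({name, []})

  # {mod, fun_name, args}: the enumerable is prepended to the arguments.
  def resolve({mod, name, args}) when is_atom(mod) and is_atom(name) and is_list(args) do
    fn enum -> apply(mod, name, [enum | args]) end
  end

  # {name, args}: assumed rule, prefer Stream when it exports name/arity.
  def resolve({name, args}) when is_atom(name) do
    args = List.wrap(args)
    arity = length(args) + 1
    mod = if {name, arity} in Stream.__info__(:functions), do: Stream, else: Enum
    fn enum -> apply(mod, name, [enum | args]) end
  end

  # Resolve every discrete transform, then fold them into one function.
  def build(pipeline) when is_list(pipeline) do
    steps = Enum.map(pipeline, &resolve/1)
    fn enum -> Enum.reduce(steps, enum, fn step, acc -> step.(acc) end) end
  end
end

sum_of_squares = FormsSketch.build([{:map, fn {_k, v} -> v * v end}, :sum])
from_mfa = FormsSketch.build([{List, :duplicate, [3]}, {:map, fn v -> v * v end}, :sum])

a = sum_of_squares.([a: 1, b: 2, c: 3])
b = from_mfa.(42)
# a is 14 and b is 5292
```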
Applying a Transform Function
transform/2 is a convenience function taking an enumerable and either a transform function or pipeline of discrete transforms.
If a pipeline is given, the transform function is built
on-the-fly (using build/1), used to transform the enumerable and then discarded. If the transform
is expected to be used many times, it is more efficient to build the
transform function first.
Here the transform function is built on-the-fly
iex> pipeline = [{:map, fn {_k,v} -> v*v end}, :sum]
...> [a: 1, b: 2, c: 3] |> transform(pipeline)
14
Here the transform function is prebuilt and passed to transform/2
iex> fun = [{:map, fn {_k,v} -> v*v end}, :sum] |> build
...> [a: 1, b: 2, c: 3] |> transform(fun)
14
Frequently the result of transform/2 is lazy
iex> fun = [{:map, fn {_k,v} -> v*v end}] |> build
...> result = [a: 1, b: 2, c: 3] |> transform(fun)
...> match?(%Stream{}, result)
true
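What "lazy" means in practice can be observed with the standard library alone. In this sketch (the Agent-based counter is only an illustrative device) the mapping function is not called when the stream is created, only when it is realised:

```elixir
{:ok, counter} = Agent.start_link(fn -> 0 end)

# Building the stream does not call the mapper.
stream =
  Stream.map([1, 2, 3], fn v ->
    Agent.update(counter, &(&1 + 1))
    v * v
  end)

calls_before = Agent.get(counter, & &1)

# Realising the stream calls the mapper once per element.
squares = Enum.to_list(stream)
calls_after = Agent.get(counter, & &1)
# calls_before is 0, squares is [1, 4, 9], calls_after is 3
```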
Plymio.Enum.Transform.Dictionary provides support for easily applying prebuilt transforms.
Realising the Result of a Transformed Function
realise/2 is another convenience function taking an enumerable and either a transform function or pipeline of discrete transforms.
transform/2 is used to apply the transformation, and if the result is a lazy enumerable, it is realised (using Enum.to_list/1).
Here the transform function is prebuilt and passed to realise/2. Note the enumerable is lazy.
iex> fun = [{:map, fn {_k,v} -> v*v end}, :sum] |> build
...> [a: 1, b: 2, c: 3] |> Stream.map(&(&1)) |> realise(fun)
14
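The relationship between build/1, transform/2 and realise/2 described above can be sketched as follows. TransformSketch is a hypothetical module: its pipelines are lists of plain arity 1 functions rather than discrete transforms, and its realise/2 only recognises %Stream{} results, which is a simplification.

```elixir
defmodule TransformSketch do
  def build(steps) when is_list(steps) do
    fn enum -> Enum.reduce(steps, enum, fn step, acc -> step.(acc) end) end
  end

  # A prebuilt function is applied directly.
  def transform(enum, fun) when is_function(fun, 1), do: fun.(enum)

  # A pipeline is built on the fly, applied, then discarded.
  def transform(enum, steps) when is_list(steps), do: transform(enum, build(steps))

  # Realise the result only when it is a lazy %Stream{}.
  def realise(enum, fun_or_steps) do
    case transform(enum, fun_or_steps) do
      %Stream{} = stream -> Enum.to_list(stream)
      other -> other
    end
  end
end

square = fn enum -> Stream.map(enum, fn {_k, v} -> v * v end) end

lazy = TransformSketch.transform([a: 1, b: 2, c: 3], [square])
realised = TransformSketch.realise([a: 1, b: 2, c: 3], [square])
summed = TransformSketch.realise([a: 1, b: 2, c: 3], [square, &Enum.sum/1])
# lazy is a %Stream{}, realised is [1, 4, 9], summed is 14
```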
Defining a Named Transform Function
Although the focus of this module is to create transform functions at run time, it is possible to define a named transform function, using a pipeline of discrete transforms.
The defenumtransform/1 macro is quite simple: it takes the name of the function together with the pipeline as the argument:
defenumtransform named_transform1([{:map, fn {_k,v} -> v*v end}, :sum])
The named function can be used as expected:
iex> [a: 1, b: 2, c: 3] |> Stream.map(&(&1)) |> realise(&named_transform1/1)
14
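To give a feel for how such a macro can work, here is a simplified sketch. NamedSketch, defsteps/1 and MyTransforms are hypothetical names, the pipeline is a list of plain arity 1 functions, and the real defenumtransform/1 may be implemented quite differently.

```elixir
defmodule NamedSketch do
  # The macro receives the AST of a call such as `name([step1, step2])`
  # and defines a public function `name/1` that folds the steps over its argument.
  defmacro defsteps({name, _meta, [steps]}) when is_atom(name) do
    quote do
      def unquote(name)(enum) do
        Enum.reduce(unquote(steps), enum, fn step, acc -> step.(acc) end)
      end
    end
  end
end

defmodule MyTransforms do
  import NamedSketch

  defsteps sum_of_squares([
    fn enum -> Stream.map(enum, fn {_k, v} -> v * v end) end,
    &Enum.sum/1
  ])
end

result = MyTransforms.sum_of_squares([a: 1, b: 2, c: 3])
# result is 14
```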
Notes
each
Stream.each/2 is preferred. Note that it returns the original enumerable, whereas Enum.each/2 returns :ok.
iex> fun = [each: fn {_k,v} -> v*v end] |> build
...> [a: 1, b: 2, c: 3] |> realise(fun)
[a: 1, b: 2, c: 3]
Here the MFA form of a discrete transform is used to explicitly call Enum.each/2:
iex> fun = [{Enum, :each, [fn {_k,v} -> v*v end]}] |> build
...> [a: 1, b: 2, c: 3] |> realise(fun)
:ok
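The difference can be checked with the standard library directly, without going through a transform function (a small sketch; square is just an illustrative callback whose result is discarded by both functions):

```elixir
square = fn {_k, v} -> v * v end

# Stream.each/2 passes the elements through unchanged.
lazy = [a: 1, b: 2, c: 3] |> Stream.each(square) |> Enum.to_list()

# Enum.each/2 runs the callback eagerly and returns :ok.
eager = [a: 1, b: 2, c: 3] |> Enum.each(square)
# lazy is [a: 1, b: 2, c: 3] and eager is :ok
```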
into
Enum.into/2 is preferred over Stream.into/2 as the latter “loses” the type of the collectable when it is realised:
iex> fun = [into: %{}] |> build
...> [a: 1, b: 2, c: 3] |> realise(fun)
%{a: 1, b: 2, c: 3}
Here the MFA form of a discrete transform is used to explicitly call Stream.into/2:
iex> fun = [{Stream, :into, [%{}]}] |> build
...> [a: 1, b: 2, c: 3] |> realise(fun)
[a: 1, b: 2, c: 3]
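Again the behaviour can be seen with plain Stream and Enum calls. In this sketch Enum.into/2 returns the collectable itself, while realising the result of Stream.into/2 with Enum.to_list/1 yields the elements as a list, so the Map type is not visible in the result:

```elixir
# Enum.into/2 returns the filled collectable, here a Map.
as_map = Enum.into([a: 1, b: 2, c: 3], %{})

# Stream.into/2 collects as a side effect; realising it gives the elements back.
as_list = [a: 1, b: 2, c: 3] |> Stream.into(%{}) |> Enum.to_list()
# as_map is %{a: 1, b: 2, c: 3} and as_list is [a: 1, b: 2, c: 3]
```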
Summary
Functions
Builds a transform function when given a discrete transform pipeline
The defenumtransform/1 macro creates a named transform function
from a pipeline of discrete transforms
realise/2 is another convenience function whose arguments
are an enumerable together with a transform_function or
transform_pipeline
transform/2 is a convenience function whose arguments
are an enumerable together with a transform_function or
transform_pipeline
Types
discrete_module_function_name_tuple() :: {discrete_module, discrete_function_name}
Functions
Builds a transform function when given a discrete transform pipeline.
See examples above.
The defenumtransform/1 macro creates a named transform function
from a pipeline of discrete transforms.
Examples
This example shows the definition of a named transform function
called clean_the_data that applies a pipeline of filters, maps
and rejects and finally to_list to realise the result of the
previous transforms.
defenumtransform clean_the_data(
filter: [fn v -> is_number(v) end, fn v -> v > 0 end],
map: [fn v -> v * v end, fn v -> v + 42 end],
reject: [fn v -> v < 45 end, fn v -> v > 50 end],
to_list: nil)
iex> [-1, make_ref(), 1, :atom, 2, "string", 3, &(&1)] |> clean_the_data
[46]
realise/2 is another convenience function whose arguments
are an enumerable together with a transform_function or
transform_pipeline.
transform/2 is used to generate the result but, if the result is a lazy enumerable (e.g. Stream), it is realised recursively.
Examples
Here a transform_pipeline is passed forcing the
transform_function to be built on the fly:
iex> [a: 1, b: 2, c: 3, d: :atom]
...> |> realise(
...> filter: [fn {_k,v} -> is_number(v) end, fn {_k,v} -> v > 0 end],
...> map: [fn {k,v} -> {k, v * v} end, fn {k,v} -> {k, v + 42} end],
...> reject: [fn {_k,v} -> v < 45 end, fn {_k,v} -> v > 50 end])
[b: 46]
transform/2 is a convenience function whose arguments
are an enumerable together with a transform_function or
transform_pipeline.
If a transform_pipeline is given, a transform_function is built
using build/1. (This is not optimal if the same call will be made repeatedly.)
The transform_function (either passed as an argument or built on the fly) is
then applied to the enumerable.
The result is often a lazy enumerable (e.g. Stream), but not always.
Examples
Note, in this example the final discrete transform
group_by produces a Map.
Here a transform_pipeline is passed forcing the transform_function to be built on the fly:
iex> [a: 1, b: 2, c: 3, d: :atom]
...> |> transform(
...> filter: fn {_k,v} -> is_number(v) end,
...> map: fn {k,v} -> {k,v*v} end,
...> group_by: fn {k,_v} -> k |> to_string end)
%{"a" => [a: 1], "b" => [b: 4], "c" => [c: 9]}
In this example, transform/2 is passed a prebuilt transform_function:
iex> fun = [filter: fn {_k,v} -> is_number(v) end,
...> map: fn {k,v} -> {k,v*v} end,
...> group_by: fn {k,_v} -> k |> to_string end]
...> |> build
iex> [a: 1, b: 2, c: 3, d: :atom] |> transform(fun)
%{"a" => [a: 1], "b" => [b: 4], "c" => [c: 9]}