plymio_enum v0.1.0 Plymio.Enum.Transform

Building, Composing and Applying Transform Functions for Enumerables.

A transform function normally takes one argument — usually an enumerable — and applies a pipeline of discrete transforms, returning (again usually) another enumerable.

Each discrete transform is usually the name of a Stream or Enum function (e.g. :map, :filter, :group_by, etc).

A transform function tries to be as lazy as possible, preferring Stream over Enum and, where possible, returning a lazy enumerable.

A macro is provided (defenumtransform/1) to define a named function from a pipeline of discrete transforms.

The companion module Plymio.Enum.Transform.Dictionary supports a map-like dictionary of named transforms. It also supports the composition of higher level transforms from transforms in the dictionary, standalone transforms, and/or new pipelines. Composed transforms can be saved in the dictionary.

Building a Transform Function

build/1 builds a transform function from a pipeline of discrete transforms.

Each discrete transform is (usually) the name of a function supported by Stream and/or Enum (e.g. :filter, :map, :reject, :group_by, etc), together with the arguments taken by the function.

Each discrete transform in the pipeline results in a call to Stream (or Enum when the transform is Enum-only e.g. Enum.group_by/2). The calls to Stream / Enum are then composed into a single function.
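
The Stream-first dispatch described above can be sketched in plain Elixir. This is a minimal illustration, not the library's implementation: the module DispatchSketch and its functions pick/2 and call/3 are hypothetical names, and the rule used (take Stream when it exports the function, otherwise fall back to Enum) is an assumption that ignores any special cases the library may handle.

```elixir
defmodule DispatchSketch do
  # Hypothetical helper, not part of Plymio.Enum.Transform.
  # Picks Stream when it exports name/arity, otherwise falls back to Enum.
  def pick(name, arity) do
    # function_exported?/3 does not load modules, so make sure Stream is loaded.
    Code.ensure_loaded(Stream)
    if function_exported?(Stream, name, arity), do: Stream, else: Enum
  end

  # Calls the chosen module with the enumerable prepended to the other arguments.
  def call(enum, name, args) do
    apply(pick(name, length(args) + 1), name, [enum | args])
  end
end

# :map is supported by Stream, so the result stays lazy.
lazy = DispatchSketch.call([1, 2, 3], :map, [fn v -> v * v end])

# :group_by is Enum-only, so the result is an ordinary Map.
eager = DispatchSketch.call([a: 1, b: 2], :group_by, [fn {k, _v} -> to_string(k) end])
```

Under these assumptions lazy is a %Stream{} that realises to [1, 4, 9], while eager is already the map %{"a" => [a: 1], "b" => [b: 2]}.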

In this example, all the discrete transforms can be lazily applied (i.e. are supported by Stream) so a Stream is returned. (The stream can be realised using Enum.to_list/1):

iex> fun = [filter: fn v -> is_number(v) end,
...>        filter: fn v -> v > 0 end,
...>        map: fn v -> v * v end,
...>        map: fn v -> v + 42 end,
...>        reject: fn v -> v < 45 end,
...>        reject: fn v -> v > 50 end]
...> |> build
...> stream = [-1, make_ref(), 1, :atom, 2, "string", 3, &(&1)] |> fun.()
...> stream |> Enum.to_list
[46]

In this example, the last transformation is Enum.group_by/2 which always returns a Map.

iex> fun = [filter: fn {_k,v} -> is_number(v) end,
...>        map: fn {k,v} -> {k,v*v} end,
...>        group_by: fn {k,_v} -> k |> to_string end]
...> |> build
...> [a: 1, b: 2, c: 3, d: :atom] |> fun.()
%{"a" => [a: 1], "b" => [b: 4], "c" => [c: 9]}

Arguments to each discrete transform must be given in the expected order. This example includes a final Enum.reduce/3 with zero as the initial value of the accumulator.

iex> fun = [filter: fn {_k,v} -> is_number(v) end,
...>        map: fn {k,v} -> {k,v*v} end,
...>        group_by: fn {k,_v} -> k |> to_string end,
...>        reduce: [0, fn {_k,v},s -> (Keyword.values(v) |> Enum.sum) + s end]]
...> |> build
...> [a: 1, b: 2, c: 3, d: :atom] |> fun.()
14
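
For comparison, the pipeline above corresponds roughly to the following hand-written chain using only Stream and Enum. Which module backs each step is an assumption based on the description above (Stream where it is available, Enum otherwise); the point of the sketch is the argument order, with the initial accumulator 0 placed before the reducer function exactly as in Enum.reduce/3.

```elixir
result =
  [a: 1, b: 2, c: 3, d: :atom]
  # Lazy steps: keep numeric values, then square them.
  |> Stream.filter(fn {_k, v} -> is_number(v) end)
  |> Stream.map(fn {k, v} -> {k, v * v} end)
  # Enum-only step: returns a Map of string key => keyword list.
  |> Enum.group_by(fn {k, _v} -> to_string(k) end)
  # Arguments after the enumerable are the accumulator (0), then the reducer.
  |> Enum.reduce(0, fn {_k, v}, s -> Enum.sum(Keyword.values(v)) + s end)
```

Here result is 14, the sum of the squares 1, 4 and 9.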

Composing Prebuilt Transform Functions

Prebuilt transform functions can be composed just by including them in the pipeline of discrete transforms passed to build/1.

In this example a new transform function is composed from 3 separate, prebuilt transform functions and a final subpipeline ([map: fn v -> v - 4 end]) (which is built recursively).

iex> filter_fun = [filter: [fn v -> is_number(v) end, fn v -> v > 0 end]]
...> |> build
...> mapper_fun = [map: [fn v -> v * v end, fn v -> v + 42 end]]
...> |> build
...> reject_fun = [reject: [fn v -> v < 45 end, fn v -> v > 50 end]]
...> |> build
...> fun = [filter_fun, mapper_fun, reject_fun, [map: fn v -> v - 4 end]] |> build
...> stream = [-1, make_ref(), 1, :atom, 2, "string", 3, &(&1)] |> fun.()
...> stream |> Enum.to_list
[42]
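
The effect of this composition can be approximated without the library by folding the enumerable through a list of arity 1 functions. This is only a sketch of the idea: the four functions below are hand-written stand-ins for the prebuilt transforms, and the name composed is hypothetical.

```elixir
# Hand-written stand-ins for the prebuilt transform functions.
filter_fun = fn enum -> Stream.filter(enum, fn v -> is_number(v) and v > 0 end) end
mapper_fun = fn enum -> Stream.map(enum, fn v -> v * v + 42 end) end
reject_fun = fn enum -> Stream.reject(enum, fn v -> v < 45 or v > 50 end) end
final_fun = fn enum -> Stream.map(enum, fn v -> v - 4 end) end

# Compose by threading the enumerable through each function in order.
composed = fn enum ->
  Enum.reduce([filter_fun, mapper_fun, reject_fun, final_fun], enum, fn f, acc ->
    f.(acc)
  end)
end

result =
  [-1, make_ref(), 1, :atom, 2, "string", 3, &(&1)]
  |> composed.()
  |> Enum.to_list()
```

Every stage is a Stream call, so nothing is evaluated until Enum.to_list/1 runs; result is then [42].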

Using Multiple Functions in Discrete Transforms

Usually multiple functions can be used in each discrete transform.

For example the first example above can be rewritten with a list of functions for each discrete transform:

iex> fun = [filter: [fn v -> is_number(v) end, fn v -> v > 0 end],
...>        map: [fn v -> v * v end, fn v -> v + 42 end],
...>        reject: [fn v -> v < 45 end, fn v -> v > 50 end]]
...> |> build
...> stream = [-1, make_ref(), 1, :atom, 2, "string", 3, &(&1)] |> fun.()
...> stream |> Enum.to_list
[46]

Discrete transforms with multiple arguments (e.g. Stream.map_every/3) can also use multiple functions. In this example every other element in the enumerable is mapped. Note the two functions in a list.

Note: Stream.map_every/3 always maps the zeroth element of the enumerable.

iex> fun = [map_every: [2, [fn v -> v * v end, fn v -> v + 42 end]]]
...> |> build
...> stream = [1, 2, 3, 4, 5] |> fun.()
...> stream |> Enum.to_list
[43, 2, 51, 4, 67]
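
A plain Stream version of the same example, as a sketch: the two functions are first combined into one (the name combined is hypothetical, and the combination follows the map-style chaining described in this section), and that single function is then passed as the last argument of Stream.map_every/3.

```elixir
funs = [fn v -> v * v end, fn v -> v + 42 end]

# Chain the functions: the output of one is the input of the next.
combined = fn value -> Enum.reduce(funs, value, fn f, acc -> f.(acc) end) end

result =
  [1, 2, 3, 4, 5]
  |> Stream.map_every(2, combined)
  |> Enum.to_list()
```

Only the elements at positions 0, 2 and 4 are mapped, so result is [43, 2, 51, 4, 67].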

When multiple functions are given, they have to be “combined” according to their purpose (e.g. filter):

Combining Multiple Functions: filter

Multiple filter-type functions AND together the results of applying each one to the value being tested (using Enum.all?/2) e.g.

iex> fun = fn value ->
...>   [fn v -> is_number(v) end, fn v -> v > 0 end]
...>   |> Enum.all?(fn f -> f.(value) end)
...> end
...> [-1, make_ref(), 1, :atom, 2, "string", 3, &(&1)] |> Enum.filter(fun)
[1, 2, 3]

Combining Multiple Functions: reject

Multiple reject-type functions are OR-ed together using Enum.any?/2 e.g.

iex> fun = fn value ->
...>   [fn v -> v < 45 end, fn v -> v > 50 end]
...>   |> Enum.any?(fn f -> f.(value) end)
...> end
...> [43, 46, 51] |> Enum.reject(fun)
[46]

Combining Multiple Functions: map

Multiple map-type functions are combined using Enum.reduce/3 e.g.

iex> fun = fn value ->
...>    [fn v -> v * v end, fn v -> v + 42 end]
...>   |> Enum.reduce(value, fn f,v -> f.(v) end)
...> end
...> [1, 2, 3] |> Enum.map(fun)
[43, 46, 51]

Combining Multiple Functions: reduce

reduce functions are normally arity 2, taking the current value from the enumerable together with the accumulator.

This constraint is relaxed when multiple functions are used: each function can be arity 1 or 2. An arity 1 function is passed just the result of the previous function, with no accumulator (just like a map). The code to combine multiple functions looks something like this.

Note for each value of the enumerable, each function is passed the same accumulator.

iex> fun1 = fn v, s -> v + s end
...> fun2 = fn v -> v - 42 end
...> fun3 = fn v, s -> v * s end
...> fun = fn value, acc ->
...>   [fun1, fun2, fun3]
...>   |> Enum.reduce(value, fn
...>        f,v when is_function(f, 2) -> f.(v,acc)
...>        f,v when is_function(f, 1) -> f.(v)
...>   end)
...> end
...> [1, 2, 3] |> Enum.reduce(7, fun)
4375094500
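
To see how the accumulator evolves, the same combined function can be run through Enum.scan/3, which keeps each intermediate accumulator instead of only the last one. This is an illustrative trace in plain Elixir that repeats the definitions above; it is not something the library does.

```elixir
fun1 = fn v, s -> v + s end
fun2 = fn v -> v - 42 end
fun3 = fn v, s -> v * s end

# Same combination as above: arity 2 functions also receive the accumulator.
fun = fn value, acc ->
  Enum.reduce([fun1, fun2, fun3], value, fn
    f, v when is_function(f, 2) -> f.(v, acc)
    f, v when is_function(f, 1) -> f.(v)
  end)
end

# Each entry is the accumulator after one element of the enumerable.
trace = Enum.scan([1, 2, 3], 7, fun)
```

For the first element, ((1 + 7) - 42) * 7 gives -238; that value is the accumulator seen by both fun1 and fun3 for the second element, and so on. The trace is [-238, 66164, 4375094500], whose last entry matches the Enum.reduce/3 result.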

Discrete Transform Forms

In the examples above the pipeline of discrete transforms was a Keyword where the keys were Stream and/or Enum functions, and the values their additional arguments (without the enumerable).

More generally the definition of each discrete transformation can have a number of forms.

It's worth stressing that the discrete transform pipeline is always a List but not always a Keyword.

Discrete Transform Forms: {name,args} when is_atom(name)

This is the form used so far. The name (an Atom) must be a function of Stream or Enum.

iex> fun = [filter: fn {_k,v} -> is_number(v) end]
...> |> build
...> [a: 1, b: 2, c: 3, d: :atom] |> fun.() |> Enum.to_list
[a: 1, b: 2, c: 3]

When the discrete transform takes no arguments other than the enumerable, the args in the 2-tuple can be nil or an empty list.

iex> fun = [count: nil]
...> |> build
...> [a: 1, b: 2, c: 3, d: :atom] |> fun.()
4

Discrete Transform Forms: name when is_atom(name)

When name is an Atom, it must be a function of Stream or Enum that only takes an enumerable; no other arguments.

iex> fun = [:count]
...> |> build
...> [a: 1, b: 2, c: 3, d: :atom] |> fun.()
4

Using this form means the other discrete transforms must be written explicitly (e.g. as {name,args} tuples), otherwise the Elixir compiler will complain since the pipeline is no longer a Keyword:

iex> fun = [{:map, fn {_k,v} -> v*v end}, :sum]
...> |> build
...> [a: 1, b: 2, c: 3] |> fun.()
14

Discrete Transform Forms: {mod,fun_name,args}

The general purpose MFA (module,function,arguments) form used with Kernel.apply/3 is supported. The enumerable is prepended to the arguments ([enum | arguments]).

This example uses the MFA form of [map: &(&1)]

iex> fun = [{Stream, :map, [&(&1)]}] |> build
iex> [a: 1, b: 2, c: 3, d: :atom] |> fun.() |> Enum.to_list
[a: 1, b: 2, c: 3, d: :atom]

However, an MFA can call any module and function, not just Stream or Enum ones. In this example List.duplicate/2 creates the enumerable that feeds the map, which squares each value; a final :sum adds up all the values.

iex> fun = [{List, :duplicate, [3]}, {:map, fn v -> v*v end}, :sum]
...> |> build
...> 42 |> fun.()
5292
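
The MFA step can be pictured as a call to Kernel.apply/3 with the incoming value prepended to the argument list. The sketch below reproduces the example by hand; the name step is hypothetical, and the remaining steps are written as direct Stream / Enum calls rather than through build/1.

```elixir
{mod, fun_name, args} = {List, :duplicate, [3]}

# The value flowing through the pipeline becomes the first argument.
step = fn value -> apply(mod, fun_name, [value | args]) end

# List.duplicate(42, 3)
list = step.(42)

total =
  list
  |> Stream.map(fn v -> v * v end)
  |> Enum.sum()
```

Here list is [42, 42, 42] and total is 3 * 1764 = 5292.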

Here is another example combining Stream / Enum 2tuples with an MFA. Note though, the result of the filter, map and reject discrete transforms will be a Stream. List functions require a list as input, hence the :to_list in the transform pipeline just before the insert_at.

Since the pipeline definition is no longer a Keyword, it must use the explicit 2tuple syntax.

iex> fun = [{:filter, [fn {_k,v} -> is_number(v) end, fn {_k,v} -> v > 0 end]},
...>        {:map, [fn {k,v} -> {k, v * v} end, fn {k,v} -> {k, v + 42} end]},
...>        {:reject, [fn {_k,v} -> v < 45 end, fn {_k,v} -> v > 50 end]},
...>        :to_list,
...>        {List, :insert_at, [2, {:e, "five"}]}]
...> |> build
...> [a: 1, b: 2, c: 3, d: :atom] |> fun.() |> Enum.to_list
[b: 46, e: "five"]

Discrete Transform Forms: fun when is_function(fun)

The transform can also be a function and is passed the result of the previous transforms:

iex> fun = [{:filter, [fn {_k,v} -> is_number(v) end, fn {_k,v} -> v > 0 end]},
...>        {:map, [fn {k,v} -> {k, v * v} end, fn {k,v} -> {k, v + 42} end]},
...>        # a transform function
...>        fn enum -> enum |> Stream.map(fn {k,v} -> {k |> to_string, v} end) end,
...>        {:into, %{}}]
...> |> build
...> [a: 1, b: 2, c: 3, d: :atom] |> fun.()
%{"a" => 43, "b" => 46, "c" => 51}

Applying a Transform Function

transform/2 is a convenience function taking an enumerable and either a transform function or pipeline of discrete transforms.

If a pipeline is given, the transform function is built on-the-fly (using build/1), used to transform the enumerable and then discarded. If the transform is expected to be used many times, it is more efficient to build the transform function first.

Here the transform function is built on-the-fly

iex> pipeline = [{:map, fn {_k,v} -> v*v end}, :sum]
...> [a: 1, b: 2, c: 3] |> transform(pipeline)
14

Here the transform function is prebuilt and passed to transform/2

iex> fun = [{:map, fn {_k,v} -> v*v end}, :sum] |> build
...> [a: 1, b: 2, c: 3] |> transform(fun)
14

Frequently the result of transform/2 is lazy

iex> fun = [{:map, fn {_k,v} -> v*v end}] |> build
...> result = [a: 1, b: 2, c: 3] |> transform(fun)
...> match?(%Stream{}, result)
true
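
What lazy means in practice can be checked with a call counter. The sketch below uses only Stream, Enum and the Erlang :counters module (no library functions): the mapping function is not invoked when the stream is created, only when it is realised.

```elixir
# A one-slot counter that records how often the mapper runs.
calls = :counters.new(1, [])

stream =
  Stream.map([1, 2, 3], fn v ->
    :counters.add(calls, 1, 1)
    v * v
  end)

# Nothing has been computed yet.
calls_before = :counters.get(calls, 1)

# Realising the stream runs the mapper once per element.
list = Enum.to_list(stream)
calls_after = :counters.get(calls, 1)
```

calls_before is 0, list is [1, 4, 9] and calls_after is 3.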

Plymio.Enum.Transform.Dictionary provides support for easily applying prebuilt transforms.

Realising the Result of a Transform Function

realise/2 is another convenience function taking an enumerable and either a transform function or pipeline of discrete transforms.

transform/2 is used to apply the transformation, and if the result is a lazy enumerable, it is realised (using Enum.to_list/1).

Here the transform function is prebuilt and passed to realise/2. Note the enumerable is lazy.

iex> fun = [{:map, fn {_k,v} -> v*v end}, :sum] |> build
...> [a: 1, b: 2, c: 3] |> Stream.map(&(&1)) |> realise(fun)
14
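
The "realise only if lazy" step might look something like the sketch below. The module RealiseSketch and the function realise_result/1 are hypothetical, and the laziness test (a %Stream{} struct or an arity 2 function, which is how some Stream functions such as Stream.unfold/2 represent their result) is an assumption; the real realise/2 may detect lazy enumerables differently.

```elixir
defmodule RealiseSketch do
  # Hypothetical helper, not part of Plymio.Enum.Transform.
  # Lazy results are converted to a list.
  def realise_result(%Stream{} = stream), do: Enum.to_list(stream)
  def realise_result(fun) when is_function(fun, 2), do: Enum.to_list(fun)
  # Anything else (number, list, map, ...) is returned untouched.
  def realise_result(other), do: other
end

from_stream = RealiseSketch.realise_result(Stream.map([1, 2], fn v -> v * 2 end))

countdown =
  Stream.unfold(3, fn
    0 -> nil
    n -> {n, n - 1}
  end)

from_fun = RealiseSketch.realise_result(countdown)
from_value = RealiseSketch.realise_result(14)
```

With these definitions from_stream is [2, 4], from_fun is [3, 2, 1] and from_value is passed through as 14.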

Defining a Named Transform Function

Although the focus of this module is to create transform functions at run time, it is possible to define a named transform function, using a pipeline of discrete transforms.

The defenumtransform/1 macro is quite simple: it takes the name of the function together with the pipeline as the argument:

defenumtransform named_transform1([{:map, fn {_k,v} -> v*v end}, :sum])

The named function can be used as expected:

iex> [a: 1, b: 2, c: 3] |> Stream.map(&(&1)) |> realise(&named_transform1/1)
14
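
Conceptually, the definition above behaves like a hand-written function of arity 1. The module below is a sketch of that equivalent, not the code the macro actually generates; NamedSketch is a hypothetical module name.

```elixir
defmodule NamedSketch do
  # Hand-written counterpart of the pipeline [{:map, ...}, :sum].
  def named_transform1(enum) do
    enum
    |> Stream.map(fn {_k, v} -> v * v end)
    |> Enum.sum()
  end
end

result =
  [a: 1, b: 2, c: 3]
  |> Stream.map(&(&1))
  |> NamedSketch.named_transform1()
```

As with the macro-defined version, result is 14.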

Notes

each

Stream.each/2 is preferred but it returns the original enumerable whereas Enum.each/2 returns :ok.

iex> fun = [each: fn {_k,v} -> v*v end] |> build
...> [a: 1, b: 2, c: 3] |> realise(fun)
[a: 1, b: 2, c: 3]

Here the MFA form of a discrete transform is used to explicitly call Enum.each/2:

iex> fun = [{Enum, :each, [fn {_k,v} -> v*v end]}] |> build
...> [a: 1, b: 2, c: 3] |> realise(fun)
:ok

into

Enum.into/2 is preferred over Stream.into/2 as the latter “loses” the type of the collectable when it is realised:

iex> fun = [into: %{}] |> build
...> [a: 1, b: 2, c: 3] |> realise(fun)
%{a: 1, b: 2, c: 3}

Here the MFA form of a discrete transform is used to explicitly call Stream.into/2:

iex> fun = [{Stream, :into, [%{}]}] |> build
...> [a: 1, b: 2, c: 3] |> realise(fun)
[a: 1, b: 2, c: 3]

Summary

Functions

Builds a transform function when given a discrete transform pipeline

The defenumtransform/1 macro creates a named transform function from a pipeline of discrete transforms

realise/2 is another convenience function whose arguments are an enumerable together with a transform_function or transform_pipeline

transform/2 is a convenience function whose arguments are an enumerable together with a transform_function or transform_pipeline

Types

discrete_args()
discrete_args() :: nil | any | [any]
discrete_function()
discrete_function() :: (any -> any)
discrete_function_name()
discrete_function_name() :: atom
discrete_module()
discrete_module() :: atom
discrete_module_function_name_tuple()
discrete_module_function_name_tuple() :: {discrete_module, discrete_function_name}
enum()
enum() :: Enumerable.t
transform_function()
transform_function() :: nil | (any -> any)
transform_pipeline()
transform_pipeline() :: [discrete_transform]

Functions

Builds a transform function when given a discrete transform pipeline.

See examples above.

defenumtransform(args) (macro)
defenumtransform(term, transform_pipeline) :: Macro.t

The defenumtransform/1 macro creates a named transform function from a pipeline of discrete transforms.

Examples

This example shows the definition of a named transform function called clean_the_data that applies a pipeline of filters, maps and rejects and finally to_list to realise the result of the previous transforms.

defenumtransform clean_the_data(
  filter: [fn v -> is_number(v) end, fn v -> v > 0 end],
  map: [fn v -> v * v end, fn v -> v + 42 end],
  reject: [fn v -> v < 45 end, fn v -> v > 50 end],
  to_list: nil)

iex> [-1, make_ref(), 1, :atom, 2, "string", 3, &(&1)] |> clean_the_data
[46]

realise(enum, opts \\ [])

realise/2 is another convenience function whose arguments are an enumerable together with a transform_function or transform_pipeline.

transform/2 is used to generate the result but, if the result is a lazy enumerable (e.g. Stream), it is realised recursively.

Examples

Here a transform_pipeline is passed forcing the transform_function to be built on the fly:

iex> [a: 1, b: 2, c: 3, d: :atom]
...> |> realise(
...>       filter: [fn {_k,v} -> is_number(v) end, fn {_k,v} -> v > 0 end],
...>       map: [fn {k,v} -> {k, v * v} end, fn {k,v} -> {k, v + 42} end],
...>       reject: [fn {_k,v} -> v < 45 end, fn {_k,v} -> v > 50 end])
[b: 46]

transform(enum, opts \\ [])

transform/2 is a convenience function whose arguments are an enumerable together with a transform_function or transform_pipeline.

If a transform_pipeline is given, a transform_function is built using build/1. (This is not optimal if the same call will be made repeatedly.)

The transform_function (either passed as an argument or built on the fly) is then applied to the enumerable.

The result is often a lazy enumerable (e.g. Stream), but not always.

Examples

Note, in this example the final discrete transform group_by produces a Map.

Here a transform_pipeline is passed forcing the transform_function to be built on the fly:

iex> [a: 1, b: 2, c: 3, d: :atom]
...> |> transform(
...>      filter: fn {_k,v} -> is_number(v) end,
...>      map: fn {k,v} -> {k,v*v} end,
...>      group_by: fn {k,_v} -> k |> to_string end)
%{"a" => [a: 1], "b" => [b: 4], "c" => [c: 9]}

In this example, transform/2 is passed a prebuilt transform_function:

iex> fun = [filter: fn {_k,v} -> is_number(v) end,
...>        map: fn {k,v} -> {k,v*v} end,
...>        group_by: fn {k,_v} -> k |> to_string end]
...> |> build
iex> [a: 1, b: 2, c: 3, d: :atom] |> transform(fun)
%{"a" => [a: 1], "b" => [b: 4], "c" => [c: 9]}