Dataset v0.4.0
Datasets represent labeled tabular data.
Datasets are enumerable:
iex> Dataset.new([{:a, :b, :c},
...>              {:A, :B, :C},
...>              {:i, :ii, :iii},
...>              {:I, :II, :III}],
...>             {"one", "two", "three"})
...> |> Enum.map(&elem(&1, 2))
[:c, :C, :iii, :III]
Datasets are also collectable:
iex> for x <- 0..10, into: Dataset.empty({:n}), do: x
%Dataset{labels: {:n}, rows: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}
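How a struct becomes enumerable and collectable can be illustrated with a minimal sketch. MiniDataset below is a hypothetical stand-in, not the library's Dataset module; it only shows one way a struct holding labels and rows could implement the Enumerable and Collectable protocols so that Enum functions and for ... into: behave as in the examples above.

```elixir
defmodule MiniDataset do
  # Hypothetical stand-in for illustration; not the library's Dataset struct.
  defstruct labels: {}, rows: []
end

defimpl Enumerable, for: MiniDataset do
  # Enumerating a dataset walks its rows in order.
  def count(%MiniDataset{rows: rows}), do: {:ok, length(rows)}
  def member?(%MiniDataset{rows: rows}, row), do: {:ok, row in rows}
  # Returning {:error, module} makes Enum fall back to reduce/3 for slicing.
  def slice(_ds), do: {:error, __MODULE__}
  def reduce(%MiniDataset{rows: rows}, acc, fun), do: Enumerable.List.reduce(rows, acc, fun)
end

defimpl Collectable, for: MiniDataset do
  # Collected items are appended as rows; labels are left unchanged.
  def into(%MiniDataset{rows: rows} = ds) do
    collector = fn
      acc, {:cont, row} -> [row | acc]
      acc, :done -> %{ds | rows: rows ++ Enum.reverse(acc)}
      _acc, :halt -> :ok
    end

    {[], collector}
  end
end

ds = %MiniDataset{labels: {:n, :n_squared}, rows: [{0, 0}]}
more = for x <- 1..3, into: ds, do: {x, x * x}
IO.inspect(Enum.map(more, &elem(&1, 1)))
# prints [0, 1, 4, 9]
```

Delegating reduce/3 to the list implementation keeps the sketch short; the real module may be implemented differently.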
Summary
Functions
empty(labels \\ nil)
Return a dataset with no rows and with labels given by the tuple
passed as labels. If labels is not specified, return an empty
dataset with zero columns.
inner_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing an inner join on datasets ds1 and
ds2, using k1 and k2 as the key labels of ds1 and ds2
respectively. The returned dataset contains one column for each entry
in out_labels, which is a keyword list of the form
[left_or_right: label, ...].
left_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing a left join on datasets ds1 and
ds2, using k1 and k2 as the key labels of ds1 and ds2
respectively. The returned dataset contains one column for each entry
in out_labels, which is a keyword list of the form
[left_or_right: label, ...].
new(rows \\ [], labels \\ nil)
Construct a new dataset. A dataset is a list of tuples. With no
arguments, an empty dataset with zero columns is constructed. With
one argument, the passed object is interpreted as rows, and integer
labels beginning with 0 are generated, one for each element of the
first tuple in the data.
outer_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing an outer join on datasets ds1 and
ds2, using k1 and k2 as the key labels of ds1 and ds2
respectively. The returned dataset contains one column for each entry
in out_labels, which is a keyword list of the form
[left_or_right: label, ...].
right_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing a right join on datasets ds1 and
ds2, using k1 and k2 as the key labels of ds1 and ds2
respectively. The returned dataset contains one column for each entry
in out_labels, which is a keyword list of the form
[left_or_right: label, ...].
rotate(dataset)
Return a dataset with the value in row i and column j moved to row j and column i. The result is labelled with integer indices beginning with zero.
select(ds, out_labels)
Return a new dataset with columns chosen from the input dataset ds.
to_map_list(ds)
Return the contents of ds as a list of maps.
Functions
empty(labels \\ nil)
Return a dataset with no rows and with labels given by the tuple
passed as labels. If labels is not specified, return an empty
dataset with zero columns.
inner_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing an inner join on datasets ds1 and
ds2, using k1 and k2 as the key labels of ds1 and ds2
respectively. When k2 is omitted, as in the example below, k1
appears to serve as the key label on both datasets. The returned
dataset contains one column for each entry in out_labels, which is a
keyword list of the form [left_or_right: label, ...]: each key is
left (take the column from ds1) or right (take it from ds2).
iex> iso_countries =
...>   Dataset.new(
...>     [
...>       {"us", "United States"},
...>       {"uk", "United Kingdom"},
...>       {"ca", "Canada"},
...>       {"de", "Germany"},
...>       {"nl", "Netherlands"},
...>       {"sg", "Singapore"}
...>     ],
...>     {:iso_country, :country_name}
...>   )
...>
...> country_clicks =
...>   Dataset.new(
...>     [
...>       {"United States", "13"},
...>       {"United Kingdom", "11"},
...>       {"Canada", "4"},
...>       {"Germany", "4"},
...>       {"France", "2"}
...>     ],
...>     {:country_name, :clicks}
...>   )
...>
...> Dataset.inner_join(country_clicks, iso_countries, :country_name,
...>   right: :iso_country,
...>   left: :clicks
...> )
%Dataset{
  labels: {:iso_country, :clicks},
  rows: [{"ca", "4"}, {"de", "4"}, {"uk", "11"}, {"us", "13"}]
}
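The matching step of an inner join can be sketched on plain lists of tuples. This is an illustration under stated assumptions, not the library's implementation: JoinSketch and inner/4 are hypothetical names, keys are given as zero-based tuple positions rather than labels, and the result is a list of {left_row, right_row} pairs that the caller projects afterwards.

```elixir
defmodule JoinSketch do
  # Hypothetical helper, not part of Dataset: pairs every left row with every
  # right row that has the same key. Keys are zero-based tuple positions.
  def inner(left, right, left_key, right_key) do
    # Index the right rows by key so each left row needs a single lookup.
    index = Enum.group_by(right, &elem(&1, right_key))

    for l <- left, r <- Map.get(index, elem(l, left_key), []), do: {l, r}
  end
end

clicks = [{"Canada", "4"}, {"France", "2"}]
countries = [{"ca", "Canada"}, {"nl", "Netherlands"}]

# Project the columns of interest, similar to right: :iso_country, left: :clicks.
projected =
  JoinSketch.inner(clicks, countries, 0, 1)
  |> Enum.map(fn {{_name, n}, {iso, _name2}} -> {iso, n} end)

IO.inspect(projected)
# prints [{"ca", "4"}]
```

Rows whose key appears on only one side ("France" and "Netherlands" here) are dropped, which is what distinguishes the inner join from the other three joins.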
left_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing a left join on datasets ds1 and
ds2, using k1 and k2 as the key labels of ds1 and ds2
respectively. When k2 is omitted, as in the example below, k1
appears to serve as the key label on both datasets. The returned
dataset contains one column for each entry in out_labels, which is a
keyword list of the form [left_or_right: label, ...]: each key is
left (take the column from ds1) or right (take it from ds2).
iex> iso_countries =
...>   Dataset.new(
...>     [
...>       {"us", "United States"},
...>       {"uk", "United Kingdom"},
...>       {"ca", "Canada"},
...>       {"de", "Germany"},
...>       {"nl", "Netherlands"},
...>       {"sg", "Singapore"}
...>     ],
...>     {:iso_country, :country_name}
...>   )
...>
...> country_clicks =
...>   Dataset.new(
...>     [
...>       {"United States", "13"},
...>       {"United Kingdom", "11"},
...>       {"Canada", "4"},
...>       {"Germany", "4"},
...>       {"France", "2"}
...>     ],
...>     {:country_name, :clicks}
...>   )
...>
...> Dataset.left_join(country_clicks, iso_countries, :country_name,
...>   right: :iso_country,
...>   left: :clicks
...> )
%Dataset{
  labels: {:iso_country, :clicks},
  rows: [{"ca", "4"}, {nil, "2"}, {"de", "4"}, {"uk", "11"}, {"us", "13"}]
}
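The {nil, "2"} row above comes from a left row ("France") with no match on the right. A minimal sketch of that behaviour on plain lists of tuples follows; LeftJoinSketch and left/4 are hypothetical names, keys are zero-based tuple positions, and none of this is taken from the library's source.

```elixir
defmodule LeftJoinSketch do
  # Hypothetical helper, not part of Dataset. Every left row is kept; a left
  # row with no match on the right is paired with nil.
  def left(left, right, left_key, right_key) do
    index = Enum.group_by(right, &elem(&1, right_key))

    Enum.flat_map(left, fn l ->
      case Map.get(index, elem(l, left_key)) do
        nil -> [{l, nil}]
        matches -> Enum.map(matches, &{l, &1})
      end
    end)
  end
end

clicks = [{"Canada", "4"}, {"France", "2"}]
countries = [{"ca", "Canada"}, {"nl", "Netherlands"}]

IO.inspect(LeftJoinSketch.left(clicks, countries, 0, 1))
# prints [{{"Canada", "4"}, {"ca", "Canada"}}, {{"France", "2"}, nil}]
```

Projecting a right-hand column from a pair whose right side is nil is what yields a nil cell in the joined output.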
new(rows \\ [], labels \\ nil)
Construct a new dataset. A dataset is a list of tuples. With no
arguments, an empty dataset with zero columns is constructed. With
one argument, the passed object is interpreted as rows, and integer
labels beginning with 0 are generated, one for each element of the
first tuple in the data. With two arguments, the second is used as
the tuple of labels.
iex> Dataset.new()
%Dataset{rows: [], labels: {}}
iex> Dataset.new([{:foo, :bar}, {:eggs, :ham}])
%Dataset{rows: [foo: :bar, eggs: :ham], labels: {0, 1}}
iex> Dataset.new([{0, 0}, {1, 1}, {2, 4}, {3, 9}],
...>             {:x, :x_squared})
%Dataset{labels: {:x, :x_squared}, rows: [{0, 0}, {1, 1}, {2, 4}, {3, 9}]}
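The generated labels in the second example ({0, 1}) follow from the size of the first row. A minimal sketch of that derivation, assuming nothing about the library's internals (LabelSketch and default_labels/1 are hypothetical names):

```elixir
defmodule LabelSketch do
  # Hypothetical helper, not part of Dataset: builds the tuple {0, 1, ..., n-1}
  # where n is the size of the first row. An empty row list yields {}.
  def default_labels([]), do: {}

  def default_labels([first | _rest]) do
    first
    |> Tuple.to_list()
    |> Enum.with_index()
    |> Enum.map(fn {_value, index} -> index end)
    |> List.to_tuple()
  end
end

IO.inspect(LabelSketch.default_labels([{:foo, :bar}, {:eggs, :ham}]))
# prints {0, 1}
```

Only the first row is consulted, so rows of differing sizes would not be noticed by this sketch.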
outer_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing an outer join on datasets ds1 and
ds2, using k1 and k2 as the key labels of ds1 and ds2
respectively. When k2 is omitted, as in the example below, k1
appears to serve as the key label on both datasets. The returned
dataset contains one column for each entry in out_labels, which is a
keyword list of the form [left_or_right: label, ...]: each key is
left (take the column from ds1) or right (take it from ds2).
iex> iso_countries =
...>   Dataset.new(
...>     [
...>       {"us", "United States"},
...>       {"uk", "United Kingdom"},
...>       {"ca", "Canada"},
...>       {"de", "Germany"},
...>       {"nl", "Netherlands"},
...>       {"sg", "Singapore"}
...>     ],
...>     {:iso_country, :country_name}
...>   )
...>
...> country_clicks =
...>   Dataset.new(
...>     [
...>       {"United States", "13"},
...>       {"United Kingdom", "11"},
...>       {"Canada", "4"},
...>       {"Germany", "4"},
...>       {"France", "2"}
...>     ],
...>     {:country_name, :clicks}
...>   )
...>
...> Dataset.outer_join(country_clicks, iso_countries, :country_name,
...>   right: :iso_country,
...>   left: :clicks
...> )
%Dataset{
  labels: {:iso_country, :clicks},
  rows: [
    {"ca", "4"},
    {nil, "2"},
    {"de", "4"},
    {"nl", nil},
    {"sg", nil},
    {"uk", "11"},
    {"us", "13"}
  ]
}
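An outer join keeps every key found on either side. The sketch below illustrates this on plain lists of tuples; OuterJoinSketch and outer/4 are hypothetical names, keys are zero-based tuple positions, and sorting the keys is an assumption made only because the rows in the example above appear ordered by country name.

```elixir
defmodule OuterJoinSketch do
  # Hypothetical helper, not part of Dataset. Returns {left_row, right_row}
  # pairs for every key found on either side, using nil for the missing side.
  def outer(left, right, left_key, right_key) do
    l_index = Enum.group_by(left, &elem(&1, left_key))
    r_index = Enum.group_by(right, &elem(&1, right_key))

    # Sorting the combined keys only makes the output order deterministic.
    keys = Enum.sort(Enum.uniq(Map.keys(l_index) ++ Map.keys(r_index)))

    for key <- keys,
        l <- Map.get(l_index, key, [nil]),
        r <- Map.get(r_index, key, [nil]),
        do: {l, r}
  end
end

clicks = [{"Canada", "4"}, {"France", "2"}]
countries = [{"ca", "Canada"}, {"nl", "Netherlands"}]

IO.inspect(OuterJoinSketch.outer(clicks, countries, 0, 1))
# prints
# [
#   {{"Canada", "4"}, {"ca", "Canada"}},
#   {{"France", "2"}, nil},
#   {nil, {"nl", "Netherlands"}}
# ]
```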
right_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing a right join on datasets ds1 and
ds2, using k1 and k2 as the key labels of ds1 and ds2
respectively. When k2 is omitted, as in the example below, k1
appears to serve as the key label on both datasets. The returned
dataset contains one column for each entry in out_labels, which is a
keyword list of the form [left_or_right: label, ...]: each key is
left (take the column from ds1) or right (take it from ds2).
iex> iso_countries =
...>   Dataset.new(
...>     [
...>       {"us", "United States"},
...>       {"uk", "United Kingdom"},
...>       {"ca", "Canada"},
...>       {"de", "Germany"},
...>       {"nl", "Netherlands"},
...>       {"sg", "Singapore"}
...>     ],
...>     {:iso_country, :country_name}
...>   )
...>
...> country_clicks =
...>   Dataset.new(
...>     [
...>       {"United States", "13"},
...>       {"United Kingdom", "11"},
...>       {"Canada", "4"},
...>       {"Germany", "4"},
...>       {"France", "2"}
...>     ],
...>     {:country_name, :clicks}
...>   )
...>
...> Dataset.right_join(country_clicks, iso_countries, :country_name,
...>   right: :iso_country,
...>   left: :clicks
...> )
%Dataset{
  labels: {:iso_country, :clicks},
  rows: [
    {"ca", "4"},
    {"de", "4"},
    {"nl", nil},
    {"sg", nil},
    {"uk", "11"},
    {"us", "13"}
  ]
}
rotate(dataset)
Return a dataset with the value in row i and column j moved to row j and column i, that is, the transpose of the input. The result is labelled with integer indices beginning with zero.
iex> Dataset.new([{:a, :b, :c},
...>              {:A, :B, :C},
...>              {:i, :ii, :iii},
...>              {:I, :II, :III}])
...> |> Dataset.rotate()
%Dataset{
  labels: {0, 1, 2, 3},
  rows: [{:a, :A, :i, :I},
         {:b, :B, :ii, :II},
         {:c, :C, :iii, :III}]
}
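The transposition itself can be reproduced on a bare list of tuples with Enum.zip/1. This is only a sketch of the idea, not the library's implementation, and it leaves label generation aside.

```elixir
rows = [{:a, :b, :c}, {:A, :B, :C}]

# Convert each row to a list, then zip the lists: the n-th elements of all
# rows are gathered into the n-th output tuple.
rotated =
  rows
  |> Enum.map(&Tuple.to_list/1)
  |> Enum.zip()

IO.inspect(rotated)
# prints [{:a, :A}, {:b, :B}, {:c, :C}]
```

Two input rows of three columns become three rows of two columns, matching the shape change in the example above.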
select(ds, out_labels)
Return a new dataset with columns chosen from the input dataset ds. The columns to keep are named by the list out_labels.
iex> Dataset.new([{:a, :b, :c},
...>              {:A, :B, :C},
...>              {:i, :ii, :iii},
...>              {:I, :II, :III}],
...>             {"first", "second", "third"})
...> |> Dataset.select(["second"])
%Dataset{rows: [{:b}, {:B}, {:ii}, {:II}], labels: {"second"}}
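Column selection amounts to translating labels into tuple positions and picking those elements from every row. A minimal sketch on plain data, with hypothetical names (SelectSketch, select/3) and no claim about how the library does it:

```elixir
defmodule SelectSketch do
  # Hypothetical helper, not part of Dataset: keeps only the columns named in
  # out_labels, in the order requested.
  def select(rows, labels, out_labels) do
    label_list = Tuple.to_list(labels)

    # Translate each requested label to its zero-based column position.
    # A label that is not present yields nil, and elem/2 below then raises.
    positions =
      Enum.map(out_labels, fn label ->
        Enum.find_index(label_list, &(&1 == label))
      end)

    Enum.map(rows, fn row ->
      positions |> Enum.map(&elem(row, &1)) |> List.to_tuple()
    end)
  end
end

rows = [{:a, :b, :c}, {:A, :B, :C}]
labels = {"first", "second", "third"}

IO.inspect(SelectSketch.select(rows, labels, ["third", "first"]))
# prints [{:c, :a}, {:C, :A}]
```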
to_map_list(ds)
Return the contents of ds as a list of maps.
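The documentation gives no example for this function. The sketch below shows one plausible reading on plain data, assuming each row becomes a map keyed by column label; that key format is an assumption, not something the source states.

```elixir
labels = {:x, :x_squared}
rows = [{0, 0}, {1, 1}, {2, 4}]

# Pair each label with the value in the same position, one map per row.
maps =
  Enum.map(rows, fn row ->
    labels
    |> Tuple.to_list()
    |> Enum.zip(Tuple.to_list(row))
    |> Map.new()
  end)

IO.inspect(maps)
# prints [%{x: 0, x_squared: 0}, %{x: 1, x_squared: 1}, %{x: 2, x_squared: 4}]
```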