Dataset v0.5.0
Datasets represent labeled tabular data.
Datasets are enumerable:
iex> Dataset.new([{:a, :b, :c},
...> {:A, :B, :C},
...> {:i, :ii, :iii},
...> {:I, :II, :III}],
...> {"one", "two", "three"})
...> |> Enum.map(&elem(&1, 2))
[:c, :C, :iii, :III]
Datasets are also collectable:
iex> for x <- 0..10, into: Dataset.empty({:n}), do: x
%Dataset{labels: {:n}, rows: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}
Functions
columns(ds, column_labels)
Return a tuple of lists containing columnar data from ds, one list
for each passed element of the column_labels list. Lists are
returned in the tuple in the same order in which they appear in
column_labels. Labels may appear more than once.
iex> iso_countries = %Dataset{
...> labels: {:iso_country, :country_name},
...> rows: [
...> {"us", "United States"},
...> {"uk", "United Kingdom"},
...> {"ca", "Canada"},
...> {"de", "Germany"},
...> {"nl", "Netherlands"},
...> {"sg", "Singapore"}
...> ]
...> }
...> Dataset.columns(iso_countries, [:iso_country, :iso_country])
{["us", "uk", "ca", "de", "nl", "sg"],
["us", "uk", "ca", "de", "nl", "sg"]}
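The lookup that columns describes can be sketched in plain Elixir on a labels tuple and a list of row tuples. This is only an illustration under stated assumptions: ColumnsSketch and its three-argument shape are hypothetical and are not part of the Dataset API, and the sketch raises if a requested label is missing.

```elixir
defmodule ColumnsSketch do
  # Hypothetical helper, not the library's code: return one list per
  # requested label, in the order the labels were requested.
  def columns(labels, rows, column_labels) do
    label_list = Tuple.to_list(labels)

    column_labels
    |> Enum.map(fn label ->
      # Position of the label within the labels tuple.
      index = Enum.find_index(label_list, &(&1 == label))
      # Take that position from every row.
      Enum.map(rows, &elem(&1, index))
    end)
    |> List.to_tuple()
  end
end

labels = {:iso_country, :country_name}
rows = [{"us", "United States"}, {"uk", "United Kingdom"}]

# A label may be requested more than once, as the text above notes.
IO.inspect(ColumnsSketch.columns(labels, rows, [:country_name, :iso_country, :country_name]))
```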
empty(labels \\ nil)
Return a dataset with no rows and labels specified by the tuple
passed as labels. If labels is not specified, return an empty
dataset with zero columns.
inner_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing an inner join on datasets ds1 and
ds2, using k1 and k2 as the key labels on each respective
dataset. The returned dataset will contain columns for each label
specified in out_labels, which is a keyword list of the form
[left_or_right: label, ...], where each key is :left (ds1) or :right
(ds2) and names the dataset the label is taken from. The example
omits k2 and uses :country_name as the key on both datasets.
iex> iso_countries =
...> Dataset.new(
...> [
...> {"us", "United States"},
...> {"uk", "United Kingdom"},
...> {"ca", "Canada"},
...> {"de", "Germany"},
...> {"nl", "Netherlands"},
...> {"sg", "Singapore"}
...> ],
...> {:iso_country, :country_name}
...> )
...>
...> country_clicks =
...> Dataset.new(
...> [
...> {"United States", "13"},
...> {"United Kingdom", "11"},
...> {"Canada", "4"},
...> {"Germany", "4"},
...> {"France", "2"}
...> ],
...> {:country_name, :clicks}
...> )
...>
...> Dataset.inner_join(country_clicks, iso_countries, :country_name,
...> right: :iso_country,
...> left: :clicks
...> )
%Dataset{
labels: {:iso_country, :clicks},
rows: [{"ca", "4"}, {"de", "4"}, {"uk", "11"}, {"us", "13"}]
}
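To make the matching rule concrete, here is a minimal sketch of inner-join semantics on plain lists of tuples. It is not the library's implementation: InnerJoinSketch and its argument shapes are hypothetical, and the sketch yields rows in left-hand order, so the result is sorted before printing.

```elixir
defmodule InnerJoinSketch do
  # Hypothetical illustration, not the Dataset implementation.
  # left_rows:  [{country_name, clicks}]
  # right_rows: [{iso_country, country_name}]
  def inner_join(left_rows, right_rows) do
    for {left_name, clicks} <- left_rows,
        {iso, right_name} <- right_rows,
        # Keep only pairs whose key values match on both sides.
        left_name == right_name do
      {iso, clicks}
    end
  end
end

country_clicks = [
  {"United States", "13"},
  {"United Kingdom", "11"},
  {"Canada", "4"},
  {"Germany", "4"},
  {"France", "2"}
]

iso_countries = [
  {"us", "United States"},
  {"uk", "United Kingdom"},
  {"ca", "Canada"},
  {"de", "Germany"},
  {"nl", "Netherlands"},
  {"sg", "Singapore"}
]

# France, Netherlands and Singapore have no partner and are dropped.
country_clicks
|> InnerJoinSketch.inner_join(iso_countries)
|> Enum.sort()
|> IO.inspect()
```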
left_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing a left join on datasets ds1 and
ds2, using k1 and k2 as the key labels on each respective
dataset. The returned dataset will contain columns for each label
specified in out_labels, which is a keyword list of the form
[left_or_right: label, ...], where each key is :left (ds1) or :right
(ds2) and names the dataset the label is taken from. The example
omits k2 and uses :country_name as the key on both datasets.
iex> iso_countries =
...> Dataset.new(
...> [
...> {"us", "United States"},
...> {"uk", "United Kingdom"},
...> {"ca", "Canada"},
...> {"de", "Germany"},
...> {"nl", "Netherlands"},
...> {"sg", "Singapore"}
...> ],
...> {:iso_country, :country_name}
...> )
...>
...> country_clicks =
...> Dataset.new(
...> [
...> {"United States", "13"},
...> {"United Kingdom", "11"},
...> {"Canada", "4"},
...> {"Germany", "4"},
...> {"France", "2"}
...> ],
...> {:country_name, :clicks}
...> )
...>
...> Dataset.left_join(country_clicks, iso_countries, :country_name,
...> right: :iso_country,
...> left: :clicks
...> )
%Dataset{
labels: {:iso_country, :clicks},
rows: [{"ca", "4"}, {nil, "2"}, {"de", "4"}, {"uk", "11"}, {"us", "13"}]
}
new(rows \\ [], labels \\ nil)
Construct a new dataset. A dataset is a list of tuples. With no
arguments, an empty dataset with zero columns is constructed. With
one argument, a dataset is constructed with the passed object
interpreted as rows, and integer labels beginning with 0 are
generated; the number of labels is determined by the size of the
first tuple in the data. With two arguments, the second is used as
the tuple of labels.
iex> Dataset.new()
%Dataset{rows: [], labels: {}}
iex> Dataset.new([{:foo, :bar}, {:eggs, :ham}])
%Dataset{rows: [foo: :bar, eggs: :ham], labels: {0, 1}}
iex> Dataset.new([{0, 0}, {1, 1}, {2, 4}, {3, 9}],
...> {:x, :x_squared})
%Dataset{labels: {:x, :x_squared}, rows: [{0, 0}, {1, 1}, {2, 4}, {3, 9}]}
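The default labels described above (integers from 0, one per element of the first tuple) could be generated along these lines. DefaultLabelsSketch is a hypothetical name used only for illustration; how the library actually derives its labels is not shown in this page.

```elixir
defmodule DefaultLabelsSketch do
  # Hypothetical helper: build {0, 1, ..., n - 1}, where n is the size
  # of the first row. An empty row list yields an empty labels tuple.
  def default_labels([]), do: {}

  def default_labels([first | _rest]) do
    first
    |> Tuple.to_list()
    |> Enum.with_index()
    |> Enum.map(fn {_value, index} -> index end)
    |> List.to_tuple()
  end
end

IO.inspect(DefaultLabelsSketch.default_labels([{:foo, :bar}, {:eggs, :ham}]))
# prints {0, 1}
```

Only the first row is inspected, so rows of differing sizes would not be detected by this sketch.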
outer_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing an outer join on datasets ds1 and
ds2, using k1 and k2 as the key labels on each respective
dataset. The returned dataset will contain columns for each label
specified in out_labels, which is a keyword list of the form
[left_or_right: label, ...], where each key is :left (ds1) or :right
(ds2) and names the dataset the label is taken from. The example
omits k2 and uses :country_name as the key on both datasets.
iex> iso_countries =
...> Dataset.new(
...> [
...> {"us", "United States"},
...> {"uk", "United Kingdom"},
...> {"ca", "Canada"},
...> {"de", "Germany"},
...> {"nl", "Netherlands"},
...> {"sg", "Singapore"}
...> ],
...> {:iso_country, :country_name}
...> )
...>
...> country_clicks =
...> Dataset.new(
...> [
...> {"United States", "13"},
...> {"United Kingdom", "11"},
...> {"Canada", "4"},
...> {"Germany", "4"},
...> {"France", "2"}
...> ],
...> {:country_name, :clicks}
...> )
...>
...> Dataset.outer_join(country_clicks, iso_countries, :country_name,
...> right: :iso_country,
...> left: :clicks
...> )
%Dataset{
labels: {:iso_country, :clicks},
rows: [
{"ca", "4"},
{nil, "2"},
{"de", "4"},
{"nl", nil},
{"sg", nil},
{"uk", "11"},
{"us", "13"}
]
}
right_join(ds1, ds2, k1, k2 \\ nil, out_labels)
Return the result of performing a right join on datasets ds1 and
ds2, using k1 and k2 as the key labels on each respective
dataset. The returned dataset will contain columns for each label
specified in out_labels, which is a keyword list of the form
[left_or_right: label, ...], where each key is :left (ds1) or :right
(ds2) and names the dataset the label is taken from. The example
omits k2 and uses :country_name as the key on both datasets.
iex> iso_countries =
...> Dataset.new(
...> [
...> {"us", "United States"},
...> {"uk", "United Kingdom"},
...> {"ca", "Canada"},
...> {"de", "Germany"},
...> {"nl", "Netherlands"},
...> {"sg", "Singapore"}
...> ],
...> {:iso_country, :country_name}
...> )
...>
...> country_clicks =
...> Dataset.new(
...> [
...> {"United States", "13"},
...> {"United Kingdom", "11"},
...> {"Canada", "4"},
...> {"Germany", "4"},
...> {"France", "2"}
...> ],
...> {:country_name, :clicks}
...> )
...>
...> Dataset.right_join(country_clicks, iso_countries, :country_name,
...> right: :iso_country,
...> left: :clicks
...> )
%Dataset{
labels: {:iso_country, :clicks},
rows: [
{"ca", "4"},
{"de", "4"},
{"nl", nil},
{"sg", nil},
{"uk", "11"},
{"us", "13"}
]
}
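The four join examples differ only in which key values survive, which is why they return 4, 5, 6 and 7 rows respectively. The sketch below computes those key sets with MapSet from the standard library; JoinKeysSketch is a hypothetical helper for illustration and says nothing about how Dataset implements its joins.

```elixir
defmodule JoinKeysSketch do
  # Hypothetical helper: the set of key values kept by each join type.
  def keys(:inner, left, right),
    do: MapSet.intersection(MapSet.new(left), MapSet.new(right))

  def keys(:left, left, _right), do: MapSet.new(left)
  def keys(:right, _left, right), do: MapSet.new(right)

  def keys(:outer, left, right),
    do: MapSet.union(MapSet.new(left), MapSet.new(right))
end

# Key columns of country_clicks (left) and iso_countries (right).
left = ["United States", "United Kingdom", "Canada", "Germany", "France"]

right = [
  "United States",
  "United Kingdom",
  "Canada",
  "Germany",
  "Netherlands",
  "Singapore"
]

Enum.each([:inner, :left, :right, :outer], fn kind ->
  IO.inspect({kind, MapSet.size(JoinKeysSketch.keys(kind, left, right))})
end)

# prints {:inner, 4}, {:left, 5}, {:right, 6}, {:outer, 7}
```

Keys present on only one side are the rows that show nil in the other side's columns in the examples above.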
rotate(dataset)
Return a dataset with each value in row i and column j transposed into row j and column i. The dataset is labelled with integer indices beginning with zero.
iex> Dataset.new([{:a, :b, :c},
...> {:A, :B, :C},
...> {:i, :ii, :iii},
...> {:I, :II, :III}])
...> |> Dataset.rotate()
%Dataset{
labels: {0, 1, 2, 3},
rows: [{:a, :A, :i, :I},
{:b, :B, :ii, :II},
{:c, :C, :iii, :III}]
}
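The transposition itself can be sketched with Enum.zip/1 on plain row tuples. RotateSketch is a hypothetical module, and returning a {labels, rows} pair instead of a %Dataset{} struct is an assumption made to keep the sketch self-contained.

```elixir
defmodule RotateSketch do
  # Hypothetical transpose, not the library's code. Returns
  # {labels, rows}, with one integer label per original row.
  def rotate(rows) do
    rotated =
      rows
      |> Enum.map(&Tuple.to_list/1)
      # Zipping the row lists groups the values column by column.
      |> Enum.zip()

    labels =
      rows
      |> Enum.with_index()
      |> Enum.map(fn {_row, index} -> index end)
      |> List.to_tuple()

    {labels, rotated}
  end
end

rows = [{:a, :b, :c}, {:A, :B, :C}, {:i, :ii, :iii}, {:I, :II, :III}]
IO.inspect(RotateSketch.rotate(rows))
```

Enum.zip/1 stops at the shortest list, so this sketch would silently truncate ragged rows.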
select(ds, out_labels)
Return a new dataset containing only the columns of the input dataset ds that are named in the list out_labels.
iex> Dataset.new([{:a, :b, :c},
...> {:A, :B, :C},
...> {:i, :ii, :iii},
...> {:I, :II, :III}],
...> {"first", "second", "third"})
...> |> Dataset.select(["second"])
%Dataset{rows: [{:b}, {:B}, {:ii}, {:II}], labels: {"second"}}
to_map_list(ds)
Return the contents of ds as a list of maps.
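No example is given for this function. A plausible reading is one map per row, keyed by label; the sketch below shows that conversion on a plain labels tuple and row list. Both the module name MapListSketch and the label-keyed output shape are assumptions, not confirmed behaviour of Dataset.to_map_list/1.

```elixir
defmodule MapListSketch do
  # Hypothetical conversion: pair each label with the value in the
  # same position of the row, then build a map from the pairs.
  def to_map_list(labels, rows) do
    keys = Tuple.to_list(labels)

    Enum.map(rows, fn row ->
      keys
      |> Enum.zip(Tuple.to_list(row))
      |> Map.new()
    end)
  end
end

IO.inspect(MapListSketch.to_map_list({:x, :x_squared}, [{0, 0}, {1, 1}, {2, 4}]))
# prints [%{x: 0, x_squared: 0}, %{x: 1, x_squared: 1}, %{x: 2, x_squared: 4}]
```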