ElixirDatasets.Filter (ElixirDatasets v0.1.0)
Functions for filtering dataset files by configuration and split.
Summary
Functions
Filters repository files by configuration name and split.
Filters files by configuration name.
Filters files by split name.
Functions
Filters repository files by configuration name and split.
Parameters
repo_files - map of files from the repository (%{filename => etag})
name - optional configuration name to filter by
split - optional split name to filter by (e.g., "train", "test")
Returns
{:ok, filtered_files} where filtered_files is a map of matching files.
Examples
iex> files = %{"train.csv" => nil, "test.csv" => nil}
iex> ElixirDatasets.Filter.by_config_and_split(files, nil, "train")
{:ok, %{"train.csv" => nil}}
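The combined behavior can be approximated with a small standalone sketch. It is not the library's implementation: the module name CombinedFilterSketch and its private helpers are hypothetical, and it assumes that both criteria must hold, that a nil argument disables the corresponding criterion, and that matching follows the substring rules described for the single-criterion filters on this page.

```elixir
defmodule CombinedFilterSketch do
  # Hypothetical stand-in for illustration only; not ElixirDatasets.Filter.

  # Keeps map entries that satisfy both the configuration and the split
  # criterion, and wraps the result in an :ok tuple.
  def by_config_and_split(repo_files, name, split) when is_map(repo_files) do
    filtered =
      repo_files
      |> Enum.filter(fn {path, _etag} ->
        config_match?(path, name) and split_match?(path, split)
      end)
      |> Map.new()

    {:ok, filtered}
  end

  # A nil configuration name matches every path.
  defp config_match?(_path, nil), do: true
  defp config_match?(path, name), do: String.contains?(path, name)

  # A nil split matches every path; otherwise the basename without its
  # extension must contain the split name.
  defp split_match?(_path, nil), do: true

  defp split_match?(path, split) do
    path |> Path.basename() |> Path.rootname() |> String.contains?(split)
  end
end
```

Under those assumptions, filtering %{"en/train.csv" => nil, "en/test.csv" => nil, "fr/train.csv" => nil} with "en" and "train" would leave only the "en/train.csv" entry.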
Filters files by configuration name.
If config_name is nil, returns all files unchanged.
Otherwise, returns only files whose path contains the config name.
Parameters
repo_files - map or list of files
config_name - optional configuration name to filter by
Returns
Filtered files in the same format as input (map or list).
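Since this function has no example, here is a minimal sketch of the rule described above, written as a standalone module. The names ConfigFilterSketch and by_config are hypothetical, and the sketch assumes that list input is a list of path strings.

```elixir
defmodule ConfigFilterSketch do
  # Hypothetical stand-in for illustration only; not ElixirDatasets.Filter.

  # A nil configuration name returns the input unchanged.
  def by_config(repo_files, nil), do: repo_files

  # Map input (%{filename => etag}): keep entries whose path contains the name.
  def by_config(repo_files, config_name) when is_map(repo_files) do
    for {path, etag} <- repo_files,
        String.contains?(path, config_name),
        into: %{},
        do: {path, etag}
  end

  # List input (assumed to hold path strings): keep paths containing the name.
  def by_config(repo_files, config_name) when is_list(repo_files) do
    Enum.filter(repo_files, &String.contains?(&1, config_name))
  end
end

ConfigFilterSketch.by_config(%{"en/train.csv" => nil, "fr/train.csv" => nil}, "en")
# => %{"en/train.csv" => nil}
```

Because the match is a plain substring test on the whole path, a short configuration name such as "en" would also match a path like "french/train.csv".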
Filters files by split name.
If split is nil, returns all files unchanged.
Otherwise, returns only files whose basename (without extension) contains the split name.
Parameters
repo_files - map or list of files
split - optional split name to filter by (e.g., "train", "test", "validation")
Returns
Filtered files in the same format as input (map or list).
Examples
iex> files = %{"train.csv" => nil, "test.csv" => nil, "validation.csv" => nil}
iex> ElixirDatasets.Filter.by_split(files, "train")
%{"train.csv" => nil}
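To illustrate the basename rule with list input and nested paths, the following sketch reimplements the described behavior in a standalone module. SplitFilterSketch is a hypothetical name, list input is assumed to hold path strings, and the extension is stripped with Path.rootname/1, which removes only the last extension.

```elixir
defmodule SplitFilterSketch do
  # Hypothetical stand-in for illustration only; not ElixirDatasets.Filter.

  # A nil split returns the input unchanged.
  def by_split(repo_files, nil), do: repo_files

  # Map input (%{filename => etag}): keep entries whose basename matches.
  def by_split(repo_files, split) when is_map(repo_files) do
    for {path, etag} <- repo_files,
        split_match?(path, split),
        into: %{},
        do: {path, etag}
  end

  # List input (assumed to hold path strings): keep paths whose basename matches.
  def by_split(repo_files, split) when is_list(repo_files) do
    Enum.filter(repo_files, &split_match?(&1, split))
  end

  # Only the file name without its last extension is inspected,
  # so directory names do not influence the match.
  defp split_match?(path, split) do
    path |> Path.basename() |> Path.rootname() |> String.contains?(split)
  end
end

paths = ["data/train-00000.parquet", "data/test-00000.parquet", "train_meta/readme.md"]
SplitFilterSketch.by_split(paths, "train")
# => ["data/train-00000.parquet"]
```

In this sketch "train_meta/readme.md" is dropped even though its directory contains "train", because only the basename "readme" is checked.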