CrucibleIR.DatasetRef (CrucibleIR v0.2.1)
Reference to a dataset to be used in an experiment.
A DatasetRef identifies a dataset by its provider (such as :crucible_datasets) and name,
selects a split (such as :train or :test), and can carry optional configuration.
Fields
:provider - The dataset provider (default: :crucible_datasets)
:name - The dataset name (required)
:split - The dataset split to use (default: :train)
:options - Additional dataset-specific options
:version - Dataset version
:format - Data format (parquet, csv, jsonl, arrow)
:schema - Expected schema
Examples
iex> ref = %CrucibleIR.DatasetRef{name: :mmlu}
iex> ref.provider
:crucible_datasets
iex> ref = %CrucibleIR.DatasetRef{name: :mmlu, split: :test}
iex> ref.split
:test
iex> ref = %CrucibleIR.DatasetRef{name: :custom, provider: :huggingface, options: %{limit: 100}}
iex> ref.options
%{limit: 100}
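The field list and defaults above can be illustrated with a small stand-in struct. This is a sketch only: `DatasetRefSketch` is a hypothetical module, not the CrucibleIR source. The `nil` defaults for `:options`, `:version`, `:format`, and `:schema`, the use of `@enforce_keys` for the required `:name`, and the atom value `:parquet` for `:format` are all assumptions.

```elixir
defmodule DatasetRefSketch do
  # Hypothetical stand-in for CrucibleIR.DatasetRef (not the library's code).
  # :name is documented as required, so this sketch enforces it.
  @enforce_keys [:name]
  defstruct [
    :name,
    provider: :crucible_datasets, # documented default
    split: :train,                # documented default
    options: nil,                 # assumed default
    version: nil,                 # assumed default
    format: nil,                  # assumed default
    schema: nil                   # assumed default
  ]
end

# struct!/2 builds the struct at runtime and raises if :name is missing.
ref = struct!(DatasetRefSketch, name: :mmlu, format: :parquet)

# Structs are immutable; the update syntax returns a new struct.
test_ref = %{ref | split: :test}

IO.inspect({ref.provider, ref.split, test_ref.split, ref.format})
```

In this sketch, calling `struct!(DatasetRefSketch, split: :test)` without a `:name` raises an `ArgumentError`, which is one way a "required" field might be enforced.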