CrucibleIR.Experiment (CrucibleIR v0.2.1)
Top-level experiment definition for Crucible ML reliability experiments.
An Experiment defines a complete ML reliability experiment including
the backend to test, the evaluation pipeline, datasets, reliability
mechanisms, and output specifications.
Required Fields
:id - Unique experiment identifier
:backend - The LLM backend to evaluate (BackendRef)
:pipeline - List of processing stages (StageDef)
Optional Fields
:description - Human-readable experiment description
:owner - Experiment owner/creator
:tags - List of tags for categorization
:metadata - Additional experiment metadata
:dataset - Dataset reference for evaluation
:reliability - Reliability configurations (ensemble, hedging, etc.)
:outputs - Output specifications
:created_at - Experiment creation timestamp
:updated_at - Last update timestamp
:experiment_type - Type of experiment (evaluation, training, comparison, ablation)
:model_version - Model version being evaluated
:training_config - Training configuration for training experiments
:baseline - Baseline model reference for comparison experiments
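The descriptions of :training_config and :baseline suggest that each pairs with a particular :experiment_type. The following is a minimal sketch of a consistency check along those lines. The module ExperimentFieldsSketch and the function companion_warnings/1 are hypothetical and not part of CrucibleIR, and the pairing rule itself is an assumption drawn from the field descriptions; the library may not enforce it. Fields are read with Map.get/2, so the function accepts a %CrucibleIR.Experiment{} struct or a plain map with the same keys.

```elixir
defmodule ExperimentFieldsSketch do
  # Hypothetical helper; not part of CrucibleIR.

  # Assumed pairing of experiment types with their companion field,
  # inferred from the field descriptions rather than from library code.
  @companions %{training: :training_config, comparison: :baseline}

  # Returns a list of warning strings; empty when nothing looks inconsistent.
  def companion_warnings(exp) when is_map(exp) do
    type = Map.get(exp, :experiment_type)

    case Map.fetch(@companions, type) do
      {:ok, field} ->
        if is_nil(Map.get(exp, field)) do
          ["#{inspect(type)} experiment has no #{inspect(field)}"]
        else
          []
        end

      # nil, the other listed types, and custom atoms have no companion field.
      :error ->
        []
    end
  end
end
```

Returning warnings rather than raising keeps the check advisory, which seems appropriate for a rule the documentation only implies.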
Examples
iex> exp = %CrucibleIR.Experiment{
...>   id: :my_experiment,
...>   backend: %CrucibleIR.BackendRef{id: :gpt4},
...>   pipeline: [%CrucibleIR.StageDef{name: :inference}]
...> }
iex> exp.id
:my_experiment

iex> exp = %CrucibleIR.Experiment{
...>   id: :full_exp,
...>   backend: %CrucibleIR.BackendRef{id: :gpt4},
...>   pipeline: [%CrucibleIR.StageDef{name: :run}],
...>   dataset: %CrucibleIR.DatasetRef{name: :mmlu},
...>   reliability: %CrucibleIR.Reliability.Config{
...>     stats: %CrucibleIR.Reliability.Stats{alpha: 0.01}
...>   }
...> }
iex> exp.reliability.stats.alpha
0.01
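Because :reliability is optional, chained access such as exp.reliability.stats.alpha raises when :reliability is nil. One way to read the value defensively is to pattern match on the nested shape, as in the sketch below. ReliabilityAccessSketch and fetch_alpha/1 are hypothetical names, not part of CrucibleIR. Map patterns also match structs, so the function should work on a %CrucibleIR.Experiment{} as well as on plain maps.

```elixir
defmodule ReliabilityAccessSketch do
  # Hypothetical helper; not part of CrucibleIR.

  # Matches only when every level down to :alpha is present and numeric.
  def fetch_alpha(%{reliability: %{stats: %{alpha: alpha}}}) when is_number(alpha),
    do: {:ok, alpha}

  # Covers a nil :reliability, a nil :stats, or a missing key.
  def fetch_alpha(_other), do: :error
end
```

A caller can then branch on {:ok, alpha} or :error instead of rescuing an exception.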
Summary
Types
@type experiment_type() :: :evaluation | :training | :comparison | :ablation | atom()
@type t() :: %CrucibleIR.Experiment{
  backend: CrucibleIR.BackendRef.t(),
  baseline: CrucibleIR.ModelRef.t() | nil,
  created_at: DateTime.t() | nil,
  dataset: CrucibleIR.DatasetRef.t() | nil,
  description: String.t() | nil,
  experiment_type: experiment_type() | nil,
  id: atom(),
  metadata: map() | nil,
  model_version: CrucibleIR.ModelVersion.t() | nil,
  outputs: [CrucibleIR.OutputSpec.t()] | nil,
  owner: String.t() | nil,
  pipeline: [CrucibleIR.StageDef.t()],
  reliability: CrucibleIR.Reliability.Config.t() | nil,
  tags: [atom()] | nil,
  training_config: CrucibleIR.Training.Config.t() | nil,
  updated_at: DateTime.t() | nil
}
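Two details of these types affect calling code: only :id, :backend, and :pipeline are non-nil in t(), and experiment_type() ends in atom(), so values beyond the four listed atoms are allowed. The sketch below illustrates both points. ExperimentShapeSketch and its functions are hypothetical and not part of CrucibleIR, and whether the struct itself enforces the required keys is not stated here, so the check is written as a standalone helper.

```elixir
defmodule ExperimentShapeSketch do
  # Hypothetical helper; not part of CrucibleIR.

  # The three fields whose types in t() do not include nil.
  @required [:id, :backend, :pipeline]

  # Returns the required keys that are absent or nil, in declaration order.
  # An empty pipeline list counts as present.
  def missing_required(exp) when is_map(exp) do
    Enum.filter(@required, fn key -> is_nil(Map.get(exp, key)) end)
  end

  # nil is itself an atom, so this clause must come first.
  def describe_type(nil), do: "unspecified"

  def describe_type(type) when type in [:evaluation, :training, :comparison, :ablation],
    do: "built-in type #{inspect(type)}"

  # The type is open (`| atom()`), so a fallback clause is needed.
  def describe_type(type) when is_atom(type), do: "custom type #{inspect(type)}"
end
```

For example, missing_required(%{id: :x}) returns [:backend, :pipeline], and describe_type(:sweep) falls through to the custom-type clause.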