Dsxir.Optimizer.BootstrapFewShot (dsxir v0.1.0)

Two-phase optimizer: slot labeled demos from the trainset (phase 1), then augment with bootstrapped demos captured from successful traces (phase 2).

Phases

  1. Labeled. Up to :max_labeled_demos examples are picked from the trainset (uniform random; deterministic-by-hash when :deterministic is set). Each chosen example is slotted as %Dsxir.Demo{kind: :labeled} only into predictors whose declared input + output fields the demo's data keys cover; non-matching predictors get no labeled demo from that example. No LM call.

  2. Bootstrap. For each round in 1..max_rounds, the trainset is walked example by example. Each example runs the program inside a Dsxir.with_trace/1 frame with per-call opts seeded for diversity (temperature: cfg.diversity_temperature, cache: false, plus a per-round per-example nonce). When the coerced metric value is >= :threshold, each trace entry is pushed into the matching predictor's demos_pool as %Dsxir.Demo{kind: :bootstrapped, source: %{round: R, example_index: I}} until :max_bootstrapped_demos is reached.
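The phase 1 matching rule can be sketched in plain Elixir. This is an illustration only: LabeledSlotSketch, the predictor names, and the %{inputs: ..., outputs: ...} shape are hypothetical stand-ins, not dsxir's internal representation.

```elixir
defmodule LabeledSlotSketch do
  # Hypothetical helper: true when the example's keys cover every
  # declared input and output field of a predictor.
  def covers?(example, %{inputs: inputs, outputs: outputs}) do
    required = MapSet.new(inputs ++ outputs)
    MapSet.subset?(required, MapSet.new(Map.keys(example)))
  end

  # Names of the predictors that would receive the example as a labeled demo.
  def eligible(example, predictors) do
    for {name, fields} <- predictors, covers?(example, fields), do: name
  end
end

# Assumed field declarations for two illustrative predictors.
predictors = [
  qa: %{inputs: [:question], outputs: [:answer]},
  cot: %{inputs: [:question], outputs: [:reasoning, :answer]}
]

example = %{question: "2 + 2?", answer: "4"}

# Only :qa qualifies; :cot also requires a :reasoning key.
LabeledSlotSketch.eligible(example, predictors)
```

Under this rule, an example missing any declared field is simply skipped for that predictor rather than padded or rejected outright.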

Diversity is delivered by pushing a Dsxir.Settings.context/2 frame that swaps the resolved :lm config tuple with one carrying the diversity keywords. The LM dispatcher reads :lm from settings and merges per-call opts on top, so the temperature lever reaches the wire protocol.
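A minimal sketch of the swap, assuming the resolved :lm config is a {adapter_module, keyword_opts} tuple. That shape, the :nonce key name, and FakeAdapter are assumptions made for illustration; the source does not specify them.

```elixir
defmodule DiversitySketch do
  # Assumed shape: {adapter_module, keyword_opts}. The real tuple may differ.
  def with_diversity({adapter, base_opts}, temperature, round, index) do
    diversity = [
      temperature: temperature,
      cache: false,
      # Hypothetical key: a per-round, per-example nonce.
      nonce: "r#{round}-e#{index}"
    ]

    # Later keys win, so the diversity keywords override the base config.
    {adapter, Keyword.merge(base_opts, diversity)}
  end
end

lm = {FakeAdapter, [model: "some-model", temperature: 0.0, cache: true]}
{_adapter, opts} = DiversitySketch.with_diversity(lm, 1.0, 1, 3)
```

Keys the diversity list does not mention, such as :model here, pass through unchanged.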

Options

  • :max_labeled_demos (default 4) — cap on phase 1 demos per predictor.
  • :max_bootstrapped_demos (default 4) — cap on phase 2 demos per predictor.
  • :max_rounds (default 1) — number of bootstrap passes over the trainset.
  • :threshold (default 1.0) — coerced metric must meet or exceed this to keep the trace. Accepted threshold types: true | false | integer() | float(). Booleans coerce to 1.0 / 0.0. Other values raise FunctionClauseError during option parsing — bootstrap is a fail-fast operation on bad configuration.
  • :max_errors (default 10) — aggregate cap on per-example errors. Exceeding it makes compile/4 return a framework-class error (see Errors).
  • :deterministic (default false) — when true, phase 1 selection and phase 2 trainset order are both hash-stable. Phase 2 LM outputs remain nondeterministic because calls are still sampled at :diversity_temperature.
  • :diversity_temperature (default 1.0) — temperature forwarded as per-call opt during phase 2.
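The :threshold coercion described above can be modeled with function clauses. This is a sketch, not the library's code: ThresholdSketch is a hypothetical module, and applying the same coercion to the metric result in keep?/2 is an assumption.

```elixir
defmodule ThresholdSketch do
  # Booleans map to 1.0 / 0.0; numbers become floats.
  def coerce(true), do: 1.0
  def coerce(false), do: 0.0
  def coerce(n) when is_integer(n), do: n * 1.0
  def coerce(f) when is_float(f), do: f
  # No catch-all clause: any other value raises FunctionClauseError.

  # Assumption: the metric result goes through the same coercion.
  def keep?(score, threshold), do: coerce(score) >= coerce(threshold)
end

ThresholdSketch.keep?(true, 1.0)
ThresholdSketch.keep?(0.8, 1)
```

Leaving out a catch-all clause is what makes a bad value such as "0.5" or nil fail immediately instead of being silently treated as a score.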

Returned stats

%{
  labeled_demos: non_neg_integer(),
  bootstrapped_demos: non_neg_integer(),
  predictor_count: non_neg_integer(),
  rounds: non_neg_integer(),
  error_count: non_neg_integer(),
  max_errors: non_neg_integer(),
  threshold: float()
}

Errors

Exceptions raised while running an example are caught and stamped with path: [:bootstrap_few_shot, :"round_R", :"example_I"], where R is the round and I is the example index. When error_count > max_errors, compile/4 returns {:error, %Dsxir.Errors.Framework.OptimizerError{optimizer: __MODULE__, inner: aggregate}}, where inner is an aggregate produced via Splode's to_class helper on Dsxir.Errors. Callers can traverse per-predictor sub-errors via Splode's traverse_errors helper.
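The path stamp and the error budget can be illustrated as follows. ErrorPathSketch and the plain-map error are hypothetical; the real implementation wraps errors in Splode error structs, which this sketch does not attempt to reproduce.

```elixir
defmodule ErrorPathSketch do
  # Builds the path stamp for a given round and example index.
  def path(round, index) do
    [:bootstrap_few_shot, :"round_#{round}", :"example_#{index}"]
  end

  # Runs one example; a raised exception becomes a stamped error tuple.
  def run(round, index, fun) do
    {:ok, fun.()}
  rescue
    exception ->
      {:error, %{exception: exception, path: path(round, index)}}
  end

  # The budget is exceeded only when the count is strictly greater.
  def over_budget?(error_count, max_errors), do: error_count > max_errors
end

ErrorPathSketch.run(2, 5, fn -> raise "boom" end)
```

Because the comparison is strict, a run with exactly max_errors failures still completes.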

Trainset hash

metadata.trainset_hash is computed as :crypto.hash(:sha256, :erlang.term_to_binary(trainset)) |> Base.encode16(case: :lower).
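The same expression can be evaluated outside the optimizer, for example to check whether two compiled programs were built from the same trainset. The trainset contents below are made up for illustration.

```elixir
# Illustrative trainset; any Erlang term can be hashed this way.
trainset = [
  %{question: "2 + 2?", answer: "4"},
  %{question: "Capital of France?", answer: "Paris"}
]

# SHA-256 over the external term format, rendered as lowercase hex.
trainset_hash =
  :crypto.hash(:sha256, :erlang.term_to_binary(trainset))
  |> Base.encode16(case: :lower)

# Reordering the list changes the serialized term, and so the hash.
reordered_hash =
  :crypto.hash(:sha256, :erlang.term_to_binary(Enum.reverse(trainset)))
  |> Base.encode16(case: :lower)
```

The result is a 64-character hex string. Since the hash covers the serialized list, it is sensitive to example order as well as content.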