LlamaCppEx.Sampler (LlamaCppEx v0.7.0)

Token sampling configuration.

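Builds a sampler chain from the common sampling parameters. The samplers are applied in this order: grammar -> penalties -> top_k -> top_p -> min_p -> temp -> dist/greedy. The final stage is greedy when :temp is 0.0; otherwise a token is drawn at random from the remaining distribution.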
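To make the ordering concrete, here is a toy, pure-Elixir sketch of the filtering stages. It is not the library's implementation: the `ChainOrderSketch` module and all of its functions are hypothetical, the grammar and penalty stages are omitted, and the filters follow the textbook definitions of top-k, top-p and min-p rather than any particular backend. The defaults used in `run/2` are the ones listed under `create/2`.

```elixir
defmodule ChainOrderSketch do
  # Hypothetical toy module. Candidates are {token_id, logit} tuples.

  # Keep the k highest-logit candidates; k <= 0 disables the filter.
  def top_k(cands, k) when k <= 0, do: cands
  def top_k(cands, k), do: cands |> Enum.sort_by(&elem(&1, 1), :desc) |> Enum.take(k)

  # Convert logits to probabilities (shifted by the peak for stability).
  def softmax(cands) do
    peak = cands |> Enum.map(&elem(&1, 1)) |> Enum.max()
    exps = Enum.map(cands, fn {id, logit} -> {id, :math.exp(logit - peak)} end)
    total = exps |> Enum.map(&elem(&1, 1)) |> Enum.sum()
    Enum.map(exps, fn {id, e} -> {id, e / total} end)
  end

  # Keep the smallest high-probability prefix whose mass reaches p.
  def top_p(cands, p) when p >= 1.0, do: cands

  def top_p(cands, p) do
    probs = Map.new(softmax(cands))
    sorted = Enum.sort_by(cands, &elem(&1, 1), :desc)

    {kept, _mass} =
      Enum.reduce_while(sorted, {[], 0.0}, fn {id, _} = cand, {acc, mass} ->
        mass = mass + probs[id]
        step = {[cand | acc], mass}
        if mass >= p, do: {:halt, step}, else: {:cont, step}
      end)

    Enum.reverse(kept)
  end

  # Drop candidates whose probability is below p times the top probability.
  def min_p(cands, p) when p <= 0.0, do: cands

  def min_p(cands, p) do
    probs = Map.new(softmax(cands))
    top = probs |> Map.values() |> Enum.max()
    Enum.filter(cands, fn {id, _} -> probs[id] >= p * top end)
  end

  # Apply the stages in the documented order: top_k -> top_p -> min_p -> temp.
  # Returns {:greedy, token_id} or {:dist, probabilities} instead of sampling.
  def run(cands, opts \\ []) do
    t = Keyword.get(opts, :temp, 0.8)

    filtered =
      cands
      |> top_k(Keyword.get(opts, :top_k, 40))
      |> top_p(Keyword.get(opts, :top_p, 0.95))
      |> min_p(Keyword.get(opts, :min_p, 0.05))

    if t <= 0 do
      {id, _logit} = Enum.max_by(filtered, &elem(&1, 1))
      {:greedy, id}
    else
      {:dist, filtered |> Enum.map(fn {id, l} -> {id, l / t} end) |> softmax()}
    end
  end
end
```

Calling `ChainOrderSketch.run(candidates, temp: 0.0)` returns the highest-logit token that survives the filters, while a positive temperature returns the rescaled distribution from which a real sampler would draw.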

Summary

Functions

accept(sampler, token)

Accepts a token (updates sampler internal state).

create(model, opts \\ [])

Creates a new sampler chain.

reset(sampler)

Resets the sampler state.

sample(sampler, context)

Samples the next token from the context's logits.

Types

t()

@type t() :: %LlamaCppEx.Sampler{ref: reference()}

Functions

accept(sampler, token)

@spec accept(t(), integer()) :: :ok

Accepts a token (updates sampler internal state).

create(model, opts \\ [])

@spec create(
  LlamaCppEx.Model.t(),
  keyword()
) :: {:ok, t()}

Creates a new sampler chain.

Requires a model reference (needed for grammar-constrained sampling).

Options

  • :seed - Random seed for sampling. Defaults to a random value.
  • :temp - Temperature. 0.0 for greedy sampling. Defaults to 0.8.
  • :top_k - Top-K filtering. 0 to disable. Defaults to 40.
  • :top_p - Top-P (nucleus) filtering. 1.0 to disable. Defaults to 0.95.
  • :min_p - Min-P filtering. 0.0 to disable. Defaults to 0.05.
  • :penalty_repeat - Repetition penalty. 1.0 to disable. Defaults to 1.0.
  • :penalty_freq - Frequency penalty (0.0–2.0). 0.0 to disable. Defaults to 0.0.
  • :penalty_present - Presence penalty (0.0–2.0). 0.0 to disable. Defaults to 0.0.
  • :grammar - GBNF grammar string for constrained generation. Defaults to "" (none).
  • :grammar_root - Root rule name for grammar. Defaults to "root".
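As a hedged illustration of the options above, the sketch below builds three differently configured samplers. The `SamplerCreateSketch` module and its function names are hypothetical, and `model` is assumed to be a `LlamaCppEx.Model.t()` that was loaded elsewhere (model loading is not covered on this page). The option values are arbitrary examples, not recommendations.

```elixir
defmodule SamplerCreateSketch do
  # Hypothetical helper module; only Sampler.create/2 comes from this page.
  alias LlamaCppEx.Sampler

  # Reproducible stochastic sampling: fixed seed plus a mild repetition penalty.
  def seeded(model) do
    Sampler.create(model,
      seed: 42,
      temp: 0.9,
      top_k: 50,
      top_p: 0.9,
      min_p: 0.05,
      penalty_repeat: 1.1
    )
  end

  # Greedy decoding: per the option docs, a temperature of 0.0 selects it.
  def greedy(model), do: Sampler.create(model, temp: 0.0)

  # Grammar-constrained sampling with a tiny GBNF grammar that only
  # permits the strings "yes" or "no".
  def yes_no(model) do
    grammar = ~S(root ::= "yes" | "no")
    Sampler.create(model, grammar: grammar, grammar_root: "root")
  end
end
```

Each function returns `{:ok, sampler}` according to the spec above, so a caller would typically match on it, for example `{:ok, sampler} = SamplerCreateSketch.greedy(model)`.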

reset(sampler)

@spec reset(t()) :: :ok

Resets the sampler state.

sample(sampler, context)

@spec sample(t(), LlamaCppEx.Context.t()) :: integer()

Samples the next token from the context's logits.
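A minimal sketch of how `sample/2` and `reset/1` might fit into a generation loop follows. Everything except those two calls is an assumption: the `SampleLoopSketch` module is hypothetical, and because decoding and end-of-generation detection are not part of this module, they are passed in as the caller-supplied functions `decode` and `eog?`. The sketch does not call `accept/2`, since this page does not say whether `sample/2` already updates the sampler state.

```elixir
defmodule SampleLoopSketch do
  # Hypothetical helper module; only Sampler.sample/2 and Sampler.reset/1
  # come from this page.
  alias LlamaCppEx.Sampler

  # Generates up to `max_tokens` token ids.
  #   eog?   - fn token -> boolean, true when generation should stop
  #   decode - fn context, token -> any, feeds the token back to the model
  def generate(sampler, context, max_tokens, eog?, decode) do
    # Assumption: starting each new sequence from a clean sampler state.
    :ok = Sampler.reset(sampler)
    loop(sampler, context, max_tokens, eog?, decode, [])
  end

  # Token budget exhausted: return the tokens in generation order.
  defp loop(_sampler, _context, 0, _eog?, _decode, acc), do: Enum.reverse(acc)

  defp loop(sampler, context, remaining, eog?, decode, acc) do
    token = Sampler.sample(sampler, context)

    if eog?.(token) do
      Enum.reverse(acc)
    else
      # Evaluate the new token so the next sample sees fresh logits.
      decode.(context, token)
      loop(sampler, context, remaining - 1, eog?, decode, [token | acc])
    end
  end
end
```

Passing the decode and stop logic as functions keeps the sketch independent of any context or tokenizer API not documented here.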