LlamaCppEx.Model (LlamaCppEx v0.7.0)


Model loading and introspection.

Types

t()

@type t() :: %LlamaCppEx.Model{ref: reference()}

Functions

chat_template(model)

@spec chat_template(t()) :: String.t() | nil

Returns the chat template string embedded in the model, or nil if none.

desc(model)

@spec desc(t()) :: String.t()

Returns a human-readable description of the model.

load(path, opts \\ [])

@spec load(
  String.t(),
  keyword()
) :: {:ok, t()} | {:error, String.t()}

Loads a GGUF model from the given file path.

Options

  • :n_gpu_layers - Number of layers to offload to GPU. Use -1 for all layers. Defaults to 99, which is high enough to offload every layer of most models.
  • :use_mmap - Whether to memory-map the model file. Defaults to true.
  • :main_gpu - GPU device index for single-GPU mode. Defaults to 0.
  • :split_mode - How to split the model across GPUs: :none, :layer, or :row. Defaults to :none.
  • :tensor_split - List of floats specifying the proportion of work per GPU (e.g. [0.5, 0.5] for two GPUs). Defaults to [].
  • :use_mlock - Pin model memory in RAM to prevent swapping. Defaults to false.
  • :use_direct_io - Bypass the page cache when loading (takes precedence over :use_mmap). Defaults to false.
  • :vocab_only - Load vocabulary and metadata only, skip weights. Defaults to false.
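
The :split_mode and :tensor_split options above can be built programmatically. The following is a minimal sketch, assuming you want each GPU's share to be proportional to its memory; that heuristic, the module TensorSplitSketch, and its functions are hypothetical and not part of LlamaCppEx.

```elixir
defmodule TensorSplitSketch do
  # Hypothetical helper, not part of LlamaCppEx.

  # Turns per-GPU memory sizes (positive numbers in any consistent unit)
  # into proportions that sum to 1.0, in the same order as the input.
  def from_memory(mem_per_gpu) when is_list(mem_per_gpu) and mem_per_gpu != [] do
    total = Enum.sum(mem_per_gpu)
    Enum.map(mem_per_gpu, fn mem -> mem / total end)
  end

  # Builds a keyword list using only the documented :split_mode and
  # :tensor_split options, intended as the opts argument of load/2.
  def load_opts(mem_per_gpu) do
    [split_mode: :layer, tensor_split: from_memory(mem_per_gpu)]
  end
end
```

With a 24 GB and an 8 GB card, TensorSplitSketch.load_opts([24, 8]) yields tensor_split: [0.75, 0.25], which could then be passed as the second argument to LlamaCppEx.Model.load/2.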

Examples

{:ok, model} = LlamaCppEx.Model.load("path/to/model.gguf", n_gpu_layers: -1)
{:ok, model} = LlamaCppEx.Model.load("path/to/model.gguf", split_mode: :layer, tensor_split: [0.5, 0.5])
{:ok, model} = LlamaCppEx.Model.load("path/to/model.gguf", vocab_only: true)
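
The examples above match only on {:ok, model}, so a failed load raises a MatchError. Since the spec also allows {:error, String.t()}, a caller may prefer to handle both results. The sketch below shows one way to do that; the module LoadSketch and its loader parameter are hypothetical, and the loader is injectable only so the branching can be exercised without a model file.

```elixir
defmodule LoadSketch do
  # Hypothetical wrapper, not part of LlamaCppEx.
  # `loader` defaults to the documented LlamaCppEx.Model.load/2.
  def load_model(path, opts \\ [], loader \\ &LlamaCppEx.Model.load/2) do
    case loader.(path, opts) do
      {:ok, model} ->
        # Success: pass the model struct through unchanged.
        {:ok, model}

      {:error, reason} when is_binary(reason) ->
        # Failure: prefix the reason with the path that was attempted.
        {:error, "failed to load #{path}: #{reason}"}
    end
  end
end
```

In application code the third argument would normally be omitted, for example LoadSketch.load_model("path/to/model.gguf", n_gpu_layers: -1).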

n_ctx_train(model)

@spec n_ctx_train(t()) :: integer()

Returns the training context size of the model.

n_embd(model)

@spec n_embd(t()) :: integer()

Returns the embedding dimension of the model.

n_params(model)

@spec n_params(t()) :: integer()

Returns the number of model parameters.

size(model)

@spec size(t()) :: integer()

Returns the model file size in bytes.
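
The introspection functions above can be combined into a single summary of a loaded model. This is a rough sketch: the module ModelInfoSketch and its helpers are hypothetical, and the bits-per-weight figure is only an estimate that assumes the value returned by size/1 is dominated by the weights.

```elixir
defmodule ModelInfoSketch do
  # Hypothetical helper module; only the LlamaCppEx.Model.* calls are
  # taken from the documentation above.

  # Collects the documented introspection values into one map.
  def summarize(model) do
    size = LlamaCppEx.Model.size(model)
    n_params = LlamaCppEx.Model.n_params(model)

    %{
      desc: LlamaCppEx.Model.desc(model),
      n_params: n_params,
      n_ctx_train: LlamaCppEx.Model.n_ctx_train(model),
      n_embd: LlamaCppEx.Model.n_embd(model),
      size: format_bytes(size),
      bits_per_weight: bits_per_weight(size, n_params),
      # chat_template/1 returns nil when the model embeds no template.
      has_chat_template: LlamaCppEx.Model.chat_template(model) != nil
    }
  end

  # Estimated average bits stored per parameter (size in bytes * 8 / params).
  def bits_per_weight(size_bytes, n_params) when n_params > 0 do
    Float.round(size_bytes * 8 / n_params, 2)
  end

  # Formats a byte count using binary units (1 KiB = 1024 bytes).
  def format_bytes(bytes) when is_integer(bytes) and bytes >= 0 do
    cond do
      bytes >= 1_073_741_824 -> "#{Float.round(bytes / 1_073_741_824, 2)} GiB"
      bytes >= 1_048_576 -> "#{Float.round(bytes / 1_048_576, 2)} MiB"
      bytes >= 1024 -> "#{Float.round(bytes / 1024, 2)} KiB"
      true -> "#{bytes} B"
    end
  end
end
```

Given a model returned by LlamaCppEx.Model.load/2, ModelInfoSketch.summarize(model) returns a plain map that is convenient to log or inspect.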