Model loading and introspection.
Types
@type t() :: %LlamaCppEx.Model{ref: reference()}
Functions
Returns the chat template string embedded in the model, or nil if none.
Returns a human-readable description of the model.
Loads a GGUF model from the given file path.
Options
- `:n_gpu_layers` - Number of layers to offload to GPU. Use `-1` for all layers. Defaults to `99` (offload all layers).
- `:use_mmap` - Whether to memory-map the model file. Defaults to `true`.
- `:main_gpu` - GPU device index for single-GPU mode. Defaults to `0`.
- `:split_mode` - How to split the model across GPUs: `:none`, `:layer`, or `:row`. Defaults to `:none`.
- `:tensor_split` - List of floats specifying the proportion of work per GPU (e.g. `[0.5, 0.5]` for two GPUs). Defaults to `[]`.
- `:use_mlock` - Pin model memory in RAM to prevent swapping. Defaults to `false`.
- `:use_direct_io` - Bypass page cache when loading (takes precedence over mmap). Defaults to `false`.
- `:vocab_only` - Load vocabulary and metadata only, skip weights. Defaults to `false`.
Examples
```elixir
{:ok, model} = LlamaCppEx.Model.load("path/to/model.gguf", n_gpu_layers: -1)
{:ok, model} = LlamaCppEx.Model.load("path/to/model.gguf", split_mode: :layer, tensor_split: [0.5, 0.5])
{:ok, model} = LlamaCppEx.Model.load("path/to/model.gguf", vocab_only: true)
```
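The examples above only show the success path. A minimal sketch of defensive loading with a CPU-only fallback follows; note that the `{:error, reason}` failure shape is an assumption here, as this page only documents the `{:ok, model}` case:

```elixir
# Sketch: load with full GPU offload, falling back to CPU-only settings.
# Assumes load/2 returns {:error, reason} on failure (not confirmed by this page).
defmodule MyApp.ModelLoader do
  def load_with_fallback(path) do
    case LlamaCppEx.Model.load(path, n_gpu_layers: -1) do
      {:ok, model} ->
        {:ok, model}

      {:error, _reason} ->
        # Retry without GPU offload or memory-mapping, e.g. on constrained hosts.
        LlamaCppEx.Model.load(path, n_gpu_layers: 0, use_mmap: false)
    end
  end
end
```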
Returns the training context size of the model.
Returns the embedding dimension of the model.
Returns the number of model parameters.
Returns the model file size in bytes.
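The introspection functions above might be combined to summarize a loaded model. The function names used below (`desc/1`, `n_ctx_train/1`, `n_embd/1`, `n_params/1`, `size/1`) are hypothetical, inferred from the descriptions on this page rather than confirmed by it:

```elixir
# Sketch only: all LlamaCppEx.Model accessor names below are guesses
# inferred from their one-line descriptions above.
{:ok, model} = LlamaCppEx.Model.load("path/to/model.gguf", vocab_only: true)

IO.puts(LlamaCppEx.Model.desc(model))
IO.puts("context: #{LlamaCppEx.Model.n_ctx_train(model)} tokens")
IO.puts("embedding dim: #{LlamaCppEx.Model.n_embd(model)}")
IO.puts("parameters: #{LlamaCppEx.Model.n_params(model)}")
IO.puts("file size: #{LlamaCppEx.Model.size(model)} bytes")
```

Loading with `vocab_only: true` keeps this cheap, since metadata is read without loading the weights.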