LlamaCppEx.Hub (LlamaCppEx v0.8.13)

Download GGUF models from HuggingFace Hub.

Requires the optional req dependency. Add it to the deps list in your mix.exs:

{:req, "~> 0.5"}

Examples

# Search for GGUF models
{:ok, results} = LlamaCppEx.Hub.search("qwen3 gguf", limit: 5)

# List GGUF files in a repository
{:ok, files} = LlamaCppEx.Hub.list_gguf_files("Qwen/Qwen3-4B-GGUF")

# Download a model (cached locally)
{:ok, path} = LlamaCppEx.Hub.download(
  "Qwen/Qwen3-4B-GGUF",
  "qwen3-4b-q4_k_m.gguf"
)

Authentication

For private or gated repositories, set the HF_TOKEN environment variable or pass the :token option:

LlamaCppEx.Hub.download("org/private-model", "model.gguf", token: "hf_...")

Caching

Downloaded files are cached in ~/.cache/llama_cpp_ex/models/ by default. Override with the :cache_dir option or LLAMA_CACHE_DIR environment variable. ETag headers are stored alongside cached files to detect upstream changes.
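The cache directory lookup described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the module CacheDirSketch and its resolve/1 function are hypothetical, and it assumes the :cache_dir option takes precedence over LLAMA_CACHE_DIR, which the text does not state explicitly.

```elixir
defmodule CacheDirSketch do
  # Hypothetical helper, not part of LlamaCppEx.
  # Assumed precedence: :cache_dir option, then LLAMA_CACHE_DIR, then the default.
  @default_dir "~/.cache/llama_cpp_ex/models"

  def resolve(opts \\ []) do
    dir =
      Keyword.get(opts, :cache_dir) ||
        System.get_env("LLAMA_CACHE_DIR") ||
        @default_dir

    # Path.expand/1 turns a leading "~" into the user's home directory.
    Path.expand(dir)
  end
end
```

With LLAMA_CACHE_DIR unset, CacheDirSketch.resolve() expands to the default directory under the user's home, while CacheDirSketch.resolve(cache_dir: "/tmp/models") returns the explicit path.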

Offline Mode

Set LLAMA_OFFLINE=1 to use only cached files without network access.

Summary

Functions

auth_headers(opts)

Build authentication headers from options or environment.

build_download_url(repo_id, filename, opts \\ [])

Build the download URL for a file in a HuggingFace repository.

cache_path(repo_id, filename, opts \\ [])

Build the local cache path for a model file.

download(repo_id, filename, opts \\ [])

Download a GGUF file from HuggingFace Hub, returning the local path.

filter_gguf_files(siblings)

Filter a list of HuggingFace siblings entries to only GGUF files.

get_model_info(repo_id, opts \\ [])

Get model repository metadata from HuggingFace Hub API.

list_gguf_files(repo_id, opts \\ [])

List GGUF files available in a HuggingFace repository.

search(query, opts \\ [])

Search HuggingFace Hub for GGUF models.

Functions

auth_headers(opts)

@spec auth_headers(keyword()) :: [{String.t(), String.t()}]

Build authentication headers from options or environment.

Checks for tokens in order: :token option, HF_TOKEN env var, HUGGING_FACE_HUB_TOKEN env var (legacy).
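The lookup order above could be implemented roughly like this. The module AuthSketch is hypothetical and exists only to illustrate the precedence. The lowercase "authorization" header name and the empty list returned when no token is found are assumptions. The Bearer scheme is what Hugging Face uses for API tokens.

```elixir
defmodule AuthSketch do
  # Hypothetical re-implementation for illustration only.
  # Order: :token option, HF_TOKEN, then the legacy HUGGING_FACE_HUB_TOKEN.
  def resolve_token(opts) do
    Keyword.get(opts, :token) ||
      System.get_env("HF_TOKEN") ||
      System.get_env("HUGGING_FACE_HUB_TOKEN")
  end

  # Returns a header list matching the [{String.t(), String.t()}] shape.
  # Assumption: no token means no headers at all.
  def headers(opts) do
    case resolve_token(opts) do
      nil -> []
      token -> [{"authorization", "Bearer " <> token}]
    end
  end
end
```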

build_download_url(repo_id, filename, opts \\ [])

@spec build_download_url(String.t(), String.t(), keyword()) :: String.t()

Build the download URL for a file in a HuggingFace repository.
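For orientation, Hugging Face serves repository files from URLs of the form https://huggingface.co/REPO_ID/resolve/REVISION/FILENAME. The sketch below builds such a URL. UrlSketch is a hypothetical module, and it assumes that opts accepts the same :revision option as download/3 with a default of "main". The exact string this function returns may differ.

```elixir
defmodule UrlSketch do
  # Hypothetical helper, not the library's implementation.
  @endpoint "https://huggingface.co"

  def build(repo_id, filename, opts \\ []) do
    # Assumption: :revision defaults to "main", as documented for download/3.
    revision = Keyword.get(opts, :revision, "main")
    "#{@endpoint}/#{repo_id}/resolve/#{revision}/#{filename}"
  end
end
```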

cache_path(repo_id, filename, opts \\ [])

@spec cache_path(String.t(), String.t(), keyword()) :: String.t()

Build the local cache path for a model file.

download(repo_id, filename, opts \\ [])

@spec download(String.t(), String.t(), keyword()) ::
  {:ok, String.t()} | {:error, String.t()}

Download a GGUF file from HuggingFace Hub, returning the local path.

Uses ETag-based caching: if the file exists locally and its stored ETag matches the upstream ETag, the cached version is returned without re-downloading.

Options

:cache_dir - Local cache directory. Defaults to the LLAMA_CACHE_DIR environment variable if set, otherwise ~/.cache/llama_cpp_ex/models/.
:token - HuggingFace API token. Defaults to the HF_TOKEN environment variable.
:revision - Git revision (branch, tag, or commit). Defaults to "main".
:force - Force re-download even if cached. Defaults to false.
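The interaction between the ETag check and the :force option can be pictured as a small decision function. This is a sketch under stated assumptions: EtagSketch and action/3 are hypothetical names, and a missing cached ETag (nil) is assumed to mean the file has not been downloaded yet.

```elixir
defmodule EtagSketch do
  # Hypothetical decision logic for illustration; not the library's code.
  # cached_etag is nil when nothing is cached for this file.
  def action(cached_etag, remote_etag, opts \\ []) do
    cond do
      # :force bypasses the cache entirely.
      Keyword.get(opts, :force, false) -> :download
      # Nothing cached yet.
      is_nil(cached_etag) -> :download
      # Cached copy matches upstream.
      cached_etag == remote_etag -> :use_cache
      # Upstream changed since the last download.
      true -> :download
    end
  end
end
```

For example, EtagSketch.action("abc", "abc") yields :use_cache, while the same call with force: true yields :download.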

filter_gguf_files(siblings)

@spec filter_gguf_files([map()]) :: [%{filename: String.t(), size: integer()}]

Filter a list of HuggingFace siblings entries to only GGUF files.

Returns maps with :filename and :size.
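A filter of this kind might look like the sketch below. SiblingsSketch is a hypothetical module. The input shape is an assumption based on the Hugging Face API, where each siblings entry carries the file name under the string key "rfilename" and, when requested, a "size" in bytes. Defaulting a missing size to 0 and matching the extension case-insensitively are also assumptions.

```elixir
defmodule SiblingsSketch do
  # Hypothetical filter for illustration; the input keys are assumed.
  def filter_gguf(siblings) do
    # Entries without an "rfilename" key are skipped by the pattern match.
    for %{"rfilename" => name} = entry <- siblings,
        String.ends_with?(String.downcase(name), ".gguf") do
      %{filename: name, size: Map.get(entry, "size", 0)}
    end
  end
end
```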

get_model_info(repo_id, opts \\ [])

@spec get_model_info(
  String.t(),
  keyword()
) :: {:ok, map()} | {:error, String.t()}

Get model repository metadata from HuggingFace Hub API.

Options

:token - HuggingFace API token.

list_gguf_files(repo_id, opts \\ [])

@spec list_gguf_files(
  String.t(),
  keyword()
) :: {:ok, [%{filename: String.t(), size: integer()}]} | {:error, String.t()}

List GGUF files available in a HuggingFace repository.

Returns a list of maps with :filename and :size (bytes).

Options

:token - HuggingFace API token.

Examples

{:ok, files} = LlamaCppEx.Hub.list_gguf_files("Qwen/Qwen3-4B-GGUF")
Enum.each(files, fn f ->
  size_mb = Float.round(f.size / 1_000_000, 1)
  IO.puts("#{f.filename} (#{size_mb} MB)")
end)

search(query, opts \\ [])

@spec search(
  String.t(),
  keyword()
) :: {:ok, [map()]} | {:error, String.t()}

Search HuggingFace Hub for GGUF models.

Returns a list of model info maps with :id, :downloads, :likes, :last_modified, and :tags.

Options

:limit - Maximum results. Defaults to 10.
:sort - Sort by "downloads", "likes", or "lastModified". Defaults to "downloads".
:direction - Sort direction, -1 for descending. Defaults to -1.
:token - HuggingFace API token.

Examples

{:ok, models} = LlamaCppEx.Hub.search("llama gguf q4")
Enum.each(models, fn m -> IO.puts("#{m.id} (#{m.downloads} downloads)") end)