Download GGUF models from HuggingFace Hub.
Requires the optional req dependency. Add it to your mix.exs:
{:req, "~> 0.5"}
Examples
# Search for GGUF models
{:ok, results} = LlamaCppEx.Hub.search("qwen3 gguf", limit: 5)
# List GGUF files in a repository
{:ok, files} = LlamaCppEx.Hub.list_gguf_files("Qwen/Qwen3-4B-GGUF")
# Download a model (cached locally)
{:ok, path} = LlamaCppEx.Hub.download(
  "Qwen/Qwen3-4B-GGUF",
  "qwen3-4b-q4_k_m.gguf"
)
Authentication
For private or gated repositories, set the HF_TOKEN environment variable
or pass the :token option:
LlamaCppEx.Hub.download("org/private-model", "model.gguf", token: "hf_...")
Caching
Downloaded files are cached in ~/.cache/llama_cpp_ex/models/ by default.
Override with the :cache_dir option or LLAMA_CACHE_DIR environment variable.
ETag headers are stored alongside cached files to detect upstream changes.
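The lookup described above can be sketched as follows. This is an illustration only: `CacheDirSketch` is a hypothetical module, not part of LlamaCppEx, and the precedence shown (option first, then environment variable, then default) is an assumption.

```elixir
defmodule CacheDirSketch do
  # Hypothetical helper, not the library's implementation.
  @default "~/.cache/llama_cpp_ex/models"

  # Assumed precedence: :cache_dir option, then LLAMA_CACHE_DIR, then the default.
  def resolve(opts \\ []) do
    dir = Keyword.get(opts, :cache_dir) || System.get_env("LLAMA_CACHE_DIR") || @default
    # Path.expand/1 turns a leading "~" into the user's home directory.
    Path.expand(dir)
  end
end
```

Calling `CacheDirSketch.resolve(cache_dir: "/tmp/models")` returns the given directory regardless of the environment.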
Offline Mode
Set LLAMA_OFFLINE=1 to use only cached files without network access.
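As a rough sketch of what offline mode implies, the hypothetical helper below returns a cached path when `LLAMA_OFFLINE=1` and never calls the network function in that case. `OfflineSketch` and its `download_fun` argument are illustrative assumptions, not the library's API, and the online branch skips the ETag check for brevity.

```elixir
defmodule OfflineSketch do
  # Hypothetical helper, not part of LlamaCppEx.
  def offline?, do: System.get_env("LLAMA_OFFLINE") == "1"

  # download_fun is only invoked when offline mode is disabled.
  def fetch(path, download_fun) do
    cond do
      offline?() and File.exists?(path) -> {:ok, path}
      offline?() -> {:error, "offline mode: #{path} is not cached"}
      true -> download_fun.(path)
    end
  end
end
```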
Summary
Functions
Build authentication headers from options or environment.
Build the download URL for a file in a HuggingFace repository.
Build the local cache path for a model file.
Download a GGUF file from HuggingFace Hub, returning the local path.
Filter a list of HuggingFace siblings entries to only GGUF files.
Get model repository metadata from HuggingFace Hub API.
List GGUF files available in a HuggingFace repository.
Search HuggingFace Hub for GGUF models.
Functions
Build authentication headers from options or environment.
Checks for tokens in order: :token option, HF_TOKEN env var,
HUGGING_FACE_HUB_TOKEN env var (legacy).
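The lookup order above can be sketched like this. `AuthSketch` is a hypothetical stand-in; the return shape (a list of header tuples carrying a Bearer token) is an assumption about how such headers are typically passed to an HTTP client, not the documented return value.

```elixir
defmodule AuthSketch do
  # Hypothetical stand-in for the header builder described above.
  def headers(opts \\ []) do
    # First non-nil value wins: option, HF_TOKEN, then the legacy variable.
    token =
      Keyword.get(opts, :token) ||
        System.get_env("HF_TOKEN") ||
        System.get_env("HUGGING_FACE_HUB_TOKEN")

    case token do
      nil -> []
      token -> [{"authorization", "Bearer " <> token}]
    end
  end
end
```

With no token available from any source, the sketch returns an empty list, so anonymous requests to public repositories still work.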
Build the download URL for a file in a HuggingFace repository.
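For orientation, HuggingFace serves repository files under a `resolve/<revision>/<filename>` path. The sketch below builds such a URL; `UrlSketch.download_url/3` is a hypothetical name and argument order, and it omits URL-encoding of the path segments.

```elixir
defmodule UrlSketch do
  # Hypothetical helper; the library's own function name and arity may differ.
  def download_url(repo_id, filename, revision \\ "main") do
    "https://huggingface.co/#{repo_id}/resolve/#{revision}/#{filename}"
  end
end
```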
Build the local cache path for a model file.
Download a GGUF file from HuggingFace Hub, returning the local path.
Uses ETag-based caching: if the file exists locally and its stored ETag matches the remote one, the cached version is returned without re-downloading.
Options
:cache_dir - Local cache directory. Defaults to ~/.cache/llama_cpp_ex/models/ or the LLAMA_CACHE_DIR environment variable.
:token - HuggingFace API token. Defaults to HF_TOKEN environment variable.
:revision - Git revision (branch, tag, or commit). Defaults to "main".
:force - Force re-download even if cached. Defaults to false.
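The ETag check can be pictured with a minimal sketch. It assumes the ETag is kept in a sidecar file named `<file>.etag` next to the cached model; that layout and the `EtagSketch` module are illustrative assumptions, not the library's actual storage format.

```elixir
defmodule EtagSketch do
  # Hypothetical sidecar layout: "<file>.etag" next to the cached file.

  # True only when the file exists and the stored ETag equals the remote one.
  def fresh?(path, remote_etag) do
    File.exists?(path) and File.read(path <> ".etag") == {:ok, remote_etag}
  end

  # Writes the file and records the ETag it was downloaded with.
  def store(path, contents, etag) do
    File.mkdir_p!(Path.dirname(path))
    File.write!(path, contents)
    File.write!(path <> ".etag", etag)
    path
  end
end
```

A download routine built on this would skip the transfer when `fresh?/2` returns true and `:force` is not set, and call `store/3` otherwise.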
Filter a list of HuggingFace siblings entries to only GGUF files.
Returns maps with :filename and :size.
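A minimal sketch of such a filter is shown below. It assumes each siblings entry is a decoded JSON map with an `"rfilename"` key and an optional `"size"` key; `SiblingsSketch` is a hypothetical module and the real input shape may differ.

```elixir
defmodule SiblingsSketch do
  # Hypothetical filter; entries without "rfilename" are skipped by the pattern.
  def gguf_only(siblings) do
    for %{"rfilename" => name} = entry <- siblings,
        String.ends_with?(String.downcase(name), ".gguf") do
      # size is nil when the entry carries no "size" key.
      %{filename: name, size: Map.get(entry, "size")}
    end
  end
end
```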
Get model repository metadata from HuggingFace Hub API.
Options
:token - HuggingFace API token.
@spec list_gguf_files( String.t(), keyword() ) :: {:ok, [%{filename: String.t(), size: integer()}]} | {:error, String.t()}
List GGUF files available in a HuggingFace repository.
Returns a list of maps with :filename and :size (bytes).
Options
:token - HuggingFace API token.
Examples
{:ok, files} = LlamaCppEx.Hub.list_gguf_files("Qwen/Qwen3-4B-GGUF")
Enum.each(files, fn f ->
  size_mb = Float.round(f.size / 1_000_000, 1)
  IO.puts("#{f.filename} (#{size_mb} MB)")
end)
Search HuggingFace Hub for GGUF models.
Returns a list of model info maps with :id, :downloads, :likes,
:last_modified, and :tags.
Options
:limit - Maximum results. Defaults to 10.
:sort - Sort by "downloads", "likes", or "lastModified". Defaults to "downloads".
:direction - Sort direction, -1 for descending. Defaults to -1.
:token - HuggingFace API token.
Examples
{:ok, models} = LlamaCppEx.Hub.search("llama gguf q4")
Enum.each(models, fn m -> IO.puts("#{m.id} (#{m.downloads} downloads)") end)