How to upgrade the llama.cpp submodule and publish a new release.
Prerequisites
- Elixir 1.18+, Erlang/OTP 27+
- cmake
- A GGUF model file for testing (e.g. Qwen3.5-0.8B)
- An embedding model file for embedding tests (e.g. Qwen3-Embedding-0.6B)
1. Update the submodule
# Fetch latest upstream commits
git -C vendor/llama.cpp fetch origin
# Check what's new since the current pin
git -C vendor/llama.cpp log --oneline HEAD..origin/master
# Checkout the target commit
git -C vendor/llama.cpp checkout <commit-hash>
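The later steps refer to `<old-commit>` and `<new-commit>`, so it helps to record the current pin before checking out the new one. A minimal sketch, assuming a POSIX shell; the helper names `current_pin` and `pin_range` are hypothetical and not part of this repository:

```shell
# Print the commit a checkout currently points at.
# Usage: current_pin <repo-dir>
current_pin() {
  git -C "$1" rev-parse HEAD
}

# Print "<old>..<new>", the range syntax git log and git diff accept.
# Usage: pin_range <old-commit> <repo-dir>
pin_range() {
  printf '%s..%s\n' "$1" "$(current_pin "$2")"
}

# Intended use (run by hand around the checkout above):
#   old=$(current_pin vendor/llama.cpp)          # before the checkout
#   range=$(pin_range "$old" vendor/llama.cpp)   # after the checkout
#   git -C vendor/llama.cpp log --oneline "$range"
```

Saving the old hash in a shell variable avoids having to dig it out of the superproject's history afterwards.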
2. Check API compatibility
Before building, verify the llama.cpp APIs used by the NIF haven't changed:
# Diff the public header between old and new commits
git -C vendor/llama.cpp diff <old-commit>..<new-commit> -- include/llama.h
# Diff common headers used by the NIF
git -C vendor/llama.cpp diff <old-commit>..<new-commit> -- common/chat.h
git -C vendor/llama.cpp diff <old-commit>..<new-commit> -- common/json-schema-to-grammar.h
The NIF uses these key APIs (grep llama_nif.cpp for the full list):
- llama_model_*, llama_context_*, llama_vocab_*: model/context/vocab management
- llama_tokenize, llama_detokenize, llama_token_to_piece: tokenization
- llama_batch_*, llama_decode: inference
- llama_sampler_*: sampling chain
- llama_memory_*: KV cache / memory management
- llama_get_embeddings_*, llama_pooling_type: embeddings
- llama_chat_apply_template: legacy chat templates
- common_chat_templates_init, common_chat_templates_apply: Jinja chat templates
- json_schema_to_grammar: grammar generation
If any signatures changed, update c_src/llama_cpp_ex/llama_nif.cpp and/or llama_nif.h.
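The "grep llama_nif.cpp" hint can be turned into a small helper that lists every llama.cpp symbol the NIF source mentions. This is only a sketch: the function name `list_llama_symbols` and the regular expression are assumptions, and the pattern also picks up local names that happen to share a prefix (such as `llama_nif`).

```shell
# Print the unique llama.cpp-style identifiers mentioned in a source file.
# Usage: list_llama_symbols <source-file>
list_llama_symbols() {
  # -o prints only the matched text, one match per line;
  # sort -u removes duplicates and orders the result.
  grep -oE '(llama|common_chat|json_schema)_[A-Za-z0-9_]+' "$1" | sort -u
}

# Intended use:
#   list_llama_symbols c_src/llama_cpp_ex/llama_nif.cpp
```

Comparing this list against the header diffs narrows the review to the declarations the NIF actually depends on.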
3. Build and test
# Bump version first (so it builds from source instead of downloading precompiled)
# Edit mix.exs @version
# Clean build
mix clean && mix compile
# Run full test suite
LLAMA_MODEL_PATH=~/Downloads/Qwen3.5-0.8B-UD-Q4_K_XL.gguf \
LLAMA_EMBEDDING_MODEL_PATH=~/Downloads/Qwen3-Embedding-0.6B-f16.gguf \
mix test
# Verify formatting and types
mix format --check-formatted
mix dialyzer
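Before running the suite, it can be worth confirming that both model variables point at real files, since a mistyped path is easy to miss. A minimal sketch, assuming the tests read the two variables shown above; the helper name `check_model_env` is hypothetical, and how the suite itself reacts to a missing model is not covered here.

```shell
# Return 0 only if every named environment variable is set and
# points at a readable file; report each problem on stderr.
# Usage: check_model_env VAR_NAME...
check_model_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    eval "file_path=\${$name:-}"
    if [ -z "$file_path" ]; then
      echo "$name is not set" >&2
      missing=1
    elif [ ! -r "$file_path" ]; then
      echo "$name points at an unreadable file: $file_path" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Intended use:
#   check_model_env LLAMA_MODEL_PATH LLAMA_EMBEDDING_MODEL_PATH && mix test
```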
4. Update version and changelog
- mix.exs line 40: bump @version (e.g. "0.6.5" → "0.6.6")
- CHANGELOG.md: add a new ## vX.Y.Z section at the top with:
  - The submodule commit range and count
  - Notable changes categorized by subsystem (follow existing format)
To list commits for the changelog:
git -C vendor/llama.cpp log --oneline <old-commit>..<new-commit>
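The version bump can be double-checked from the shell. The sketch below assumes @version is declared on a single line as `@version "X.Y.Z"` with a plain numeric patch component; both helper names are hypothetical, and pre-release suffixes such as `-rc.1` are not handled.

```shell
# Print the value of the first @version attribute in a mix.exs-style file.
# Usage: current_version <file>
current_version() {
  sed -n 's/^[[:space:]]*@version[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Print the next patch release for a MAJOR.MINOR.PATCH version.
# Usage: next_patch <version>
next_patch() {
  major=${1%%.*}     # text before the first dot
  rest=${1#*.}       # text after the first dot
  minor=${rest%%.*}
  patch=${rest#*.}
  echo "$major.$minor.$((patch + 1))"
}

# Intended use:
#   next_patch "$(current_version mix.exs)"
# The commit count for the changelog can be obtained with git's own
# rev-list --count on the same old..new range used for git log.
```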
5. Commit
git add vendor/llama.cpp mix.exs CHANGELOG.md
git commit -m "Bump llama.cpp to <short-hash>, release vX.Y.Z"
6. Tag and push
git tag vX.Y.Z
git push origin master
git push origin vX.Y.Z
The tag push triggers the precompile workflow (.github/workflows/precompile.yml) which:
- Builds precompiled NIFs for macOS (Metal) and Linux (CPU) across OTP 27 and 28
- Uploads .tar.gz artifacts to the GitHub release
- Runs mix elixir_make.checksum --all --ignore-unavailable and auto-commits checksum.exs to master
7. Publish to Hex
After the CI checksum commit lands:
git pull origin master # get the updated checksum.exs
mix hex.publish
Troubleshooting
Compilation errors after upgrade
- Missing function: check if the API was renamed or removed in include/llama.h
- Struct field changes: check the llama_model_params, llama_context_params, and llama_batch structs
- Common library changes: common/chat.h is the most volatile dependency; check common_chat_templates_inputs and common_chat_msg
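For the "missing function" case, a quick way to see which names disappeared is to test each one against the new header. A minimal sketch; the helper name `missing_symbols` is hypothetical, and a whole-word text match does not prove the signature is unchanged, only that the name still appears.

```shell
# Print each symbol that does not appear as a whole word in the header.
# Usage: missing_symbols <header-file> SYMBOL...
missing_symbols() {
  header=$1
  shift
  for sym in "$@"; do
    # -q: no output, exit status only; -w: match whole words.
    grep -qw -- "$sym" "$header" || echo "$sym"
  done
}

# Intended use:
#   missing_symbols vendor/llama.cpp/include/llama.h llama_decode llama_tokenize
```

An empty result means every listed name is still declared somewhere in the header; any name printed is a candidate for a rename or removal upstream.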
Build downloads precompiled binary instead of compiling from source
The precompiler fetches binaries by version. If you haven't bumped the version, it'll find and use the old binary. Bump @version in mix.exs before running mix compile.
CI precompile fails
Check .github/workflows/precompile.yml for the build matrix. Common issues:
- New llama.cpp dependencies not available in CI runners
- CMake flag changes requiring updates to the Makefile