All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

0.3.0 - 2026-02-06

Changed

  • Upgraded SnakeBridge from 0.15.1 to 0.16.0 (Snakepit 0.12.0 to 0.13.0)
  • Regenerated all vLLM wrappers with SnakeBridge v0.16.0

0.2.1 - 2026-01-25

Changed

  • Updated to SnakeBridge 0.15.1
  • Regenerated docs with improved enum/member summaries and doctest rendering

0.2.0 - 2026-01-25

Added

  • vLLM public API surface generation via SnakeBridge (module_mode: :docs, driven by the committed priv/snakebridge/vllm.docs.json manifest; see the config sketch after this list)
  • Class method guardrail (max_class_methods) to prevent inheritance-heavy internal classes from producing extremely large wrappers
  • Coverage reports in .snakebridge/coverage/ for tracking API binding completeness
  • Credo configuration (.credo.exs) to exclude generated wrappers from static analysis
  • Python test script (basic_test.py) for direct vLLM validation
  • vLLM v1 multiprocessing configuration option (:auto, :on, :off)
  • Auto-set VLLM_WORKER_MULTIPROC_METHOD=spawn when v1 multiprocessing is enabled (avoids forking the multi-threaded gRPC server)
  • Documentation for the mix snakebridge.regen task and its --clean option
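
A minimal sketch of how these options could fit together in config/config.exs. The key names (module_mode, docs_manifest, max_class_methods, v1_multiprocessing) and the application names are assumptions pieced together from the entries above, not confirmed SnakeBridge or project configuration:

    # config/config.exs (hypothetical sketch; key and app names are
    # assumptions drawn from the changelog entries, not confirmed options)
    import Config

    config :snakebridge,
      # Generate the public API surface from the committed docs manifest.
      module_mode: :docs,
      docs_manifest: "priv/snakebridge/vllm.docs.json",
      # Guardrail: cap methods per class so inheritance-heavy internal
      # classes do not produce extremely large wrappers.
      max_class_methods: 100

    # One of :auto, :on, :off. When v1 multiprocessing is enabled, the
    # library auto-sets VLLM_WORKER_MULTIPROC_METHOD=spawn so the
    # multi-threaded gRPC server is never forked.
    config :vllm, v1_multiprocessing: :auto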

Changed

  • Updated to SnakeBridge 0.15.0
  • Enhanced configuration guide with vLLM v1 multiprocessing documentation
  • Updated direct_api.exs to be wrapper-only while still demonstrating runtime attribute access for Python refs
  • Enhanced run_all.sh to compile once and run examples with --no-compile (avoiding repeated codegen), and improved its process cleanup and interrupt handling

Fixed

  • Dialyzer no longer hangs due to an excessively large generated wrapper surface (docs-manifest generation and the class method guardrail keep modules small)
  • Examples now create wrapper refs in the same runtime session as the LLM and pass __runtime__ opts, preventing cross-session reference errors (see the sketch after this list)
  • Example docs and scripts now derive runtime_opts from the LLM ref consistently for generation, chat, and embedding calls
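
A hedged sketch of the session pattern referenced above, assuming hypothetical Vllm.LLM and Vllm.SamplingParams wrapper modules; the function names, arities, and the shape of the runtime value are illustrative, and only the "create refs in one session and pass __runtime__ opts" idea comes from the entries above:

    # Hypothetical sketch: module and function names are assumptions, not
    # the project's confirmed API; only the session-sharing pattern is
    # taken from the changelog entries above.
    {:ok, llm} = Vllm.LLM.create(model: "facebook/opt-125m")

    # Derive runtime opts from the LLM ref so every later wrapper call
    # (generation, chat, embeddings) targets the same runtime session.
    runtime_opts = [__runtime__: llm.runtime]

    # Create other wrapper refs in that session and pass the opts through.
    {:ok, params} = Vllm.SamplingParams.create([max_tokens: 64], runtime_opts)
    {:ok, outputs} = Vllm.LLM.generate(llm, ["Hello"], params, runtime_opts)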

0.1.1 - 2026-01-10

Changed

  • Embedding models now use runner: "pooling" instead of task: "embed" (see the sketch after this list)
  • The embed/3 function now calls vLLM's embed method instead of encode
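
A hedged sketch of the embedding change, assuming a hypothetical Vllm.LLM wrapper module; the create and embed/3 names and the example model are illustrative, and only the runner: "pooling" option and the switch from encode to embed come from the entries above:

    # Hypothetical sketch: names are assumptions based on the entries above.
    # Embedding models are loaded with runner: "pooling" (previously
    # task: "embed"), and embed/3 delegates to vLLM's embed method.
    {:ok, llm} = Vllm.LLM.create(model: "intfloat/e5-small-v2", runner: "pooling")
    {:ok, embeddings} = Vllm.LLM.embed(llm, ["hello world"], [])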

Improved

  • Rewrote examples to be runnable with real GPU inference
  • Added CLI flag support for examples (--model, --prompt, --adapter, etc.)
  • LoRA example auto-downloads a default adapter on first run
  • Updated documentation to reflect embedding API changes

Fixed

  • Added generated files to .gitignore (examples/assets/, registry.json)

0.1.0 - 2026-01-08

  • Initial release