Changelog
All notable changes to this project will be documented in this file.
[0.9.0] - 2026-01-04
Added
Evaluation Framework: Production-grade testing and benchmarking for AI agents
- Nous.Eval module for defining and running test suites
- Nous.Eval.Suite for test suite management with YAML support
- Nous.Eval.TestCase for individual test case definitions
- Nous.Eval.Runner for sequential and parallel test execution
- Nous.Eval.Metrics for collecting latency, token usage, and cost metrics
- Nous.Eval.Reporter for console and JSON result reporting
- A/B testing support with Nous.Eval.run_ab/2
Six Built-in Evaluators:
- :exact_match - Strict string equality matching
- :fuzzy_match - Jaro-Winkler similarity with configurable thresholds
- :contains - Substring and regex pattern matching
- :tool_usage - Tool call verification with argument validation
- :schema - Ecto schema validation for structured outputs
- :llm_judge - LLM-based quality assessment with custom rubrics
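To make the three string-based evaluators more concrete, here is a minimal sketch of what such checks could look like. It is not the library's implementation: the module name EvaluatorSketch and its function signatures are hypothetical, and the fuzzy check uses Elixir's built-in String.jaro_distance/2 (plain Jaro, not Jaro-Winkler) as a stand-in.

```elixir
defmodule EvaluatorSketch do
  @moduledoc "Hypothetical stand-ins for :exact_match, :fuzzy_match and :contains."

  # Strict string equality.
  def exact_match(actual, expected), do: actual == expected

  # Similarity check with a configurable threshold.
  # Assumption: plain Jaro distance from the standard library, not Jaro-Winkler.
  def fuzzy_match(actual, expected, threshold \\ 0.8) do
    String.jaro_distance(actual, expected) >= threshold
  end

  # Substring or regex pattern matching, depending on the pattern type.
  def contains(actual, %Regex{} = pattern), do: Regex.match?(pattern, actual)

  def contains(actual, substring) when is_binary(substring) do
    String.contains?(actual, substring)
  end
end
```

The threshold default of 0.8 is an arbitrary choice for the sketch; a real suite would tune it per test case.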
Optimization Engine: Automated parameter tuning for agents
- Nous.Eval.Optimizer with three strategies: grid search, random search, Bayesian optimization
- Support for float, integer, choice, and boolean parameter types
- Early stopping on threshold achievement
- Detailed trial history and best configuration reporting
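As an illustration of the grid search strategy with early stopping, trial history, and best-configuration reporting, the following sketch enumerates every parameter combination and halts once a score reaches the threshold. GridSearchSketch, its run/3 signature, and the toy score function are all hypothetical and do not reflect the Nous.Eval.Optimizer API.

```elixir
defmodule GridSearchSketch do
  @moduledoc "Hypothetical grid search with early stopping; not the Nous.Eval.Optimizer API."

  # `space` is a keyword list of parameter name => candidate values.
  # `score_fun` maps a configuration (a map) to a numeric score; higher is better.
  def run(space, score_fun, threshold) do
    space
    |> combinations()
    |> Enum.reduce_while(%{best: nil, trials: []}, fn config, acc ->
      score = score_fun.(config)
      trial = %{config: config, score: score}
      best = if acc.best == nil or score > acc.best.score, do: trial, else: acc.best
      acc = %{best: best, trials: [trial | acc.trials]}

      # Early stopping: halt as soon as a trial reaches the threshold.
      if score >= threshold, do: {:halt, acc}, else: {:cont, acc}
    end)
    |> Map.update!(:trials, &Enum.reverse/1)
  end

  # Cartesian product of all candidate values, one map per configuration.
  defp combinations([]), do: [%{}]

  defp combinations([{name, values} | rest]) do
    for value <- values, config <- combinations(rest), do: Map.put(config, name, value)
  end
end

# A float parameter and a boolean parameter, scored by a toy function
# that stands in for a real evaluation run.
space = [temperature: [0.2, 0.7], use_tools: [false, true]]
score = fn config -> if config.use_tools, do: 1.0 - config.temperature, else: 0.0 end

result = GridSearchSketch.run(space, score, 0.75)
IO.inspect(result.best.config, label: "best configuration")
IO.inspect(length(result.trials), label: "trials run")
```

Random search would replace the exhaustive enumeration with sampled configurations; the reduce-and-halt loop stays the same.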
New Mix Tasks:
- mix nous.eval - Run evaluation suites with filtering, parallelism, and multiple output formats
- mix nous.optimize - Parameter optimization with configurable strategies and metrics
New Dependency:
- yaml_elixir ~> 2.9 for YAML test suite parsing
Documentation
- New comprehensive evaluation framework guide (docs/guides/evaluation.md)
- Five new example scripts in examples/eval/:
  - 01_basic_evaluation.exs - Simple test execution
  - 02_yaml_suite.exs - Loading and running YAML suites
  - 03_optimization.exs - Parameter optimization workflows
  - 04_custom_evaluator.exs - Implementing custom evaluators
  - 05_ab_testing.exs - A/B testing configurations
[0.8.1] - 2025-12-31
Fixed
- Fixed Usage struct not implementing Access behaviour for telemetry metrics
- Fixed Task.shutdown/2 nil return case in AgentServer cancellation
- Fixed tool call field access for OpenAI-compatible APIs (string vs atom keys)
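Two of these fixes concern general Elixir pitfalls that are easy to reproduce. The sketch below shows one way to handle every return value of Task.shutdown/2, including nil, and one way to read a field from a map that may use atom or string keys. The module FixSketch and its function names are hypothetical; this is not the code in AgentServer.

```elixir
defmodule FixSketch do
  # Task.shutdown/2 returns {:ok, reply}, {:exit, reason}, or nil.
  # nil means no reply arrived before the task was shut down.
  def cancel(%Task{} = task) do
    case Task.shutdown(task, :brutal_kill) do
      {:ok, reply} -> {:finished, reply}
      {:exit, reason} -> {:crashed, reason}
      nil -> :cancelled
    end
  end

  # Reads a field from a map keyed by atoms or by strings.
  # The atom key is tried first, then its string form.
  def fetch_field(map, key) when is_map(map) and is_atom(key) do
    case map do
      %{^key => value} -> value
      _ -> Map.get(map, Atom.to_string(key))
    end
  end
end

# A task that never replies is reported as cancelled.
task = Task.async(fn -> Process.sleep(:infinity) end)
cancel_result = FixSketch.cancel(task)
```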
Added
- Vision/multimodal test suite with image fixtures (test/nous/vision_test.exs)
- ContentPart test suite for image conversion utilities (test/nous/content_part_test.exs)
- Multimodal message examples in conversation demo (examples/04_conversation.exs)
Changed
- Updated docs to link examples to GitHub source files
- Improved sidebar grouping in hexdocs
[0.8.0] - 2025-12-31
Added
- Context Management: New Nous.Agent.Context struct for immutable conversation state, message history, and dependency injection. Supports context continuation between runs:

      {:ok, result1} = Nous.run(agent, "My name is Alice")
      {:ok, result2} = Nous.run(agent, "What's my name?", context: result1.context)

- Agent Behaviour: New Nous.Agent.Behaviour for implementing custom agents with lifecycle callbacks (init_context/2, build_messages/2, process_response/3, extract_output/2).

- Dual Callback System: New Nous.Agent.Callbacks supporting both map-based callbacks and process messages:

      # Map callbacks
      Nous.run(agent, "Hello", callbacks: %{
        on_llm_new_delta: fn _event, delta -> IO.write(delta) end
      })

      # Process messages (for LiveView)
      Nous.run(agent, "Hello", notify_pid: self())

- Module-Based Tools: New Nous.Tool.Behaviour for defining tools as modules with metadata/0 and execute/2 callbacks. Use Nous.Tool.from_module/2 to create tools from modules.

- Tool Context Updates: New Nous.Tool.ContextUpdate struct allowing tools to modify context state:

      def my_tool(ctx, args) do
        {:ok, result, ContextUpdate.new() |> ContextUpdate.set(:key, value)}
      end

- Tool Testing Helpers: New Nous.Tool.Testing module with mock_tool/2, spy_tool/1, and test_context/1 for testing tool interactions.

- Tool Validation: New Nous.Tool.Validator for JSON Schema validation of tool arguments.

- Prompt Templates: New Nous.PromptTemplate for EEx-based prompt templates with variable substitution.

- Built-in Agent Implementations: Nous.Agents.BasicAgent (default) and Nous.Agents.ReActAgent (reasoning with planning tools).

- Structured Errors: New Nous.Errors module with MaxIterationsReached, ToolExecutionError, and ExecutionCancelled error types.

- Enhanced Telemetry: New events for iterations (:iteration), tool timeouts (:tool_timeout), and context updates (:context_update).
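For readers unfamiliar with EEx, the mechanism behind the prompt templates entry can be sketched with the standard library alone. This example calls EEx.eval_string/3 directly and does not use Nous.PromptTemplate, whose own API is not shown in this changelog; the template text and assign names are made up for illustration.

```elixir
# EEx ships with Elixir; inside a Mix project it may need to be listed
# under :extra_applications.
template = "You are a <%= @role %>. Answer in <%= @language %>."

# Variables are substituted from the :assigns keyword list.
prompt = EEx.eval_string(template, assigns: [role: "helpful assistant", language: "English"])

IO.puts(prompt)
```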
Changed
- Result Structure: Nous.run/3 now returns %{output: _, context: _, usage: _} instead of just the output string.
- Tool Function Signature: Tools now receive (ctx, args) instead of (args). The context provides access to ctx.deps for dependency injection.
- Examples Modernized: Reduced from ~95 files to 21 files. Flattened directory structure from 4 levels to 2 levels. All examples updated to v0.8.0 API.
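The tool signature change can be pictured roughly as follows. This is a sketch under several assumptions: the context is shown as a plain map with a deps key instead of the real context struct, the arguments are assumed to be a string-keyed map, and the {:ok, value} return shape, the module ToolSignatureSketch, and the lookup_user functions are hypothetical.

```elixir
defmodule ToolSignatureSketch do
  # Before 0.8.0: a tool received only its arguments.
  def lookup_user_old(%{"id" => id}), do: {:ok, "user-#{id}"}

  # From 0.8.0: a tool receives (ctx, args) and reads dependencies from ctx.deps.
  # Assumption: ctx is a plain map here; the library passes a context struct.
  def lookup_user(ctx, %{"id" => id}) do
    Map.fetch(ctx.deps.users, id)
  end
end

# The injected dependency is a simple in-memory user table.
ctx = %{deps: %{users: %{1 => "Alice"}}}
found = ToolSignatureSketch.lookup_user(ctx, %{"id" => 1})
missing = ToolSignatureSketch.lookup_user(ctx, %{"id" => 2})
```

Passing dependencies through the context keeps tools free of global state, which also makes them easier to exercise in tests.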
Removed
- Removed deprecated provider modules: Nous.Providers.Gemini, Nous.Providers.Mistral, Nous.Providers.VLLM, Nous.Providers.SGLang.
- Removed built-in tools: Nous.Tools.BraveSearch, Nous.Tools.DateTimeTools, Nous.Tools.StringTools, Nous.Tools.TodoTools. These can be implemented as custom tools.
- Removed Nous.RunContext (replaced by Nous.Agent.Context).
- Removed Nous.PromEx.Plugin (users can implement custom Prometheus metrics using telemetry events).
[0.7.2] - 2025-12-29
Fixed
- Stream completion events: The [DONE] SSE event now properly emits a {:finish, "stop"} event instead of being silently discarded. This ensures stream consumers always receive a completion signal.
- Documentation links: Fixed broken links in hexdocs documentation. Relative links to .exs example files now use absolute GitHub URLs so they work correctly on hexdocs.pm.
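The stream completion fix can be illustrated with a small line parser. In this sketch the [DONE] sentinel is mapped to {:finish, "stop"} as described above; the module SSESketch and the {:data, payload} shape used for ordinary chunks are hypothetical and leave the payload undecoded.

```elixir
defmodule SSESketch do
  # Maps one SSE line to a stream event.
  # "[DONE]" becomes an explicit completion event instead of being dropped.
  def parse_line("data: [DONE]"), do: {:finish, "stop"}
  def parse_line("data: " <> payload), do: {:data, payload}
  def parse_line(_other), do: :ignore

  # Splits a response body into lines and keeps only recognised events.
  def parse(body) do
    body
    |> String.split("\n", trim: true)
    |> Enum.map(&parse_line/1)
    |> Enum.reject(&(&1 == :ignore))
  end
end

events = SSESketch.parse("data: {\"delta\":\"Hi\"}\n\ndata: [DONE]\n\n")
IO.inspect(events)
```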
[0.7.1] - 2025-12-29
Changed
- Make all provider dependencies optional: openai_ex, anthropix, and gemini_ex are now truly optional dependencies. Users only need to install the dependencies for the providers they use.
- Runtime dependency checks: Provider modules now check for dependency availability at runtime instead of compile-time, allowing the library to compile without any provider-specific dependencies.
- OpenAI message format: Messages are now returned as plain maps with string keys (%{"role" => "user", "content" => "Hi"}) instead of OpenaiEx.ChatMessage structs. This removes the compile-time dependency on openai_ex for message formatting.
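A runtime dependency check of this kind is commonly written with Code.ensure_loaded?/1. The following is a minimal sketch of the pattern, not the library's code: the module OptionalDepSketch, the ensure_available/2 function, and the error message are hypothetical.

```elixir
defmodule OptionalDepSketch do
  # Checks at runtime whether an optional dependency's module can be loaded,
  # so the calling code compiles even when the package is absent.
  def ensure_available(module, package) do
    if Code.ensure_loaded?(module) do
      :ok
    else
      {:error, "#{package} dependency not available; add it to your mix.exs deps"}
    end
  end
end

# A provider would run this check before calling into the client library,
# for example with OpenaiEx and "openai_ex".
IO.inspect(OptionalDepSketch.ensure_available(Enum, "elixir"))
IO.inspect(OptionalDepSketch.ensure_available(Some.Missing.Module, "some_missing_package"))
```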
Fixed
- Fixed "anthropix dependency not available" errors that occurred when using the library in applications without anthropix installed.
- Fixed compile-time errors that occurred when openai_ex was not present in the consuming application.
[0.7.0] - 2025-12-27
Initial public release with multi-provider LLM support:
- OpenAI-compatible providers (OpenAI, Groq, OpenRouter, Ollama, LM Studio, vLLM)
- Native Anthropic Claude support with extended thinking
- Google Gemini support
- Mistral AI support
- Tool/function calling
- Streaming support
- ReAct agent implementation