Coverage Testing Guide
Note: This guide is largely superseded by the Fixture Testing Guide, which covers the modern mix req_llm.model_compat task and comprehensive testing system. This document remains for reference on legacy testing patterns.
This guide covers testing and verification workflows for ReqLLM, focusing on live API coverage tests with fixture support for local testing without API calls.
Overview
ReqLLM's testing system is built around two core principles:
- Provider coverage testing - Tests verify that provider implementations work correctly across different features
- Fixture-based testing - Tests can run against live APIs or cached fixtures for fast local development
Testing Modes
Fixture Mode (Default)
By default, tests use cached fixtures for fast, reliable testing:
mix test                    # Uses fixtures
mix test --only openai      # Test specific provider with fixtures
Live Mode
Set REQ_LLM_FIXTURES_MODE=record to test against real APIs and capture new fixtures:
REQ_LLM_FIXTURES_MODE=record mix test                    # Run all tests live
REQ_LLM_FIXTURES_MODE=record mix test --only openai      # Test specific provider live
REQ_LLM_FIXTURES_MODE=record mix test --only coverage    # Run coverage tests live
Live mode will:
- Make real API calls to providers
- Capture responses as JSON fixtures
- Overwrite existing fixtures with new responses
- Require valid API keys for each provider
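Before recording, it can help to confirm that each provider's key resolves. A minimal sketch using the ReqLLM.Keys.get/1 call shown later in this guide; only :anthropic_api_key appears in this guide, and :openai_api_key is an assumed name following the same convention:
# Sketch: confirm provider keys resolve before running live tests.
# :anthropic_api_key is taken from the Environment Management example below;
# :openai_api_key is assumed to follow the same naming convention.
for key <- [:anthropic_api_key, :openai_api_key] do
  case ReqLLM.Keys.get(key) do
    {:ok, _value} -> :ok
    {:error, reason} -> IO.warn("#{key} not configured: #{inspect(reason)}")
  end
end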
Quality & CI
CI runs mix quality alias before tests. Locally:
mix quality    # or mix q - runs format, compile --warnings-as-errors, dialyzer, credo
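A sketch of how such an alias is typically defined in mix.exs; ReqLLM's actual alias definition may differ:
# Hypothetical aliases/0 in mix.exs; the project's real definition may differ.
defp aliases do
  [
    quality: ["format", "compile --warnings-as-errors", "dialyzer", "credo"],
    q: ["quality"]
  ]
end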
Test Organization
Directory Structure
test/
├── coverage/                 # Provider capability coverage tests
│   ├── anthropic/
│   │   ├── comprehensive_test.exs   # All capabilities
│   │   └── fixtures/                # Cached API responses
│   └── openai/
│       ├── comprehensive_test.exs
│       └── fixtures/
├── support/
│   ├── live_fixture.ex       # Test fixture system
│   └── provider_test/        # Shared test macros
├── req_llm/
└── req_llm_test.exs         # Core library tests
Test Tags
Tests use ExUnit tags for organization:
@moduletag :coverage           # Coverage test
@moduletag provider: "anthropic"  # Provider-specific (string)
@tag scenario: :basic          # Scenario-specific (atom)
@tag scenario: :streaming      # Feature-specific
@tag scenario: :tool_multi     # Capability-specific
Run specific test groups:
mix test --only coverage
mix test --only openai
mix test --only streaming
Writing Capability Tests
Using Provider Test Macros
ReqLLM uses shared test macros to eliminate duplication while maintaining clear per-provider organization:
defmodule ReqLLM.Coverage.MyProvider.CoreTest do
  use ReqLLM.ProviderTest.Core,
    provider: :my_provider,
    model: "my_provider:my-model"
  # Provider-specific tests can be added here
end
Available macros:
- ReqLLM.ProviderTest.Comprehensive - All capabilities (basic, streaming, tools, objects, reasoning)
- ReqLLM.ProviderTest.Embedding - Embedding generation
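For reference, a sketch of wiring these macros up, assuming both accept the same provider: and model: options as the Core example above (the option names and model strings are assumptions, not confirmed):
defmodule ReqLLM.Coverage.MyProvider.ComprehensiveTest do
  # Assumed to take the same options as ReqLLM.ProviderTest.Core shown above
  use ReqLLM.ProviderTest.Comprehensive,
    provider: :my_provider,
    model: "my_provider:my-model"
end

defmodule ReqLLM.Coverage.MyProvider.EmbeddingTest do
  # Placeholder model name; point this at an embedding-capable model
  use ReqLLM.ProviderTest.Embedding,
    provider: :my_provider,
    model: "my_provider:my-embedding-model"
end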
Capability-Driven Tests
Verify capabilities match metadata before testing:
test "temperature parameter works as advertised" do
  # Check if model advertises temperature support
  supports_temp = ReqLLM.Capability.supports?(@model, :temperature)
  
  if supports_temp do
    result = use_fixture(:my_provider, "temperature_test", fn ->
      ctx = ReqLLM.Context.new([ReqLLM.Context.user("Be creative")])
      ReqLLM.generate_text(@model, ctx, temperature: 1.0, max_tokens: 50)
    end)
    
    {:ok, resp} = result
    assert resp.id != nil
  else
    skip("Model does not advertise temperature support")
  end
end
Testing Tool Calling
Comprehensive tool calling tests:
describe "tool calling capabilities" do
  @weather_tool %{
    name: "get_weather",
    description: "Get weather for a location",
    parameter_schema: %{
      type: "object",
      properties: %{
        location: %{type: "string", description: "City name"}
      },
      required: ["location"]
    }
  }
  test "basic tool calling", fixture: "tool_calling_basic" do
    ctx = ReqLLM.Context.new([
      ReqLLM.Context.user("What's the weather in Paris?")
    ])
    
    {:ok, resp} = ReqLLM.generate_text(@model, ctx, 
      tools: [@weather_tool],
      max_tokens: 200
    )
    
    assert resp.id != nil
  end
  
  test "tool choice control" do
    if ReqLLM.Capability.supports?(@model, :tool_choice) do
      result = use_fixture(:my_provider, "tool_choice_specific", fn ->
        ctx = ReqLLM.Context.new([
          ReqLLM.Context.user("Tell me about weather")
        ])
        
        ReqLLM.generate_text(@model, ctx, 
          tools: [@weather_tool],
          tool_choice: %{type: "tool", name: "get_weather"}
        )
      end)
      
      {:ok, resp} = result
      assert resp.id != nil
    else
      skip("Model does not support tool choice control")
    end
  end
  test "tool result handling" do
    result = use_fixture(:my_provider, "tool_with_result", fn ->
      ctx = ReqLLM.Context.new([
        ReqLLM.Context.user("What's the weather like?"),
        ReqLLM.Context.assistant("", tool_calls: [
          %{id: "call_1", name: "get_weather", arguments: %{"location" => "Paris"}}
        ]),
        ReqLLM.Context.tool_result("call_1", %{"weather" => "sunny", "temp" => 22})
      ])
      
      ReqLLM.generate_text(@model, ctx, tools: [@weather_tool])
    end)
    
    {:ok, resp} = result
    assert resp.id != nil
  end
end
Testing Streaming
Test streaming by verifying that the streamed response resolves to text:
test "streaming text generation", fixture: "streaming_test" do
  if ReqLLM.Capability.supports?(@model, :streaming) do
    ctx = ReqLLM.Context.new([ReqLLM.Context.user("Tell me a story")])
    
    {:ok, resp} = ReqLLM.stream_text(@model, ctx, max_tokens: 100)
    
    assert resp.id != nil
    text = ReqLLM.Response.text(resp)
    assert is_binary(text)
  else
    skip("Model does not support streaming")
  end
end
Testing Multimodal Capabilities
Test image and other modality support:
test "image input processing" do
  modalities = ReqLLM.Capability.modalities(@model)
  input_modalities = get_in(modalities, [:input]) || []
  
  if "image" in input_modalities do
    result = use_fixture(:my_provider, "image_input", fn ->
      # Base64 encoded test image
      image_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNkYPhfDwAChwGA60e6kgAAAABJRU5ErkJggg=="
      
      ctx = ReqLLM.Context.new([
        ReqLLM.Context.user([
          %{type: "text", text: "What do you see in this image?"},
          %{type: "image", source: %{
            type: "base64",
            media_type: "image/png", 
            data: image_data
          }}
        ])
      ])
      
      ReqLLM.generate_text(@model, ctx, max_tokens: 100)
    end)
    
    {:ok, resp} = result
    assert resp.id != nil
  else
    skip("Model does not support image input")
  end
end
Fixture Management
Fixture Format
Fixtures are stored as JSON with metadata:
{
  "captured_at": "2025-01-15T10:30:00Z",
  "model_spec": "openai:gpt-4o",
  "scenario": "basic",
  "result": {
    "ok": true,
    "response": {
      "id": "resp_123",
      "model": "gpt-4o",
      "message": {
        "role": "assistant",
        "content": [{"type": "text", "text": "Hello there!"}]
      },
      "usage": {"input_tokens": 5, "output_tokens": 3}
    }
  }
}
Fixture Organization
Organize fixtures by provider and test name:
test/support/fixtures/
├── anthropic/
│   ├── basic_completion.json
│   ├── system_prompt_completion.json
│   ├── temperature_test.json
│   ├── streaming_test.json
│   ├── tool_calling_basic.json
│   ├── tool_choice_specific.json
│   └── tool_with_result.json
└── openai/
    ├── basic_completion.json
    └── tool_calling_basic.json
LiveFixture API Changes (1.0.0-rc.1)
The LiveFixture API now requires the provider as the first argument:
# Current API (1.0.0-rc.1)
use_fixture(:provider_atom, "fixture_name", fn ->
  # test code
end)
# Old API (deprecated)
use_fixture("fixture_name", [], fn ->
  # test code  
end)
Fixture Best Practices
- Descriptive naming - Use clear fixture names that indicate what they test
- Minimal responses - Use max_tokens to keep fixtures small
- Deterministic content - Use low temperature for reproducible responses
- Regular updates - Refresh fixtures when APIs change
# Good fixture usage
use_fixture(:openai, "low_temperature", fn ->
  ReqLLM.generate_text(@model, ctx, 
    temperature: 0.1,  # Deterministic
    max_tokens: 20     # Minimal
  )
end)
Provider Verification Workflows
Adding a New Provider
- Create provider module with DSL
- Add metadata file in priv/models_dev/
- Create coverage tests using provider macros
- Run live tests to capture fixtures
- Validate capabilities match implementation
# Create provider tests using macros
# test/coverage/my_provider/core_test.exs
# test/coverage/my_provider/streaming_test.exs
# test/coverage/my_provider/tool_calling_test.exs
# Run live tests to capture fixtures
REQ_LLM_FIXTURES_MODE=record mix test --only coverage --only my_provider
# Quality check
mix quality
Ongoing Verification
Regular verification workflows:
# Daily: Validate all providers with fixtures
mix test --only coverage
# Weekly: Refresh critical fixtures
REQ_LLM_FIXTURES_MODE=record mix test test/coverage/*/comprehensive_test.exs
# Release: Full live test suite
REQ_LLM_FIXTURES_MODE=record mix test --only coverage
# API Changes: Update specific provider
REQ_LLM_FIXTURES_MODE=record mix test --only "provider:anthropic" --only coverage
Best Practices
Test Organization
- Use provider macros - Leverage shared test patterns for consistency
- Group by capability - Organize tests around features, not just providers
- Use descriptive names - Test names should explain what capability is tested
- Tag appropriately - Use ExUnit tags for selective test execution
Fixture Management
- Keep fixtures small - Use minimal token limits to reduce file size
- Use deterministic settings - Low temperature for consistent responses
- Version control fixtures - Commit fixtures to track API changes over time
- Update regularly - Refresh fixtures when provider APIs change
Error Handling
Test error conditions with proper fixture handling:
test "handles invalid model gracefully" do
  result = use_fixture(:anthropic, "invalid_model_error", fn ->
    ReqLLM.generate_text("anthropic:invalid-model", "Hello")
  end)
  
  {:error, error} = result
  assert %ReqLLM.Error.API{} = error
end
Environment Management
Handle API keys and environment variables properly:
# Skip tests if API key not available  
# Keys are automatically loaded from .env via dotenvy at startup
setup do
  case ReqLLM.Keys.get(:anthropic_api_key) do
    {:ok, _key} -> :ok
    {:error, _reason} -> skip("ANTHROPIC_API_KEY not configured in .env")
  end
end
This coverage testing approach ensures that ReqLLM providers work correctly across all supported features and helps maintain compatibility as APIs evolve.