ADR-0016: Test strategy for external dependencies

Status

Proposed

Context

Conjure has several components that depend on external systems:

Component         Dependency       Current Testing
Docker Executor   Docker daemon    Minimal unit tests
Telemetry         Event handlers   No handler tests
File operations   Filesystem       Fixture-based
API integration   Claude API       Not tested

The current test suite (53 tests) covers core functionality but has gaps:

  1. Docker tests - Skipped when Docker is unavailable; coverage is limited
  2. Telemetry tests - Events are emitted, but handlers are not verified
  3. Integration tests - No end-to-end conversation flow tests
  4. Error path tests - Happy paths are tested; error conditions much less so

The challenge: how to test components with external dependencies without:

  • Requiring Docker in CI
  • Making tests slow or flaky
  • Reducing confidence in the code

Decision

We will implement a layered testing strategy:

Layer 1: Unit Tests (No External Dependencies)

All modules should have unit tests that mock external dependencies:

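defmodule Conjure.Executor.DockerTest do
  use ExUnit.Case

  import Mox

  # ExecutionContext is assumed to live at Conjure.ExecutionContext
  alias Conjure.ExecutionContext
  alias Conjure.Executor.Docker

  # SystemMock replaces System.cmd/3; fail any test whose expectations go unmet
  setup :verify_on_exit!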

  describe "bash/2" do
    test "formats docker exec command correctly" do
      expect(SystemMock, :cmd, fn "docker", args, _opts ->
        assert ["exec", "container123", "bash", "-c", "echo hello"] = args
        {"hello\n", 0}
      end)

      context = %ExecutionContext{container_id: "container123"}
      assert {:ok, "hello\n"} = Docker.bash("echo hello", context)
    end

    test "returns error on non-zero exit" do
      expect(SystemMock, :cmd, fn _, _, _ -> {"error", 1} end)

      context = %ExecutionContext{container_id: "container123"}
      assert {:error, {:exit_code, 1, "error"}} = Docker.bash("false", context)
    end
  end
end
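The two tests above pin down a small contract: the argument list handed to `docker`, and the mapping from exit status to `{:ok, output}` or `{:error, {:exit_code, code, output}}`. The following is a minimal sketch of code that would satisfy that contract. The module name `DockerBashSketch` and the injected `runner` argument are illustrative assumptions, not Conjure's actual implementation; a plain function is injected here only to keep the example free of dependencies.

```elixir
defmodule DockerBashSketch do
  @moduledoc false

  # `runner` is any 3-arity function shaped like System.cmd/3 (an assumption
  # made for this sketch so the command runner can be swapped in tests).
  def bash(command, %{container_id: container_id}, runner \\ &System.cmd/3) do
    args = ["exec", container_id, "bash", "-c", command]

    case runner.("docker", args, [stderr_to_stdout: true]) do
      {output, 0} -> {:ok, output}
      {output, code} -> {:error, {:exit_code, code, output}}
    end
  end
end

# Usage with a fake runner; no Docker daemon is involved.
fake_ok = fn "docker", ["exec", "container123", "bash", "-c", "echo hello"], _opts ->
  {"hello\n", 0}
end

{:ok, "hello\n"} =
  DockerBashSketch.bash("echo hello", %{container_id: "container123"}, fake_ok)
```

The pattern match in `fake_ok` plays the same role as the `assert [...] = args` line in the Mox expectation: a wrongly formatted command fails the test immediately.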

Mox Setup:

# test/support/mocks.ex
Mox.defmock(SystemMock, for: Conjure.SystemBehaviour)

# lib/conjure/system_behaviour.ex
defmodule Conjure.SystemBehaviour do
  @callback cmd(String.t(), [String.t()], keyword()) :: {String.t(), non_neg_integer()}
end
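For `SystemMock` to take effect, the code under test has to reach `cmd/3` through a module that is resolved at runtime rather than calling `System.cmd/3` directly. One possible wiring is sketched below. The names `Conjure.SystemAdapter` and `Conjure.SystemCmd` and the `:system_impl` configuration key are assumptions made for illustration; the behaviour is repeated only so the sketch runs on its own.

```elixir
defmodule Conjure.SystemBehaviour do
  @callback cmd(String.t(), [String.t()], keyword()) :: {String.t(), non_neg_integer()}
end

# Hypothetical production implementation: delegates straight to System.cmd/3.
defmodule Conjure.SystemAdapter do
  @behaviour Conjure.SystemBehaviour

  @impl true
  def cmd(command, args, opts), do: System.cmd(command, args, opts)
end

# Hypothetical dispatcher used by production code instead of System.cmd/3.
defmodule Conjure.SystemCmd do
  # Looked up on every call so tests can swap the implementation.
  def impl, do: Application.get_env(:conjure, :system_impl, Conjure.SystemAdapter)

  def cmd(command, args, opts \\ []), do: impl().cmd(command, args, opts)
end
```

Under these assumptions, the test environment would point the key at the mock, for example with `Application.put_env(:conjure, :system_impl, SystemMock)` in `test/test_helper.exs`, while production falls back to the real adapter.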

Layer 2: Integration Tests (Tagged, Optional)

Tests that require external dependencies are tagged and skipped by default:

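defmodule Conjure.Executor.DockerIntegrationTest do
  use ExUnit.Case

  # ExecutionContext is assumed to live at Conjure.ExecutionContext
  alias Conjure.ExecutionContext
  alias Conjure.Executor.Docker

  @moduletag :integration
  @moduletag :docker

  setup do
    # A setup callback cannot skip a test at runtime, so fail fast with a clear
    # message when Docker is missing. find_executable/1 guards System.cmd/3,
    # which raises if the binary is not on the PATH.
    with path when is_binary(path) <- System.find_executable("docker"),
         {_, 0} <- System.cmd("docker", ["info"], stderr_to_stdout: true) do
      :ok
    else
      _ -> flunk("Docker not available")
    end
  end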

  describe "full Docker lifecycle" do
    @tag timeout: 60_000
    test "init/bash/cleanup cycle" do
      context = ExecutionContext.new(working_directory: "/tmp/test")

      {:ok, ctx} = Docker.init(context)
      assert ctx.container_id != nil

      {:ok, output} = Docker.bash("echo 'hello from container'", ctx)
      assert output =~ "hello from container"

      :ok = Docker.cleanup(ctx)
    end
  end
end

Running integration tests:

# Unit tests only (default)
mix test

# Include Docker integration tests
mix test --include docker

# All integration tests
mix test --include integration

# CI with Docker available
mix test --include integration
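
The commands above assume that tagged tests are excluded unless explicitly requested. ExUnit does not exclude custom tags on its own, so this is normally configured in `test/test_helper.exs`. A minimal sketch, assuming only the two tags used in this ADR:

```elixir
# test/test_helper.exs
# Exclude tests that need external systems unless they are requested with
# `mix test --include docker` or `mix test --include integration`.
ExUnit.start(exclude: [:integration, :docker])
```

An `--include` flag overrides the exclusion for every test carrying that tag, so `mix test --include docker` runs the Docker tests even though `:integration` remains excluded.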

Layer 3: Telemetry Tests

Verify telemetry events are emitted correctly:

defmodule Conjure.TelemetryTest do
  use ExUnit.Case

  setup do
    # Attach test handler
    :telemetry.attach_many(
      "test-handler",
      [
        [:conjure, :execute, :start],
        [:conjure, :execute, :stop],
        [:conjure, :execute, :exception]
      ],
      &__MODULE__.handle_event/4,
      %{test_pid: self()}
    )

    on_exit(fn -> :telemetry.detach("test-handler") end)
  end

  def handle_event(event, measurements, metadata, %{test_pid: pid}) do
    send(pid, {:telemetry, event, measurements, metadata})
  end

  test "execute emits start and stop events" do
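    # Trigger execution. tool_call and skills are assumed to come from test
    # fixtures (a "view" tool call and the loaded skills, not shown here);
    # MockExecutor is a test double for the executor.
    Conjure.execute(tool_call, skills, executor: MockExecutor)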

    assert_receive {:telemetry, [:conjure, :execute, :start], _, %{tool: "view"}}
    assert_receive {:telemetry, [:conjure, :execute, :stop], %{duration: _}, _}
  end
end

Layer 4: Contract Tests

Verify all executors implement the behaviour correctly:

defmodule Conjure.ExecutorContractTest do
  use ExUnit.Case

  # Test each executor against the contract
  for executor <- [Conjure.Executor.Local, Conjure.Executor.Docker] do
    @executor executor

    describe "#{@executor} contract" do
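      test "declares the Conjure.Executor behaviour" do
        # A module can carry several :behaviour attribute entries, so collect
        # them all instead of reading only the first one
        behaviours =
          @executor.__info__(:attributes)
          |> Keyword.get_values(:behaviour)
          |> List.flatten()

        assert Conjure.Executor in behaviours
      end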

      test "bash/2 returns expected format" do
        # build_context_for/1 is assumed to be a test helper (not shown) that
        # returns a mock context appropriate for the executor
        context = build_context_for(@executor)
        result = @executor.bash("echo test", context)

        assert match?({:ok, _}, result) or match?({:error, _}, result)
      end
    end
  end
end
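Declaring a behaviour does not by itself prove that every callback is exported. A contract test can also compare the behaviour's callback list with the functions a module actually exports. The sketch below shows one way to do that with standard reflection; `ContractCheck`, `DemoExecutor`, `CompleteExecutor`, and `PartialExecutor` are hypothetical stand-ins, not modules from Conjure.

```elixir
defmodule ContractCheck do
  # Returns the required callbacks of `behaviour` that `module` does not export.
  def missing_callbacks(module, behaviour) do
    Code.ensure_loaded(module)

    required =
      behaviour.behaviour_info(:callbacks) -- behaviour.behaviour_info(:optional_callbacks)

    Enum.reject(required, fn {name, arity} -> function_exported?(module, name, arity) end)
  end
end

# Hypothetical stand-in for the Conjure.Executor behaviour.
defmodule DemoExecutor do
  @callback bash(String.t(), map()) :: {:ok, String.t()} | {:error, term()}
  @callback cleanup(map()) :: :ok
end

defmodule CompleteExecutor do
  @behaviour DemoExecutor

  @impl true
  def bash(command, _context), do: {:ok, command}

  @impl true
  def cleanup(_context), do: :ok
end

# Exports bash/2 only, so cleanup/1 should be reported as missing.
defmodule PartialExecutor do
  def bash(command, _context), do: {:ok, command}
end

[] = ContractCheck.missing_callbacks(CompleteExecutor, DemoExecutor)
[cleanup: 1] = ContractCheck.missing_callbacks(PartialExecutor, DemoExecutor)
```

Inside the contract test, the equivalent assertion would be that the list of missing callbacks is empty for each executor.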

Layer 5: Property-Based Tests

For security-critical functions:

defmodule Conjure.SecurityPropertyTest do
  use ExUnit.Case
  use ExUnitProperties

  # Security is assumed to live at Conjure.Security
  alias Conjure.Security

  describe "escape_shell/1" do
    property "never produces unbalanced quotes" do
      check all input <- string(:printable) do
        escaped = Security.escape_shell(input)
        # Quotes in the output must be balanced; balanced_quotes?/1 is assumed
        # to be a test helper (not shown)
        assert balanced_quotes?(escaped)
      end
    end

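    property "escaped output is safe for shell" do
      check all input <- string(:printable) do
        escaped = Security.escape_shell(input)
        # Round-trip through the shell. printf '%s' is used instead of echo so
        # that inputs such as "-n" are not parsed as options, and the output is
        # compared untrimmed so leading and trailing whitespace is checked too.
        {output, 0} = System.cmd("bash", ["-c", "printf '%s' #{escaped}"])
        assert output == input
      end
    end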
  end
end
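`Security.escape_shell/1` itself is not shown in this ADR. To make the round-trip property concrete, the sketch below uses one common escaping technique: wrap the value in single quotes and replace each embedded single quote with `'\''`. The module `ShellEscapeSketch` is hypothetical and is not claimed to match Conjure's implementation; it also assumes a POSIX `sh` is on the PATH.

```elixir
defmodule ShellEscapeSketch do
  # POSIX single-quote escaping: everything inside '...' is literal, so only
  # the single quote itself needs handling (close, escaped quote, reopen).
  def escape_shell(input) when is_binary(input) do
    "'" <> String.replace(input, "'", "'\\''") <> "'"
  end

  # Passes the escaped value through a real shell and returns what came out.
  # printf '%s' avoids echo's option parsing and its trailing newline.
  def round_trip(input) do
    {output, 0} = System.cmd("sh", ["-c", "printf '%s' " <> escape_shell(input)])
    output
  end
end

# Each input must come back byte-for-byte; the pin (^) makes a mismatch raise.
for input <- ["it's", "$(whoami)", "a; echo b", "-n", "  padded  ", ""] do
  ^input = ShellEscapeSketch.round_trip(input)
end
```

A property test generalizes the hand-picked list above by generating the inputs, which is where cases such as embedded quotes or leading dashes tend to surface.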

CI Configuration

# .github/workflows/test.yml
name: Tests

on: [push, pull_request]

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: erlef/setup-beam@v1
        with:
          elixir-version: '1.16'
          otp-version: '26'
      - run: mix deps.get
      - run: mix test

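  integration-tests:
    runs-on: ubuntu-latest
    # GitHub-hosted Ubuntu runners ship with Docker preinstalled, so no
    # docker:dind service is needed for the Docker executor tests.
    steps:
      - uses: actions/checkout@v4
      - uses: erlef/setup-beam@v1
        with:
          elixir-version: '1.16'
          otp-version: '26'
      - run: mix deps.get
      - run: mix test --include integration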

Test Organization

test/
├── conjure_test.exs                     # Main API tests
├── conjure/
│   ├── loader_test.exs                  # Unit tests
│   ├── registry_test.exs
│   ├── security_test.exs
│   ├── executor/
│   │   ├── local_test.exs               # Unit tests
│   │   ├── docker_test.exs              # Unit tests (mocked)
│   │   └── docker_integration_test.exs  # @tag :docker
│   └── telemetry_test.exs
├── integration/
│   ├── conversation_test.exs            # @tag :integration
│   └── end_to_end_test.exs              # @tag :integration
├── property/
│   └── security_property_test.exs
├── support/
│   ├── mocks.ex
│   ├── fixtures.ex
│   └── test_helpers.ex
└── fixtures/
    └── skills/

Consequences

Positive

  • Fast default tests - Unit tests run without Docker
  • Comprehensive coverage - Integration tests catch real issues
  • CI flexibility - Can run different test suites
  • Documented testing patterns - Clear examples for contributors
  • Property tests - Catch edge cases in security code

Negative

  • More complex test setup
  • Mox dependency added
  • Integration tests may be flaky
  • StreamData dependency for property tests

Neutral

  • Tests mirror production architecture
  • Contributors must understand test layers
  • CI time increases with integration tests

Alternatives Considered

Always Require Docker

Make Docker a hard requirement for tests. Rejected because:

  • Increases contributor friction
  • Slows down test cycle
  • Not always available (some CI environments)

No Integration Tests

Only unit test with mocks. Rejected because:

  • Misses real integration issues
  • Docker executor bugs would slip through
  • Reduces confidence in production behavior

Testcontainers

Use testcontainers-elixir for Docker management. Deferred because:

  • Adds significant dependency
  • May be overkill for current needs
  • Can add later if needed
