Implementation Plan - TDD Approach
View SourceOverview
This document outlines a test-driven development (TDD) approach to implementing the Elixir Codex SDK. The plan is organized into four one-week sprints, with each sprint building on the previous one. All tests are written before implementation, using Supertester for deterministic OTP testing.
Guiding Principles
- Test First: Write tests before implementation
- Red-Green-Refactor: Fail → Pass → Clean
- Small Steps: Incremental progress with frequent validation
- Async Tests: All tests run with
async: true
- No Process.sleep: Use Supertester for proper synchronization
- Integration Last: Unit tests → integration tests → live tests
Sprint 0: Project Setup (Pre-development)
Goals
- [x] Initialize Mix project
- [x] Configure dependencies
- [x] Set up CI/CD pipeline
- [x] Create documentation structure
- [x] Define project standards
Tasks Completed
- [x] Run
mix new codex_sdk
- [x] Add dependencies to
mix.exs
- [x] jason (JSON parsing)
- [x] typed_struct (type definitions)
- [x] telemetry (observability)
- [x] supertester (testing)
- [x] mox (mocking)
- [x] ex_doc (documentation)
- [x] credo (linting)
- [x] dialyxir (type checking)
- [x] excoveralls (coverage)
- [x] Configure ExDoc with documentation structure
- [x] Set up GitHub Actions workflow
- [x] Create initial README
- [x] Write project documentation (01.md, 02-architecture.md, etc.)
Deliverables
- [x] Working Mix project
- [x] Passing
mix deps.get
- [x] Passing
mix compile
- [x] Green CI build
- [x] Complete documentation structure
Sprint 1: Type Definitions and Module Stubs
Duration: Week 1 Focus: Define all types, create module structure, establish test infrastructure
Phase 1.1: Event Types (Day 1)
Tests to Write
test/codex/events_test.exs
:
defmodule Codex.EventsTest do
use ExUnit.Case, async: true
describe "ThreadStarted" do
test "creates struct with required fields" do
event = %Codex.Events.ThreadStarted{
type: :thread_started,
thread_id: "thread_abc123"
}
assert event.type == :thread_started
assert event.thread_id == "thread_abc123"
end
test "enforces required fields" do
assert_raise ArgumentError, fn ->
%Codex.Events.ThreadStarted{}
end
end
test "encodes to JSON correctly" do
event = %Codex.Events.ThreadStarted{
type: :thread_started,
thread_id: "thread_abc123"
}
json = Jason.encode!(event)
decoded = Jason.decode!(json)
assert decoded["type"] == "thread_started"
assert decoded["thread_id"] == "thread_abc123"
end
end
# Similar tests for:
# - TurnStarted
# - TurnCompleted (with Usage)
# - TurnFailed
# - ItemStarted, ItemUpdated, ItemCompleted
# - ThreadError
end
Implementation
lib/codex/events.ex
:
defmodule Codex.Events do
@moduledoc "Event types emitted during turn execution"
end
defmodule Codex.Events.ThreadStarted do
use TypedStruct
typedstruct do
field :type, :thread_started, enforce: true
field :thread_id, String.t(), enforce: true
end
@derive Jason.Encoder
end
# Implement remaining event types...
Acceptance Criteria
- [ ] All event types defined with TypedStruct
- [ ] All fields documented
- [ ] JSON encoding/decoding working
- [ ] Tests passing (async: true)
- [ ] Dialyzer clean
- [ ] ExDoc generated
Phase 1.2: Item Types (Day 2)
Tests to Write
test/codex/items_test.exs
:
defmodule Codex.ItemsTest do
use ExUnit.Case, async: true
describe "AgentMessage" do
test "creates message item" do
item = %Codex.Items.AgentMessage{
id: "msg_1",
type: :agent_message,
text: "Hello, world!"
}
assert item.id == "msg_1"
assert item.type == :agent_message
assert item.text == "Hello, world!"
end
test "encodes to JSON" do
item = %Codex.Items.AgentMessage{
id: "msg_1",
type: :agent_message,
text: "Hello"
}
json = Jason.encode!(item)
decoded = Jason.decode!(json)
assert decoded["type"] == "agent_message"
end
end
describe "CommandExecution" do
test "creates command with status" do
item = %Codex.Items.CommandExecution{
id: "cmd_1",
type: :command_execution,
command: "ls -la",
aggregated_output: "",
status: :in_progress
}
assert item.command == "ls -la"
assert item.status == :in_progress
end
test "accepts completed status with exit code" do
item = %Codex.Items.CommandExecution{
id: "cmd_1",
type: :command_execution,
command: "ls -la",
aggregated_output: "total 0",
exit_code: 0,
status: :completed
}
assert item.exit_code == 0
assert item.status == :completed
end
end
# Similar tests for all item types
end
Implementation
lib/codex/items.ex
:
defmodule Codex.Items do
@moduledoc "Thread item types and their variants"
end
defmodule Codex.Items.AgentMessage do
use TypedStruct
typedstruct do
field :id, String.t(), enforce: true
field :type, :agent_message, default: :agent_message
field :text, String.t(), enforce: true
end
@derive Jason.Encoder
end
# Implement all item types...
Acceptance Criteria
- [ ] All item types defined
- [ ] Status enums documented
- [ ] JSON encoding working
- [ ] Tests passing
- [ ] Dialyzer clean
Phase 1.3: Option Structs (Day 3)
Tests to Write
test/codex/options_test.exs
:
defmodule Codex.OptionsTest do
use ExUnit.Case, async: true
describe "Codex.Options" do
test "creates with default values" do
opts = %Codex.Options{}
assert opts.codex_path_override == nil
assert opts.base_url == nil
assert opts.api_key == nil
end
test "creates with custom values" do
opts = %Codex.Options{
codex_path_override: "/usr/bin/codex",
api_key: "sk-test"
}
assert opts.codex_path_override == "/usr/bin/codex"
assert opts.api_key == "sk-test"
end
end
describe "Codex.Thread.Options" do
test "creates with defaults" do
opts = %Codex.Thread.Options{}
assert opts.sandbox_mode == nil
assert opts.skip_git_repo_check == false
end
test "validates sandbox mode" do
opts = %Codex.Thread.Options{
sandbox_mode: :read_only
}
assert opts.sandbox_mode == :read_only
end
end
# Tests for Turn.Options
end
Implementation
lib/codex/options.ex
:
defmodule Codex.Options do
use TypedStruct
typedstruct do
field :codex_path_override, String.t()
field :base_url, String.t()
field :api_key, String.t()
end
end
# Implement Thread.Options and Turn.Options...
Acceptance Criteria
- [ ] All option structs defined
- [ ] Defaults documented
- [ ] Validation helpers implemented
- [ ] Tests passing
Phase 1.4: Module Stubs (Day 4)
Tests to Write
test/codex_test.exs
:
defmodule CodexTest do
use ExUnit.Case, async: true
describe "start_thread/2" do
test "returns thread struct" do
assert {:ok, thread} = Codex.start_thread()
assert %Codex.Thread{} = thread
assert thread.thread_id == nil
end
test "accepts options" do
codex_opts = %Codex.Options{api_key: "sk-test"}
thread_opts = %Codex.Thread.Options{model: "o1"}
assert {:ok, thread} = Codex.start_thread(codex_opts, thread_opts)
assert thread.codex_opts.api_key == "sk-test"
assert thread.thread_opts.model == "o1"
end
end
describe "resume_thread/3" do
test "returns thread with ID" do
assert {:ok, thread} = Codex.resume_thread("thread_123")
assert thread.thread_id == "thread_123"
end
end
end
test/codex/thread_test.exs
:
defmodule Codex.ThreadTest do
use ExUnit.Case, async: true
describe "run/3" do
test "defined but not implemented" do
thread = %Codex.Thread{
thread_id: nil,
codex_opts: %Codex.Options{},
thread_opts: %Codex.Thread.Options{}
}
# Should compile but not work yet
assert function_exported?(Codex.Thread, :run, 3)
end
end
end
Implementation
lib/codex.ex
:
defmodule Codex do
@moduledoc "Main entry point for Codex SDK"
alias Codex.Thread
@spec start_thread(Codex.Options.t(), Codex.Thread.Options.t()) ::
{:ok, Thread.t()}
def start_thread(codex_opts \\ %Codex.Options{}, thread_opts \\ %Codex.Thread.Options{}) do
thread = %Thread{
thread_id: nil,
codex_opts: codex_opts,
thread_opts: thread_opts
}
{:ok, thread}
end
@spec resume_thread(String.t(), Codex.Options.t(), Codex.Thread.Options.t()) ::
{:ok, Thread.t()}
def resume_thread(thread_id, codex_opts \\ %Codex.Options{}, thread_opts \\ %Codex.Thread.Options{}) do
thread = %Thread{
thread_id: thread_id,
codex_opts: codex_opts,
thread_opts: thread_opts
}
{:ok, thread}
end
end
lib/codex/thread.ex
:
defmodule Codex.Thread do
@moduledoc "Manages conversation threads"
defstruct [:thread_id, :codex_opts, :thread_opts]
@type t :: %__MODULE__{
thread_id: String.t() | nil,
codex_opts: Codex.Options.t(),
thread_opts: Codex.Thread.Options.t()
}
@spec run(t(), String.t(), Codex.Turn.Options.t()) ::
{:ok, Codex.Turn.Result.t()} | {:error, term()}
def run(_thread, _input, _opts \\ %Codex.Turn.Options{}) do
raise "Not implemented"
end
@spec run_streamed(t(), String.t(), Codex.Turn.Options.t()) ::
{:ok, Enumerable.t()} | {:error, term()}
def run_streamed(_thread, _input, _opts \\ %Codex.Turn.Options{}) do
raise "Not implemented"
end
end
Acceptance Criteria
- [ ] All modules compile
- [ ] Stubs defined with typespecs
- [ ] Documentation stubs present
- [ ] Basic tests passing
- [ ] Dialyzer clean
Phase 1.5: OutputSchemaFile Utility (Day 5)
Tests to Write
test/codex/output_schema_file_test.exs
:
defmodule Codex.OutputSchemaFileTest do
use ExUnit.Case, async: true
describe "create/1" do
test "returns nil path for nil schema" do
assert {:ok, {nil, cleanup}} = Codex.OutputSchemaFile.create(nil)
assert is_function(cleanup, 0)
cleanup.()
end
test "creates temp file with schema" do
schema = %{"type" => "object", "properties" => %{}}
assert {:ok, {path, cleanup}} = Codex.OutputSchemaFile.create(schema)
assert is_binary(path)
assert File.exists?(path)
# Verify content
{:ok, content} = File.read(path)
assert Jason.decode!(content) == schema
# Cleanup removes file
cleanup.()
refute File.exists?(path)
end
test "cleanup is idempotent" do
schema = %{"type" => "object"}
assert {:ok, {_path, cleanup}} = Codex.OutputSchemaFile.create(schema)
cleanup.()
cleanup.() # Should not crash
end
test "returns error for invalid schema" do
assert {:error, _} = Codex.OutputSchemaFile.create("not a map")
end
end
end
Implementation
lib/codex/output_schema_file.ex
:
defmodule Codex.OutputSchemaFile do
@moduledoc "Helper for managing JSON schema temporary files"
@spec create(map() | nil) :: {:ok, {String.t() | nil, function()}} | {:error, term()}
def create(nil) do
cleanup = fn -> :ok end
{:ok, {nil, cleanup}}
end
def create(schema) when is_map(schema) do
# Implementation...
end
def create(_), do: {:error, :invalid_schema}
end
Acceptance Criteria
- [ ] Handles nil schema
- [ ] Creates temp file
- [ ] Writes JSON correctly
- [ ] Cleanup removes file
- [ ] Tests passing
Sprint 2: Exec GenServer Implementation
Duration: Week 2 Focus: Process management, Port communication, event parsing
Phase 2.1: GenServer Skeleton (Day 6)
Tests to Write
test/codex/exec_test.exs
:
defmodule Codex.ExecTest do
use ExUnit.Case, async: true
use Supertester
describe "start_link/1" do
test "starts GenServer successfully" do
opts = [
input: "Hello",
codex_path: "/bin/true" # Use /bin/true for testing
]
assert {:ok, pid} = Codex.Exec.start_link(opts)
assert Process.alive?(pid)
GenServer.stop(pid)
end
test "returns error for missing codex binary" do
opts = [
input: "Hello",
codex_path: "/nonexistent/codex"
]
assert {:error, _} = Codex.Exec.start_link(opts)
end
end
describe "init/1" do
test "spawns port with correct arguments" do
# Test with mock that logs spawn args
# Verify Port.open called with correct args
end
end
end
Implementation
lib/codex/exec.ex
:
defmodule Codex.Exec do
use GenServer
require Logger
defstruct [
:port,
:caller,
:ref,
buffer: "",
exit_status: nil,
stderr_buffer: ""
]
def start_link(opts) do
GenServer.start_link(__MODULE__, opts)
end
@impl true
def init(opts) do
# Validate codex path exists
# Build command args
# Spawn port
# Send telemetry
{:ok, %__MODULE__{}}
end
@impl true
def terminate(reason, state) do
# Close port
# Send telemetry
:ok
end
end
Acceptance Criteria
- [ ] GenServer starts successfully
- [ ] Port spawned
- [ ] Validation works
- [ ] Tests passing with Supertester
Phase 2.2: Port Communication (Day 7-8)
Tests to Write
defmodule Codex.Exec.PortTest do
use ExUnit.Case, async: true
use Supertester
describe "stdin writing" do
test "writes input to port" do
# Use echo command to test stdin
opts = [
input: "test input",
codex_path: "/bin/cat"
]
{:ok, pid} = Codex.Exec.start_link(opts)
# Assert output received
assert_receive {:stdout, "test input"}
GenServer.stop(pid)
end
end
describe "stdout reading" do
test "receives data from port" do
# Use script that outputs known data
end
test "handles incomplete lines" do
# Test buffer accumulation
end
end
describe "stderr handling" do
test "accumulates stderr" do
# Test stderr capture
end
end
describe "exit status" do
test "receives exit status 0" do
# Test successful exit
end
test "receives non-zero exit status" do
# Test failure exit
end
end
end
Implementation
@impl true
def handle_info({port, {:data, data}}, %{port: port} = state) do
# Append to buffer
# Split on newlines
# Process complete lines
# Keep incomplete line in buffer
{:noreply, state}
end
@impl true
def handle_info({port, {:exit_status, status}}, %{port: port} = state) do
# Handle exit
# Send final events
{:stop, :normal, state}
end
Acceptance Criteria
- [ ] Stdin writing works
- [ ] Stdout reading works
- [ ] Line buffering correct
- [ ] Exit status captured
- [ ] Tests passing
Phase 2.3: Event Parsing (Day 9)
Tests to Write
defmodule Codex.Exec.ParserTest do
use ExUnit.Case, async: true
describe "parse_event/1" do
test "parses ThreadStarted event" do
json = ~s({"type":"thread.started","thread_id":"thread_123"})
assert {:ok, event} = Codex.Exec.Parser.parse_event(json)
assert %Codex.Events.ThreadStarted{} = event
assert event.thread_id == "thread_123"
end
test "parses ItemCompleted with AgentMessage" do
json = ~s({"type":"item.completed","item":{"id":"msg_1","type":"agent_message","text":"Hello"}})
assert {:ok, event} = Codex.Exec.Parser.parse_event(json)
assert %Codex.Events.ItemCompleted{} = event
assert %Codex.Items.AgentMessage{} = event.item
end
test "returns error for invalid JSON" do
assert {:error, _} = Codex.Exec.Parser.parse_event("invalid")
end
test "returns error for unknown event type" do
json = ~s({"type":"unknown.event"})
assert {:error, _} = Codex.Exec.Parser.parse_event(json)
end
end
end
Implementation
lib/codex/exec/parser.ex
:
defmodule Codex.Exec.Parser do
@moduledoc "Parses JSONL events from codex-rs"
def parse_event(line) do
with {:ok, json} <- Jason.decode(line),
{:ok, event} <- build_event(json) do
{:ok, event}
end
end
defp build_event(%{"type" => "thread.started"} = json) do
# Build ThreadStarted struct
end
# More event builders...
end
Acceptance Criteria
- [ ] All event types parsed
- [ ] All item types parsed
- [ ] Error handling works
- [ ] Tests passing
Phase 2.4: Integration (Day 10)
Tests to Write
defmodule Codex.Exec.IntegrationTest do
use ExUnit.Case, async: true
use Supertester
@moduletag :integration
test "full turn execution with mock script" do
# Create mock script that emits test events
script_path = create_mock_script()
opts = [
input: "test",
codex_path: script_path
]
{:ok, pid} = Codex.Exec.start_link(opts)
ref = Codex.Exec.run_turn(pid)
# Assert events received in order
assert_receive {:event, ^ref, %Codex.Events.ThreadStarted{}}
assert_receive {:event, ^ref, %Codex.Events.TurnStarted{}}
assert_receive {:event, ^ref, %Codex.Events.ItemStarted{}}
assert_receive {:event, ^ref, %Codex.Events.ItemCompleted{}}
assert_receive {:event, ^ref, %Codex.Events.TurnCompleted{}}
assert_receive {:done, ^ref}
GenServer.stop(pid)
end
defp create_mock_script do
# Write shell script that outputs test events
end
end
Acceptance Criteria
- [ ] Full turn execution works
- [ ] Events in correct order
- [ ] Cleanup works
- [ ] Tests passing
Sprint 3: Thread Management
Duration: Week 3 Focus: Turn execution, streaming, option handling
Phase 3.1: Blocking Turn Execution (Day 11-12)
Tests to Write
defmodule Codex.Thread.RunTest do
use ExUnit.Case, async: true
use Supertester
describe "run/3" do
test "executes turn and returns result" do
thread = %Codex.Thread{
thread_id: nil,
codex_opts: %Codex.Options{},
thread_opts: %Codex.Thread.Options{}
}
# Mock Exec to return test events
with_mock_exec(fn ->
assert {:ok, result} = Codex.Thread.run(thread, "Hello")
assert %Codex.Turn.Result{} = result
assert result.final_response == "Hello, world!"
assert length(result.items) > 0
assert result.usage.input_tokens > 0
end)
end
test "populates thread_id after first turn" do
thread = %Codex.Thread{thread_id: nil, ...}
with_mock_exec(fn ->
{:ok, updated_thread, _result} = Codex.Thread.run(thread, "Hello")
assert updated_thread.thread_id == "thread_123"
end)
end
test "handles turn failure" do
thread = %Codex.Thread{...}
with_mock_exec(fn ->
assert {:error, {:turn_failed, error}} = Codex.Thread.run(thread, "Bad input")
assert error.message =~ "error"
end)
end
end
end
Implementation
def run(%Thread{} = thread, input, opts \\ %Codex.Turn.Options{}) do
{schema_path, cleanup} = OutputSchemaFile.create(opts.output_schema)
try do
{:ok, pid} = Exec.start_link(build_exec_opts(thread, input, schema_path))
ref = Exec.run_turn(pid)
result = collect_events(pid, ref)
{:ok, result}
after
cleanup.()
end
end
defp collect_events(pid, ref) do
# Accumulate events until done
# Build TurnResult
end
Acceptance Criteria
- [ ] Turn execution works
- [ ] Result accumulated correctly
- [ ] Thread ID updated
- [ ] Errors handled
- [ ] Tests passing
Phase 3.2: Streaming Turn Execution (Day 13-14)
Tests to Write
defmodule Codex.Thread.StreamedTest do
use ExUnit.Case, async: true
use Supertester
describe "run_streamed/3" do
test "returns stream of events" do
thread = %Codex.Thread{...}
with_mock_exec(fn ->
assert {:ok, stream} = Codex.Thread.run_streamed(thread, "Hello")
events = Enum.to_list(stream)
assert length(events) > 0
assert %Codex.Events.ThreadStarted{} = hd(events)
assert %Codex.Events.TurnCompleted{} = List.last(events)
end)
end
test "stream is lazy" do
# Verify events not processed until consumed
end
test "stream cleanup on halt" do
# Verify resources cleaned up if stream halted early
end
end
end
Implementation
def run_streamed(%Thread{} = thread, input, opts \\ %Codex.Turn.Options{}) do
{schema_path, cleanup} = OutputSchemaFile.create(opts.output_schema)
stream = Stream.resource(
fn -> start_turn(thread, input, schema_path, cleanup) end,
fn state -> fetch_next_event(state) end,
fn state -> cleanup_turn(state) end
)
{:ok, stream}
end
Acceptance Criteria
- [ ] Stream returns events
- [ ] Lazy evaluation
- [ ] Cleanup works
- [ ] Tests passing
Phase 3.3: Option Handling (Day 15)
Tests to Write
describe "option passing" do
test "passes codex options to exec" do
# Test API key, base URL passed
end
test "passes thread options to exec" do
# Test model, sandbox, working directory
end
test "passes turn options to exec" do
# Test output schema
end
end
Acceptance Criteria
- [ ] All options passed correctly
- [ ] Environment variables set
- [ ] Command args correct
- [ ] Tests passing
Sprint 4: Integration and Polish
Duration: Week 4 Focus: End-to-end tests, examples, documentation, CI/CD
Phase 4.1: Integration Tests (Day 16-17)
Tests to Write
defmodule Codex.IntegrationTest do
use ExUnit.Case
use Supertester
@moduletag :integration
describe "complete workflow" do
test "start thread, run turn, get result" do
{:ok, thread} = Codex.start_thread()
{:ok, result} = Codex.Thread.run(thread, "Say hello")
assert result.final_response =~ "hello"
end
test "multiple turns in same thread" do
{:ok, thread} = Codex.start_thread()
{:ok, result1} = Codex.Thread.run(thread, "Say hello")
{:ok, result2} = Codex.Thread.run(thread, "Say goodbye")
assert result2.items |> length() > 0
end
test "resume thread" do
{:ok, thread1} = Codex.start_thread()
{:ok, _result} = Codex.Thread.run(thread1, "Remember: banana")
# Resume with same ID
{:ok, thread2} = Codex.resume_thread(thread1.thread_id)
{:ok, result} = Codex.Thread.run(thread2, "What should I remember?")
assert result.final_response =~ "banana"
end
test "structured output" do
schema = %{
"type" => "object",
"properties" => %{
"message" => %{"type" => "string"}
}
}
{:ok, thread} = Codex.start_thread()
{:ok, result} = Codex.Thread.run(
thread,
"Say hello in JSON",
%Codex.Turn.Options{output_schema: schema}
)
assert {:ok, json} = Jason.decode(result.final_response)
assert json["message"]
end
end
end
Acceptance Criteria
- [ ] All integration tests passing
- [ ] Real codex-rs tested (when available)
- [ ] Mock tests passing
- [ ] Coverage > 95%
Phase 4.2: Examples (Day 18)
Create example scripts:
examples/basic.exs
: Simple conversationexamples/streaming.exs
: Real-time event processingexamples/structured_output.exs
: JSON schema usageexamples/multi_turn.exs
: Extended conversationexamples/file_operations.exs
: File change tracking
Each example should:
- [ ] Be runnable with
mix run examples/X.exs
- [ ] Include comments explaining each step
- [ ] Demonstrate best practices
- [ ] Handle errors gracefully
Phase 4.3: Documentation Polish (Day 19)
Tasks:
- [ ] Complete all @doc and @moduledoc
- [ ] Add @spec for all public functions
- [ ] Generate ExDoc HTML
- [ ] Review and update all guides
- [ ] Add diagrams where helpful
- [ ] Create CHANGELOG.md
- [ ] Update README with current status
Phase 4.4: CI/CD and Release (Day 20)
Tasks:
- [ ] Ensure CI passing
- [ ] Run Dialyzer (zero warnings)
- [ ] Run Credo (zero issues)
- [ ] Check test coverage (>95%)
- [ ] Review security (no hardcoded secrets)
- [ ] Tag v0.1.0
- [ ] Publish to Hex.pm
- [ ] Generate and publish docs to HexDocs
Testing Strategy Summary
Test Categories
Unit Tests: Test individual functions in isolation
- All async: true
- Use Supertester for OTP testing
- Mock external dependencies
- Fast (< 1ms per test)
Integration Tests: Test component interactions
- Tagged
:integration
- Use mock codex-rs script
- Test full workflows
- Medium speed (< 100ms per test)
- Tagged
Live Tests: Test with real codex-rs
- Tagged
:live
- Require API key via env var
- Optional (skip in CI)
- Slow (seconds per test)
- Tagged
Coverage Goals
- Overall: 95%+
- Core modules: 100% (Codex, Thread, Exec)
- Types: 100% (Events, Items, Options)
- Utilities: 90%+
Test Infrastructure
Supertester Setup:
# test/support/supertester_helpers.ex
defmodule CodexSdk.SupertesterHelpers do
use Supertester
def with_mock_exec(fun) do
# Helper to mock Exec GenServer
end
def mock_events(events) do
# Helper to emit mock events
end
end
Mox Setup:
# test/support/mocks.ex
Mox.defmock(CodexSdk.ExecMock, for: CodexSdk.ExecBehaviour)
Milestones
M1: Types Complete (End of Sprint 1)
- [x] All TypedStructs defined
- [x] JSON encoding working
- [x] Module stubs created
- [x] Tests passing
M2: Exec Working (End of Sprint 2)
- [ ] Exec GenServer functional
- [ ] Port communication working
- [ ] Event parsing complete
- [ ] Integration tests passing
M3: API Complete (End of Sprint 3)
- [ ] Blocking turns working
- [ ] Streaming turns working
- [ ] All options supported
- [ ] Full test coverage
M4: Production Ready (End of Sprint 4)
- [ ] All tests passing
- [ ] Documentation complete
- [ ] Examples working
- [ ] Published to Hex.pm
Risk Management
Known Risks
codex-rs changes: TypeScript SDK may change
- Mitigation: Pin to specific version, test with multiple versions
Port communication complexity: Edge cases in JSONL parsing
- Mitigation: Comprehensive parser tests, fuzzing
Process cleanup: Resource leaks on crashes
- Mitigation: Chaos tests, supervision review
Test determinism: Flaky tests with OTP
- Mitigation: Supertester, no Process.sleep, proper sync
Contingency Plans
- Behind schedule: Cut streaming mode to v1.1
- Critical bugs: Focus on blocking mode first
- API changes: Document breaking changes, version clearly
Definition of Done
A feature is complete when:
- [ ] Tests written first (TDD)
- [ ] Tests passing (async: true)
- [ ] Code documented (@doc, @moduledoc, @spec)
- [ ] Dialyzer clean
- [ ] Credo clean
- [ ] Coverage maintained (>95%)
- [ ] Reviewed by team
- [ ] Examples updated
- [ ] Changelog updated
Success Metrics
Quantitative
- 95%+ test coverage
- 100% Dialyzer clean
- 100% Credo compliant
- < 5 minute full test suite
- < 50ms average test time
- Zero flaky tests
Qualitative
- Clear, readable code
- Comprehensive documentation
- Helpful error messages
- Easy to use API
- Well-commented examples
- Positive community feedback
Next Steps After MVP
v0.2.0 Features
- [ ] Telemetry documentation
- [ ] Supervision tree examples
- [ ] Performance benchmarks
- [ ] Phoenix LiveView integration
v0.3.0 Features
- [ ] Persistent event logging
- [ ] Custom event handlers API
- [ ] Advanced streaming modes
- [ ] WebSocket support (if codex-rs adds it)
v1.0.0 Criteria
- [ ] 6+ months in production
- [ ] Stable API (no breaking changes)
- [ ] Comprehensive docs
- [ ] Active maintenance
- [ ] Community adoption