Normandy Development Roadmap
This document tracks the phased implementation of the Normandy AI agent framework.
Completed Phases ✅
Phase 1-7: Core Foundation
- ✅ Basic agent architecture
- ✅ LLM client integrations
- ✅ Tool/function calling
- ✅ Memory and conversation management
- ✅ Streaming responses
- ✅ Resilience (retry, circuit breaker)
- ✅ Context window management, token counting, summarization
Phase 8: Multi-Agent Coordination ✅
Status: Completed - Commit 7f8d55b
Implemented:
- Agent-to-agent message passing (AgentMessage)
- Sequential orchestration (pipeline pattern)
- Parallel orchestration (concurrent execution)
- Hierarchical coordination (manager-worker)
- Shared context (stateless and GenServer-backed)
- Agent processes with supervision (AgentProcess, AgentSupervisor)
- Fault tolerance with OTP patterns
- 76 new tests, 380 total tests passing
Key Modules:
- Normandy.Coordination.AgentMessage
- Normandy.Coordination.SharedContext
- Normandy.Coordination.StatefulContext (GenServer + ETS)
- Normandy.Coordination.SequentialOrchestrator
- Normandy.Coordination.ParallelOrchestrator
- Normandy.Coordination.HierarchicalCoordinator
- Normandy.Coordination.AgentProcess (GenServer wrapper)
- Normandy.Coordination.AgentSupervisor (DynamicSupervisor)
Phase 8.5: Integration Testing & Claudio Migration ✅
Status: Completed - 2025-10-26
Implemented:
- Migrated Claudio HTTP client from Tesla to Req
- Fixed orchestrator APIs for simplified usage
- Added streaming callback support (arity-2 callbacks)
- Trimmed integration tests for cost efficiency (56 tests)
- Real Anthropic API integration testing
- End-to-end workflow validation
- Multi-agent coordination tests
- Batch processing and performance tests
- Resilience and caching tests
- Comprehensive test helper utilities
Test Files (Trimmed to essential coverage):
- test/integration/agent_tool_execution_flow_test.exs (1 test)
- test/integration/agent_resilience_integration_test.exs (11 tests)
- test/integration/agent_context_management_test.exs (12 tests)
- test/integration/batch_coordination_integration_test.exs (12 tests)
- test/integration/multi_agent_workflows_test.exs (2 tests)
- test/integration/llm_caching_integration_test.exs (11 tests)
- test/integration/end_to_end_scenarios_test.exs (2 tests)
- test/normandy_integration/basic_agent_test.exs (2 tests)
- test/normandy_integration/multi_agent_test.exs (2 tests)
Key Features:
- NormandyTest.Support.IntegrationHelper - API setup and utilities
- NormandyTest.Support.NormandyIntegrationHelper - Normandy-specific helpers
- API key management (supports both API_KEY and ANTHROPIC_API_KEY)
- Tag-based test exclusion (@moduletag :api, @moduletag :integration)
- Real-world scenario testing with reduced API costs
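The API key fallback and tag-based exclusion listed above can be sketched with plain ExUnit. This is a minimal illustration, not the contents of the project's helpers: `ApiKeySketch` and `ExampleIntegrationTest` are hypothetical names, and the lookup order (API_KEY before ANTHROPIC_API_KEY) is an assumption.

```elixir
# test/test_helper.exs (sketch)
# Skip tests tagged :api or :integration unless explicitly included,
# e.g. `mix test --include integration --include api`.
ExUnit.start(exclude: [:api, :integration])

defmodule ApiKeySketch do
  # Hypothetical helper: returns the first environment variable that is set.
  # Checking API_KEY before ANTHROPIC_API_KEY is an assumption.
  def fetch do
    case System.get_env("API_KEY") || System.get_env("ANTHROPIC_API_KEY") do
      nil -> {:error, :missing_api_key}
      key -> {:ok, key}
    end
  end
end

defmodule ExampleIntegrationTest do
  use ExUnit.Case, async: false

  # @moduletag accumulates, so both tags apply to every test in the module.
  @moduletag :api
  @moduletag :integration

  test "an API key is configured" do
    assert {:ok, key} = ApiKeySketch.fetch()
    assert is_binary(key)
  end
end
```

With this setup a plain `mix test` skips the tagged module, which keeps paid API calls out of the default run.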
Orchestrator Improvements:
- ParallelOrchestrator.execute/2 - Simple API: execute(agents, input) returns {:ok, [results]}
- SequentialOrchestrator.execute/2 - Simple API: execute(agents, input) returns {:ok, final_result}
- Advanced API still available with full execution_result maps
- Fixed extract_result to return full response maps instead of just chat_message strings
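To make the two return shapes above concrete, here is a self-contained sketch that mimics them with plain functions standing in for agents. `OrchestratorSketch` is a hypothetical module, not Normandy's implementation, and it assumes the sequential pipeline feeds each result into the next agent.

```elixir
defmodule OrchestratorSketch do
  # Stand-in "agents" are one-argument functions, not Normandy agents.

  # Parallel: every agent receives the same input; results keep agent order.
  def parallel(agents, input, timeout \\ 5_000) do
    results =
      agents
      |> Enum.map(fn agent -> Task.async(fn -> agent.(input) end) end)
      |> Task.await_many(timeout)

    {:ok, results}
  end

  # Sequential: each agent's output becomes the next agent's input
  # (an assumption about what "pipeline pattern" means here).
  def sequential(agents, input) do
    {:ok, Enum.reduce(agents, input, fn agent, acc -> agent.(acc) end)}
  end
end

agents = [fn x -> x + 1 end, fn x -> x * 4 end]

{:ok, [2, 4]} = OrchestratorSketch.parallel(agents, 1)
{:ok, 8} = OrchestratorSketch.sequential(agents, 1)
```

The parallel form returns one result per agent, while the sequential form returns only the last stage's output, matching the `{:ok, [results]}` and `{:ok, final_result}` shapes listed above.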
Phase 8.6: Developer Experience Enhancements ✅
Status: Completed - 2025-10-26
Implemented:
- Reactive patterns for concurrent agent execution
- Agent pooling with fault tolerance and overflow handling
- Comprehensive documentation and examples
- 63 new tests (33 for Reactive, 30 for AgentPool)
- Updated README with multi-agent coordination section
Key Modules:
- Normandy.Coordination.Reactive - Race, all, some patterns for concurrent execution
- Normandy.Coordination.AgentPool - Pool manager with checkout/checkin, overflow, and monitoring
Reactive Patterns:
- race/3 - Return first successful result from multiple agents
- all/3 - Wait for all agents to complete with optional fail-fast
- some/4 - Wait for N successful results (quorum pattern)
- map/3 - Transform agent results
- when_result/3 - Conditional execution based on results
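The race and quorum patterns above can be illustrated with the standard library alone. The sketch below is not the Normandy.Coordination.Reactive API (its arities and options differ); `ReactiveSketch` is a hypothetical module in which agents are zero-arity functions that return `{:ok, value}` or `{:error, reason}` and do not raise.

```elixir
defmodule ReactiveSketch do
  # Quorum: collect results until `count` agents have succeeded, then stop.
  # Halting the stream early shuts down the tasks that are still running.
  def some(agents, count, timeout \\ 5_000) do
    successes =
      agents
      |> Task.async_stream(fn agent -> agent.() end,
        ordered: false,
        max_concurrency: max(length(agents), 1),
        timeout: timeout,
        on_timeout: :kill_task
      )
      |> Stream.flat_map(fn
        {:ok, {:ok, value}} -> [value]
        _failure_or_timeout -> []
      end)
      |> Enum.take(count)

    if length(successes) >= count do
      {:ok, successes}
    else
      {:error, {:quorum_not_met, successes}}
    end
  end

  # Race is the quorum-of-one case: the first success wins.
  def race(agents, timeout \\ 5_000) do
    case some(agents, 1, timeout) do
      {:ok, [value]} -> {:ok, value}
      {:error, _reason} -> {:error, :no_success}
    end
  end
end
```

Because results are consumed unordered, a fast success is returned without waiting for slower agents, and failed agents are simply skipped rather than aborting the whole call.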
Agent Pool Features:
- Transaction-based API with automatic checkout/checkin
- Manual checkout/checkin for advanced use cases
- Configurable pool size with overflow support
- LIFO/FIFO checkout strategies
- Automatic agent replacement on failure
- Pool statistics and monitoring
- Non-blocking checkout with timeout support
Test Coverage:
- 33 comprehensive tests for Reactive patterns
- 30 comprehensive tests for AgentPool
- Total unit tests: 443 (up from 380)
- All tests passing with full coverage
Documentation:
- Added "Multi-Agent Coordination" section to README
- Reactive patterns examples with all options
- Agent pooling examples with configuration
- Agent process lifecycle examples
- Use cases for each pattern
- Updated features list and architecture section
Upcoming Phases 🚀
Phase 9: Observability & Logging
Status: Not Started
Goals:
- Structured logging for agent operations
- Telemetry integration for metrics
- Tracing for agent execution flows
- Debug/replay capabilities for conversations
Key Features:
- Normandy.Observability.Logger - Structured logging
- Normandy.Observability.Telemetry - Metrics and events
- Normandy.Observability.Tracer - Execution flow tracing
- Normandy.Observability.Replay - Conversation replay/debug
Integration Points:
- :telemetry library for events
- :logger metadata for correlation IDs
- Distributed tracing support (OpenTelemetry)
- Agent execution timeline visualization
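Since this phase has not started, the following is only a sketch of how correlation IDs could ride on Logger metadata, using the standard library. `ObservabilitySketch` and `with_correlation/1` are hypothetical names, and the generated ID is unique only within the running VM; a real design would likely also emit `:telemetry` events, which are omitted here to keep the example dependency-free.

```elixir
defmodule ObservabilitySketch do
  require Logger

  # Runs `fun` with a correlation ID attached to this process's Logger
  # metadata, logs the elapsed time, then restores the previous metadata.
  def with_correlation(fun) when is_function(fun, 0) do
    id = Integer.to_string(System.unique_integer([:positive]), 16)
    previous = Logger.metadata()
    Logger.metadata(correlation_id: id)
    started = System.monotonic_time()

    try do
      result = fun.()
      elapsed = System.monotonic_time() - started
      duration_us = System.convert_time_unit(elapsed, :native, :microsecond)
      Logger.info("agent run finished", duration_us: duration_us)
      {id, result}
    after
      # Restore metadata even if `fun` raises.
      Logger.reset_metadata(previous)
    end
  end
end
```

Every log line emitted inside `fun` carries the same `correlation_id`, so one agent run can be filtered out of interleaved output, provided the logger's formatter is configured to print that metadata key.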
Phase 10: Performance Optimization
Status: Not Started
Goals:
- Response caching layer
- Concurrent tool execution
- Memory optimization for long conversations
- Lazy loading strategies
Key Features:
- Normandy.Cache - Response caching with TTL
- Normandy.Tools.ConcurrentExecutor - Parallel tool calls
- Normandy.Memory.Optimizer - Conversation pruning
- Normandy.Loader - Lazy loading for large contexts
Optimizations:
- ETS-based caching for responses
- Parallel tool execution when independent
- Automatic conversation summarization triggers
- On-demand loading of historical context
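As a rough idea of what ETS-based response caching with a TTL might look like, here is a minimal sketch. `CacheSketch` is hypothetical and is not the planned Normandy.Cache API; it stores an expiry timestamp next to each value and evicts lazily on read.

```elixir
defmodule CacheSketch do
  @table :cache_sketch

  # Creates the table. It is owned by the calling process and disappears
  # when that process exits; a real cache would have a supervised owner.
  def start do
    :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])
    :ok
  end

  # Stores `value` under `key` for `ttl_ms` milliseconds.
  def put(key, value, ttl_ms) do
    expires_at = System.monotonic_time(:millisecond) + ttl_ms
    true = :ets.insert(@table, {key, value, expires_at})
    :ok
  end

  # Returns {:ok, value} for a live entry; expired entries are deleted.
  def get(key) do
    now = System.monotonic_time(:millisecond)

    case :ets.lookup(@table, key) do
      [{^key, value, expires_at}] when expires_at > now ->
        {:ok, value}

      [{^key, _value, _expires_at}] ->
        :ets.delete(@table, key)
        :miss

      [] ->
        :miss
    end
  end
end
```

Lazy eviction keeps the sketch short but lets entries that are never read again stay in memory; a periodic sweep would be one way to bound the table size.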
Phase 11: Production Features
Status: Not Started
Goals:
- Rate limiting per agent/client
- Cost tracking and budget controls
- A/B testing framework for prompts
- Audit logging for compliance
Key Features:
- Normandy.RateLimit - Per-agent/client throttling
- Normandy.CostTracking - Token/cost monitoring
- Normandy.Experimentation - A/B testing for prompts
- Normandy.Audit - Compliance logging
Production Readiness:
- Configurable rate limits (per second/minute/hour)
- Budget alerts and hard stops
- Prompt variant testing framework
- Immutable audit trail for sensitive operations
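One common way to implement configurable rate limits is a token bucket. The sketch below is an assumption about a possible approach, not the planned Normandy.RateLimit design; `RateLimitSketch` is a hypothetical, purely functional module in which the caller supplies the clock.

```elixir
defmodule RateLimitSketch do
  defstruct [:capacity, :refill_per_ms, :tokens, :last_ms]

  # Allows `capacity` requests per `interval_ms`, starting with a full bucket.
  def new(capacity, interval_ms, now_ms) do
    %__MODULE__{
      capacity: capacity * 1.0,
      refill_per_ms: capacity / interval_ms,
      tokens: capacity * 1.0,
      last_ms: now_ms
    }
  end

  # Refills the bucket for the elapsed time, then tries to spend one token.
  def take(%__MODULE__{} = bucket, now_ms) do
    elapsed = max(now_ms - bucket.last_ms, 0)
    tokens = min(bucket.capacity, bucket.tokens + elapsed * bucket.refill_per_ms)
    refilled = %{bucket | tokens: tokens, last_ms: max(now_ms, bucket.last_ms)}

    if tokens >= 1.0 do
      {:ok, %{refilled | tokens: tokens - 1.0}}
    else
      {:error, :rate_limited, refilled}
    end
  end
end

# Two requests per second: the third immediate request is rejected.
bucket = RateLimitSketch.new(2, 1_000, 0)
{:ok, bucket} = RateLimitSketch.take(bucket, 0)
{:ok, bucket} = RateLimitSketch.take(bucket, 0)
{:error, :rate_limited, bucket} = RateLimitSketch.take(bucket, 100)
{:ok, _bucket} = RateLimitSketch.take(bucket, 1_100)
```

Passing the timestamp in keeps the logic deterministic and easy to test; a per-agent limiter could hold one such bucket per agent ID inside a GenServer or an ETS table.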
Phase 12: Developer Experience
Status: Not Started
Goals:
- Mix tasks for common operations
- Development console/REPL enhancements
- Code generation for schemas
- Testing utilities and factories
Key Features:
- mix normandy.gen.agent - Agent scaffolding
- mix normandy.gen.tool - Tool function generator
- mix normandy.console - Interactive REPL
- Normandy.Factory - Test data factories
DX Improvements:
- CLI for agent creation and testing
- Interactive prompt development
- Schema code generation from examples
- Comprehensive test helpers
Development Guidelines
Testing Requirements
- Minimum 80% code coverage
- Integration tests for all coordination patterns
- Property-based tests for complex logic
- Performance benchmarks for critical paths
Documentation Standards
- Moduledoc for all public modules
- @doc for all public functions
- Examples in docstrings
- Integration guides for new features
Commit Message Format
<type>: <subject>
<body>
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Progress Tracking
| Phase | Status | Tests | Modules | Completion Date |
|---|---|---|---|---|
| 1-7 | ✅ Complete | 304 | ~30 | 2025-10-26 |
| 8 | ✅ Complete | 380 | 38 | 2025-10-26 |
| 8.5 | ✅ Complete | 480 (380+100 integration) | 39 | 2025-10-26 |
| 8.6 | ✅ Complete | 499 (443+56 integration) | 40 | 2025-10-26 |
| 9 | 📋 Planned | - | - | - |
| 10 | 📋 Planned | - | - | - |
| 11 | 📋 Planned | - | - | - |
| 12 | 📋 Planned | - | - | - |
Notes
- All phases should leverage Elixir/OTP primitives where applicable
- Maintain backwards compatibility within major versions
- Prioritize production-readiness and fault tolerance
- Document performance characteristics and tradeoffs