Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[0.7.9] - 2025-11-16
Added
- Gateway Supervision Refactoring: Implemented proper OTP supervision tree
- New 3-tier architecture: macula_gateway_sup (root) supervises macula_gateway_quic_server, macula_gateway, macula_gateway_workers_sup
- Added macula_gateway_quic_server.erl - Dedicated QUIC transport layer (248 LOC, 17 tests)
- Added macula_gateway_workers_sup.erl - Supervises business logic workers (152 LOC, 24 tests)
- Added macula_gateway_clients.erl - Renamed from macula_gateway_client_manager (clearer naming)
- Circular dependency resolution via set_gateway/2 callback pattern
- rest_for_one supervision strategy for controlled fault isolation
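As a rough illustration of the 3-tier tree described above, a root supervisor using rest_for_one could look like the following minimal sketch. The module name gateway_sup_sketch, the child start order, the start_link/1 argument shape, and the restart intensity values are all assumptions made for illustration; they are not taken from the project's source.

```erlang
%% Hypothetical sketch only: module name, child order, start arguments and
%% restart intensity are assumptions, not the project's actual code.
-module(gateway_sup_sketch).
-behaviour(supervisor).
-export([start_link/1, init/1, child_ids/1]).

start_link(Opts) ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, Opts).

init(Opts) ->
    %% rest_for_one: when a child terminates, that child and every child
    %% started after it are restarted; earlier children keep running.
    SupFlags = #{strategy => rest_for_one, intensity => 10, period => 60},
    Children = [
        #{id => macula_gateway_quic_server,
          start => {macula_gateway_quic_server, start_link, [Opts]},
          restart => permanent, type => worker},
        #{id => macula_gateway,
          start => {macula_gateway, start_link, [Opts]},
          restart => permanent, type => worker},
        #{id => macula_gateway_workers_sup,
          start => {macula_gateway_workers_sup, start_link, [Opts]},
          restart => permanent, type => supervisor}
    ],
    {ok, {SupFlags, Children}}.

%% Helper for inspecting the spec: returns child ids in start order.
child_ids({ok, {_SupFlags, Children}}) ->
    [maps:get(id, Child) || Child <- Children].
```

With this ordering, a crash of the QUIC server would restart all three children, while a crash of the workers supervisor would restart only itself.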
Changed
- Gateway Architecture: Refactored from manual process management to supervised architecture
- Gateway now finds siblings via supervisor instead of starting them manually
- Simplified macula_gateway init/1 - uses find_parent_supervisor/0 and find_sibling/2
- Removed manual lifecycle management - supervisor handles cleanup
- Updated macula_gateway_sup.erl to be root supervisor (was workers supervisor)
- All gateway tests updated for new supervision tree (106 tests, 0 failures)
Fixed
- CRITICAL: Gateway now actually USES DHT-routed pub/sub (v0.7.8 had the code but wasn't calling it!)
- Bug: Gateway's handle_publish was still using v0.7.7 endpoint-based routing
- Impact: v0.7.8 protocol infrastructure existed but gateway bypassed it entirely
- Root cause: handle_publish (macula_gateway.erl:885-943) never called macula_pubsub_routing
- Solution: Rewrote handle_publish to use macula_pubsub_routing:wrap_publish and send via pubsub_route messages
- Flow: Gateway now queries DHT for node_id (not endpoint), wraps PUBLISH in pubsub_route, sends via mesh connection manager
- Result: Messages now actually route via multi-hop Kademlia DHT to remote subscribers
- Fixed test failures in macula_connection_tests - replaced invalid connected message type with subscribe
- Fixed edoc warning in macula_gateway_sup.erl - replaced markdown code fence with HTML pre tags for proper documentation generation
Improved
- Fault Tolerance: Automatic recovery from gateway/QUIC/worker crashes
- Production Stability: Proper OTP supervision with configurable restart strategies
- Code Organization: Clean separation between transport (QUIC), coordination (gateway), and business logic (workers)
- Testability: Each module tested independently with comprehensive coverage
Technical Details
- v0.7.8 added pubsub_route protocol + routing modules but gateway never used them
- v0.7.9 integrates the v0.7.8 infrastructure into gateway's publish flow
- This completes the DHT-routed pub/sub implementation started in v0.7.8
- Supervision refactoring provides +2/10 scalability improvement (foundational infrastructure)
- Enables future optimizations: process pools, connection pooling, horizontal scaling
[0.7.8] - 2025-11-16
Fixed
- CRITICAL: Implemented multi-hop DHT routing for pub/sub to fix matchmaking
- Bug: v0.7.7 gateway queried DHT but routed to endpoints, which failed for NAT peers
- Impact: Matchmaking still broken - messages couldn't reach subscribers behind NAT
- Root cause: Split-brain architecture - subscribers register locally but routing via gateway
- Solution: Multi-hop Kademlia DHT routing (same pattern as RPC routing)
Added
Protocol Layer: New pubsub_route message type (0x13)
- Wraps PUBLISH messages for multi-hop routing through mesh
- Fields: destination_node_id, source_node_id, hop_count, max_hops, topic, payload
- Protocol encoder/decoder support with validation
- 8 encoder tests + 3 decoder tests added
Routing Module: macula_pubsub_routing.erl (NEW - 115 LOC)
- Stateless routing logic for pub/sub messages
- wrap_publish/4 - Wraps PUBLISH in routing envelope
- route_or_deliver/3 - Routes to next hop or delivers locally
- should_deliver_locally/2 - Checks if destination matches
- TTL protection via max_hops (default: 10)
- 14 comprehensive tests (all passing)
Gateway Integration: Enhanced macula_gateway.erl
- Added handle_decoded_message clause for pubsub_route messages
- Routes via XOR distance to next hop OR delivers locally
- handle_pubsub_route_deliver/2 - Unwraps and delivers to local subscribers
- forward_pubsub_route/3 - Forwards to next hop through mesh
Pub/Sub Handler: Updated macula_pubsub_dht.erl
- route_to_subscribers/5 now uses actual DHT routing (was TODO stub)
- Extracts subscriber node_id (not endpoint) from DHT results
- Wraps PUBLISH in pubsub_route envelope
- Sends via connection manager which routes through gateway
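The envelope and routing decision listed above can be sketched roughly as follows. This is a stand-alone illustration, not the contents of macula_pubsub_routing.erl: the module name pubsub_route_sketch, the argument order of each function, the map representation of the envelope, the return tuples, and the NextHopFun callback are all assumptions. Only the field names, the function names and arities, and the default max_hops of 10 come from the changelog.

```erlang
%% Hypothetical sketch: argument order, envelope representation and return
%% values are assumptions. Field names and the max_hops default follow the
%% changelog entry.
-module(pubsub_route_sketch).
-export([wrap_publish/4, should_deliver_locally/2, route_or_deliver/3]).

-define(DEFAULT_MAX_HOPS, 10).

%% Wrap a PUBLISH in a routing envelope with a fresh hop counter.
wrap_publish(SourceNodeId, DestNodeId, Topic, Payload) ->
    #{destination_node_id => DestNodeId,
      source_node_id => SourceNodeId,
      hop_count => 0,
      max_hops => ?DEFAULT_MAX_HOPS,
      topic => Topic,
      payload => Payload}.

%% True when the envelope is addressed to this node.
should_deliver_locally(LocalNodeId, #{destination_node_id := Dest}) ->
    LocalNodeId =:= Dest.

%% NextHopFun is an assumed callback: fun(DestNodeId) -> {ok, NodeId} | none.
route_or_deliver(LocalNodeId, Msg, NextHopFun) ->
    case should_deliver_locally(LocalNodeId, Msg) of
        true  -> {deliver, maps:get(topic, Msg), maps:get(payload, Msg)};
        false -> forward(Msg, NextHopFun)
    end.

%% TTL protection: refuse to forward once hop_count reaches max_hops.
forward(#{hop_count := Hops, max_hops := Max}, _NextHopFun) when Hops >= Max ->
    {error, max_hops_exceeded};
forward(#{destination_node_id := Dest, hop_count := Hops} = Msg, NextHopFun) ->
    case NextHopFun(Dest) of
        {ok, NextHop} -> {forward, NextHop, Msg#{hop_count := Hops + 1}};
        none          -> {error, no_route}
    end.
```

Keeping the functions free of process state mirrors the "stateless routing logic" bullet: the caller supplies the local node ID and the next-hop lookup.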
Technical Details
v0.7.7 Architecture (BROKEN):
- ❌ Publisher queries DHT for subscriber endpoints
- ❌ Tries to route directly to endpoints
- ❌ Fails for NAT peers (can't accept connections)
- ❌ Matchmaking stuck on "Looking for opponent..."
v0.7.8 Architecture (FIXED):
- ✅ Publisher queries DHT for subscriber node IDs
- ✅ Wraps PUBLISH in pubsub_route envelope
- ✅ Routes via multi-hop Kademlia (same as RPC)
- ✅ Works with relay OR direct connections
- ✅ Matchmaking succeeds across NAT peers
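For readers unfamiliar with Kademlia, "routing via multi-hop Kademlia" means each hop forwards to the known peer whose node ID is closest to the destination under the XOR metric. A minimal sketch of that selection step follows; the module name xor_next_hop_sketch and the representation of peers as a plain list of equal-length binary IDs are assumptions, and the project's real routing table is not shown here.

```erlang
%% Hypothetical sketch of XOR-closest peer selection. The peer list is an
%% assumed stand-in for a real k-bucket routing table.
-module(xor_next_hop_sketch).
-export([distance/2, closest/2]).

%% XOR distance between two equal-length binary node ids, as an integer.
distance(A, B) when byte_size(A) =:= byte_size(B) ->
    binary:decode_unsigned(A) bxor binary:decode_unsigned(B).

%% Pick the peer whose id has the smallest XOR distance to Target.
closest(_Target, []) ->
    none;
closest(Target, Peers) ->
    [{_Distance, Best} | _] =
        lists:sort([{distance(Target, Peer), Peer} || Peer <- Peers]),
    {ok, Best}.
```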
Message Flow:
Publisher                 Gateway              Node A                 Subscriber
|                         |                    |                      |
|--pubsub_route---------->|                    |                      |
|  dest: Subscriber       |--pubsub_route----->|                      |
|  topic: "matchmaking"   |  (forward closer)  |--pubsub_route------->|
|  payload: {msg}         |                    |                      |
|                         |                    |                      | Deliver locally
Tests
- Protocol encoder: 49 tests (8 new for pubsub_route)
- Protocol decoder: 35 tests (3 new for pubsub_route)
- Pub/sub routing: 14 tests (all passing)
- wrap_publish envelope creation
- should_deliver_locally checks
- route_or_deliver decision logic
- TTL exhaustion handling
- No-route error handling
Architecture Documentation
- Added architecture/dht_routed_pubsub.md with complete design
- Future refactoring note: Consider unifying RPC and pub/sub routing modules (nearly identical logic)
This completes the DHT-routed pub/sub implementation and should enable working matchmaking.
[0.7.7] - 2025-11-15
Fixed
- CRITICAL: Gateway pub/sub now queries DHT for remote subscribers
- Bug: Gateway only checked local subscriptions, never queried DHT for remote subscribers
- Impact: Distributed pub/sub and matchmaking completely broken - remote peers couldn't receive messages
- Root cause: handle_publish only called macula_gateway_pubsub:get_subscribers (local streams only)
- Fix Phase 1: Added endpoint → stream PID tracking in macula_gateway_client_manager
- New state field: endpoint_to_stream :: #{binary() => pid()}
- New API: get_stream_by_endpoint/2
- Updated store_client_stream/4 to track endpoints
- Updated remove_client/2 to clean up endpoint mappings
- Fix Phase 2: Modified handle_publish to query DHT
- Queries local subscribers (existing behavior)
- Queries DHT for remote subscribers via crypto:hash(sha256, Topic)
- Converts remote endpoints to stream PIDs using client_manager
- Combines local + remote and delivers to all
- Fix Phase 3: Added macula_gateway_dht:lookup_value/1
- Synchronous lookup from local DHT storage
- Calls macula_routing_server:find_value/3 with K=20
- Returns {ok, [Subscriber]} or {error, not_found}
- Tests: 90 tests passing (39 client_manager + 49 gateway + 7 endpoint + 5 pub/sub DHT)
This completes the distributed pub/sub fix and enables working matchmaking across multiple peer containers.
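The three fix phases can be pictured with a small stand-alone sketch: hash the topic to obtain the DHT key, resolve remote endpoints through an endpoint-to-stream map, and merge the result with local subscribers. The module name gateway_publish_sketch and all function names below are hypothetical; only crypto:hash(sha256, Topic) and the shape of the endpoint_to_stream map come from the entry above.

```erlang
%% Hypothetical sketch of the v0.7.7 publish flow. Function names are
%% illustrative; the real gateway code is not reproduced here.
-module(gateway_publish_sketch).
-export([topic_key/1, resolve_remote/2, recipients/3]).

%% DHT key for a topic: the 32-byte SHA-256 digest of the topic binary.
topic_key(Topic) when is_binary(Topic) ->
    crypto:hash(sha256, Topic).

%% Map remote subscriber endpoints to stream pids, skipping endpoints
%% that have no tracked stream.
resolve_remote(Endpoints, EndpointToStream) ->
    [maps:get(Endpoint, EndpointToStream)
     || Endpoint <- Endpoints, maps:is_key(Endpoint, EndpointToStream)].

%% Combine local and remote stream pids, removing duplicates.
recipients(LocalPids, RemoteEndpoints, EndpointToStream) ->
    lists:usort(LocalPids ++ resolve_remote(RemoteEndpoints, EndpointToStream)).
```

Deduplicating matters in this sketch because a peer connected to the gateway may appear both as a local subscriber and in the DHT results.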
Technical Details
Before v0.7.7:
- ❌ Gateway only queried macula_gateway_pubsub (local subscriptions)
- ❌ Remote subscribers stored in DHT but never looked up
- ❌ Pub/sub messages only delivered to local streams
- ❌ Multi-peer matchmaking broken
After v0.7.7:
- ✅ Gateway queries both local + DHT for subscribers
- ✅ Remote endpoints resolved to stream PIDs via endpoint tracking
- ✅ Messages delivered to all subscribers (local + remote)
- ✅ Multi-peer matchmaking works correctly
The architecture remains hub-and-spoke (v0.7.x):
- All peers connect to gateway
- Gateway routes all pub/sub messages
- Subscriptions stored in DHT for discovery
- Gateway has stream PIDs for all connected peers
[0.8.0] - TBD (Q2 2025)
Planned - True Mesh Architecture
- BREAKING: Opportunistic NAT hole punching for direct peer-to-peer connections
- 80% direct P2P connections (cone NAT, no firewall)
- 20% gateway relay fallback (symmetric NAT, strict firewalls)
- True mesh topology (no single point of failure)
- New modules: macula_nat_discovery, macula_hole_punch, macula_connection_upgrade
- Backward compatible with v0.7.x gateway relay architecture
This will transform Macula from hub-and-spoke (star topology) to true decentralized mesh.
See architecture/NAT_TRAVERSAL_ROADMAP.md for complete design.
[0.7.6] - 2025-11-15
Fixed
- CRITICAL: Disabled QUIC transport-layer idle timeout causing connection closures
- Root cause: MsQuic default idle timeout of 30 seconds (2x = 60s to closure)
- v0.7.4-0.7.5 application-level PING/PONG worked but didn't reset QUIC transport timer
- Added idle_timeout_ms => 0 to both client connection and gateway listener options
- Setting to 0 disables QUIC idle timeout entirely
- Connections now stay alive indefinitely (application PING/PONG provides health checks)
- Modified: macula_quic:connect/4 and macula_quic:listen/2
This completes the connection stability fix started in v0.7.4-0.7.5.
Tests
- Added test/macula_quic_idle_timeout_tests.erl with 7 tests
- Client connection idle timeout configuration
- Gateway listener idle timeout configuration
- Option structure and value validation
- Defense-in-depth architecture documentation
Technical Details
Defense in Depth approach:
- Transport Layer (v0.7.6): QUIC idle timeout disabled (idle_timeout_ms => 0)
- Application Layer (v0.7.4-0.7.5): PING/PONG keep-alive every 30 seconds
- Result: Connections stay alive + health monitoring
Previous versions had application keep-alive but QUIC transport still enforced 30s idle timeout independently.
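One way to apply the transport-layer setting consistently is to force the option just before the options map is handed to the QUIC layer. The sketch below assumes options are passed as a map; the module name quic_opts_sketch and the helper itself are hypothetical. Only the idle_timeout_ms => 0 key and value come from this entry.

```erlang
%% Hypothetical helper: assumes QUIC options are carried in a map.
-module(quic_opts_sketch).
-export([with_idle_timeout_disabled/1]).

%% Force idle_timeout_ms to 0 (disabled), overriding any caller value,
%% while leaving every other option untouched.
with_idle_timeout_disabled(Opts) when is_map(Opts) ->
    Opts#{idle_timeout_ms => 0}.
```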
[0.7.5] - 2025-11-15
Fixed
- CRITICAL: Gateway PING message handler missing, preventing keep-alive from working
- v0.7.4 implemented keep-alive on edge peer side only
- Gateway had no handler for incoming PING messages
- Result: PINGs sent but never acknowledged, connections still timed out after 2 minutes
- Added handle_decoded_message({ok, {ping, PingMsg}}, ...) to gateway
- Gateway now responds with PONG to all incoming PING messages
- Keep-alive now works bidirectionally (edge peer ↔ gateway)
- Also added PONG message handler to gateway for completeness
This completes the keep-alive implementation started in v0.7.4.
Technical Details
The keep-alive flow now works correctly:
- Edge peer timer fires every 30 seconds (configurable)
- Edge peer sends PING to gateway
- Gateway receives PING and responds with PONG (new in v0.7.5)
- Edge peer receives PONG confirmation
- QUIC connection stays alive (no idle timeout)
Without this fix, PINGs were sent but ignored, causing connections to time out despite v0.7.4's implementation.
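The gateway-side handling can be sketched as a pair of function clauses. This is an illustration under assumptions: the real handle_decoded_message takes further arguments that the changelog elides, so the second parameter here (SendFun, a callback that writes a message to the stream) is hypothetical, as is echoing the PING body back in the PONG.

```erlang
%% Hypothetical sketch: SendFun and the echoed PONG body are assumptions.
-module(keepalive_sketch).
-export([handle_decoded_message/2]).

%% Answer every PING with a PONG carrying the same body.
handle_decoded_message({ok, {ping, PingMsg}}, SendFun) ->
    SendFun({pong, PingMsg});
%% A PONG needs no reply; receiving it is the confirmation.
handle_decoded_message({ok, {pong, _PongMsg}}, _SendFun) ->
    ok;
%% Anything else is outside the scope of this sketch.
handle_decoded_message(_Other, _SendFun) ->
    ignored.
```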
[0.7.4] - 2025-11-15
Fixed
- CRITICAL: Configurable keep-alive mechanism to prevent QUIC connection timeouts
- PING/PONG message support in macula_connection
- Default keep-alive interval: 30 seconds (configurable)
- Keep-alive enabled by default (can be disabled via options)
- Automatic PONG response to incoming PING messages
- Configuration via macula_connection:default_config/0
- Prevents 2-minute connection timeout that broke distributed matchmaking
- Added 6 tests for keep-alive functionality (all passing)
This is a critical fix for production deployments where QUIC connections time out after ~2 minutes of inactivity, breaking pub/sub and matchmaking.
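A keep-alive timer of this kind is commonly built on erlang:send_after/3. The sketch below shows only the scheduling decision; the module name keepalive_timer_sketch and the send_ping message are hypothetical, while the option keys keepalive_enabled and keepalive_interval and their defaults (enabled, 30 seconds) follow this entry.

```erlang
%% Hypothetical sketch of keep-alive scheduling. The send_ping message
%% name is an assumption; option keys and defaults follow the changelog.
-module(keepalive_timer_sketch).
-export([schedule/1]).

%% Arm a one-shot timer that delivers send_ping to the calling process,
%% or report that keep-alive is disabled.
schedule(Opts) ->
    case maps:get(keepalive_enabled, Opts, true) of
        true ->
            Interval = maps:get(keepalive_interval, Opts, 30000),
            {ok, erlang:send_after(Interval, self(), send_ping)};
        false ->
            disabled
    end.
```

A process using this would call schedule/1 again each time it handles send_ping, so the timer repeats for as long as the connection lives.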
Configuration
Enable/disable keep-alive:
%% Enable with custom interval (milliseconds)
Opts = #{
    keepalive_enabled => true,
    keepalive_interval => 30000  %% 30 seconds
}.
%% Disable keep-alive (separate variable name, since Erlang variables cannot be rebound)
DisabledOpts = #{
    keepalive_enabled => false
}.
%% Use defaults (enabled, 30 second interval)
DefaultConfig = macula_connection:default_config().
Architecture Note
v0.7.4 maintains hub-and-spoke (star) topology:
- Edge peers connect to gateway (not each other)
- Gateway routes all messages (relay architecture)
- Gateway is single point of failure (by design for now)
- DHT routing table exists but routing happens at gateway
- True peer-to-peer mesh deferred to v0.8.0 (NAT traversal required)
[0.7.3] - 2025-11-15
Fixed
- CRITICAL: Fixed DHT routing table address serialization crash in macula_gateway_dht
- Bug: Gateway stored parsed address tuples {{127,0,0,1}, 9443} in DHT instead of binary strings
- Impact: When FIND_VALUE replies tried to serialize node addresses, msgpack returned error {:error, {:badarg, {{127,0,0,1}, 9443}}}
- Root cause: macula_gateway.erl:522 used Address (tuple from parse_endpoint/1) instead of Endpoint (binary string)
- Error chain: DHT stored tuples → encode_node_info extracted tuples → msgpack:pack failed → byte_size crashed
- Symptoms: Gateway crashed with "ArgumentError: 1st argument not a bitstring" when peers queried DHT
- Fix: Store original Endpoint binary string in DHT routing table instead of parsed tuple
- Added test: dht_address_serialization_test documents bug and validates fix
This is a critical fix for distributed matchmaking and service discovery. Without it, DHT queries crash the gateway.
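The actual fix stores the original Endpoint binary. As a complementary illustration of why tuples must not reach the serializer, the sketch below normalizes either form to a binary before it is stored. The module name dht_address_sketch, the function to_wire_address/1, and the "host:port" output format are assumptions, not the project's API.

```erlang
%% Hypothetical defensive helper: guarantees a binary address before it
%% is stored in the DHT. The "Ip:Port" format is an assumption.
-module(dht_address_sketch).
-export([to_wire_address/1]).

%% Already a binary endpoint: store as-is.
to_wire_address(Endpoint) when is_binary(Endpoint) ->
    Endpoint;
%% Parsed {Ip, Port} tuple: render it back to text so it can be packed.
to_wire_address({Ip, Port}) when is_tuple(Ip), is_integer(Port) ->
    iolist_to_binary([inet:ntoa(Ip), $:, integer_to_list(Port)]).
```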
[0.7.2] - 2025-11-15
Fixed
- CRITICAL: Fixed gateway crash in parse_endpoint/1 when DNS resolution fails
- Bug: inet:getaddr/2 error tuple was not handled, causing ArgumentError when passed to byte_size/1
- Impact: Gateway crashed repeatedly, closing all client connections and preventing pub/sub communication
- Symptoms: "Failed to publish to topic: :closed", "Failed to send STORE for subscription: :closed"
- Fix: Added proper error handling with localhost fallback when DNS resolution fails
- Now returns {{127,0,0,1}, Port} fallback instead of crashing
This is a critical fix for production deployments where endpoint DNS resolution may fail.
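The error-handling pattern described above amounts to matching both results of inet:getaddr/2. A minimal sketch, assuming the host and port have already been extracted from the endpoint; the module name parse_endpoint_sketch and the function resolve/2 are hypothetical, and the real parse_endpoint/1 takes a single argument that is not reproduced here.

```erlang
%% Hypothetical sketch of the fallback. resolve/2 is an illustrative
%% helper, not the project's parse_endpoint/1.
-module(parse_endpoint_sketch).
-export([resolve/2]).

%% Resolve Host to an IPv4 address, falling back to localhost when
%% resolution fails instead of letting the error tuple propagate.
resolve(Host, Port) when is_list(Host), is_integer(Port) ->
    case inet:getaddr(Host, inet) of
        {ok, Ip}         -> {Ip, Port};
        {error, _Reason} -> {{127,0,0,1}, Port}
    end.
```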
[0.7.1] - 2025-11-15
Fixed
- CRITICAL: Fixed ArithmeticError in macula_pubsub_handler message ID handling
- Bug: Was assigning binary MsgId to counter instead of integer NewCounter
- Impact: Caused pub/sub to crash on second publish attempt with "bad argument in arithmetic expression"
- Fix: Corrected destructuring in line 300 to use {_MsgId, NewCounter} instead of {MsgIdCounter, _}
- Now properly increments integer counter instead of trying to do arithmetic on binary
This is a critical fix for anyone using pub/sub functionality in v0.7.0.
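The bug class is easy to reproduce with any function that returns a {MsgId, NewCounter} pair: keeping the first element as the counter works once and then fails on the next increment. A minimal sketch, where the module name msg_id_sketch, the function next_message_id/1, and the decimal message ID format are assumptions made for illustration:

```erlang
%% Hypothetical sketch of an id generator that returns both a binary id
%% and the updated integer counter.
-module(msg_id_sketch).
-export([next_message_id/1]).

%% Returns {MsgId, NewCounter}. Callers must keep NewCounter (an integer)
%% as their state; keeping MsgId (a binary) makes the next call fail
%% because binaries cannot be used in arithmetic.
next_message_id(Counter) when is_integer(Counter) ->
    NewCounter = Counter + 1,
    MsgId = integer_to_binary(NewCounter),
    {MsgId, NewCounter}.
```

A correct caller destructures as {_MsgId, NewCounter} when it only needs the updated state, which is the shape of the fix described above.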
0.7.0 - 2025-11-15
Changed
- BREAKING: Major nomenclature refactoring for clarity and industry alignment
- Renamed macula_connection → macula_peer (mesh participant facade - high-level API)
- Renamed macula_connection_manager → macula_connection (QUIC transport layer - low-level)
- Follows industry standards used by libp2p, IPFS, and BitTorrent
- Clear separation: macula_peer = mesh participant, macula_connection = transport
Added
- Comprehensive transport layer test coverage (36 tests total)
- 11 new tests for message decoding, buffering, URL parsing, and realm normalization
- All tests passing with zero regressions
- Complete v0.7.0 documentation in CLAUDE.md
- Migration guide with specific API examples
- Architecture rationale and benefits
- Status tracking for implementation phases
Migration Guide (0.6.x → 0.7.0)
API Changes:
All high-level mesh operations now use macula_peer instead of macula_connection:
%% Before (0.6.x)
{ok, Client} = macula_connection:start_link(Url, Opts).
ok = macula_connection:publish(Client, Topic, Data).
{ok, SubRef} = macula_connection:subscribe(Client, Topic, Callback).
{ok, Result} = macula_connection:call(Client, Procedure, Args).
%% After (0.7.0)
{ok, Client} = macula_peer:start_link(Url, Opts).
ok = macula_peer:publish(Client, Topic, Data).
{ok, SubRef} = macula_peer:subscribe(Client, Topic, Callback).
{ok, Result} = macula_peer:call(Client, Procedure, Args).
Why This Change?
The original naming was confusing:
- ❌ macula_connection served both facade AND transport roles
- ❌ Mixed high-level mesh operations with low-level QUIC handling
- ❌ Not aligned with P2P industry standards
After v0.7.0:
- ✅ macula_peer = mesh participant (clear high-level API for pub/sub, RPC, DHT)
- ✅ macula_connection = QUIC transport (clear low-level transport layer)
- ✅ Follows libp2p/IPFS/BitTorrent naming conventions
Note: The macula_client wrapper module has been updated to use macula_peer internally, so if you're using macula_client, no changes are required.
0.6.7 - 2025-11-15
Fixed
- CRITICAL: Fixed all installation examples to use Hex package references instead of git dependencies
- README.md: Changed from git-based to {:macula, "~> 0.6"} (Elixir) and {macula, "0.6.7"} (Erlang)
- HELLO_WORLD.md: Updated to use proper Hex package format
- architecture/macula_http3_mesh_hello_world.md: Fixed tutorial installation examples
- architecture/macula_http3_mesh_rpc_guide.md: Fixed migration guide examples
- All code examples now show proper Hex.pm installation for published package
[0.6.6] - 2025-11-15
Fixed
- Fixed navigation links in documentation guides to use ex_doc HTML filenames
- Changed GitHub-style relative paths (../README.md) to ex_doc HTML references (readme.html)
- Fixed all navigation links in EXECUTIVE_SUMMARY.md, COMPARISONS.md, USE_CASES.md, and DEVELOPMENT.md
- Links now work correctly in published Hexdocs without "page not found" errors
[0.6.5] - 2025-11-15
Changed
- Updated to modern alternative logo (macula-alt-logo.svg) in both README.md and ex_doc
- Changed tutorial greeting to brand-specific "Hello, Macula!" instead of generic greeting
Fixed
- Replaced old color logo with cleaner, more modern alternative logo for better visual appeal
[0.6.4] - 2025-11-15
Changed
- Documentation restructuring - Split README.md into focused landing page with table of contents
- Created docs/EXECUTIVE_SUMMARY.md - Why Macula and the case for decentralization
- Created docs/COMPARISONS.md - How Macula compares to libp2p, Distributed Erlang, Akka, etc.
- Created docs/USE_CASES.md - Real-world applications across business, IoT, and AI domains
- Created docs/DEVELOPMENT.md - Complete development guide and coding standards
- README.md now serves as concise landing page (119 lines vs 372 lines)
- All detailed content accessible via clear table of contents
- Removed Mermaid diagram from README.md (ex_doc doesn't support Mermaid - works on GitHub)
Fixed
- ex_doc landing page uses HELLO_WORLD.md (tutorial-first approach, no multi-page split)
- Documentation properly links to all new guide documents
- Better first impression for Hex.pm users (logo, quick navigation)
[0.6.3] - 2025-11-15
Fixed
- Removed README.md from ex_doc extras to prevent multi-page split and broken landing page
- Documentation now correctly redirects to API reference page
[0.6.2] - 2025-11-15
Fixed
- ex_doc landing page configuration ({main, "api-reference"}) - resolved "readme.html not found" error
[0.6.1] - 2025-11-15
Added
- Professional documentation structure for Hex publication
- Architecture diagram in README.md (Mermaid format) showing mesh topology
- Organized documentation: moved 50+ files from root to docs/archive/, docs/development/, docs/planning/
- Created docs/README.md navigation index
- Logo and assets configuration for ex_doc
- Comprehensive Hex package file list (artwork/, docs/, architecture/)
Fixed
- README.md badge rendering (moved badges outside <div> tag for proper GitHub display)
- ex_doc assets configuration (deprecated warning resolved)
- ex_doc landing page configuration (changed {main, "readme"} to {main, "api-reference"} to fix "readme.html not found" error)
- Hex package configuration to include all necessary assets and documentation
- Documentation organization for professional first impression
0.6.0 - 2025-11-15
Changed
- BREAKING: Renamed environment variable from GATEWAY_REALM to MACULA_REALM for better API consistency
- All MACULA_* environment variables now follow consistent naming
- Applies to both gateway mode and edge peer mode
- Update your deployment configurations to use MACULA_REALM instead of GATEWAY_REALM
Added
- Comprehensive Kademlia DHT architecture documentation (docs/KADEMLIA_DHT_ARCHITECTURE.md)
- XOR distance metric explanation
- K-bucket routing table details
- DHT operations (PING, STORE, FIND_NODE, FIND_VALUE)
- Iterative lookup algorithm
- Macula-specific adaptations (realm-scoped DHT, HTTP/3 transport)
- Performance characteristics and comparisons
Fixed
- Updated documentation to reflect MACULA_REALM environment variable usage
- Updated entrypoint.sh, Dockerfile.gateway, and config/sys.config to use MACULA_REALM
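For Erlang code that reads the realm at runtime, the renamed variable can be looked up with os:getenv/1, which returns false when the variable is unset. The sketch below is illustrative only: the module name realm_env_sketch, the function realm/0, and returning the realm as a binary (or undefined when unset) are assumptions rather than the project's behaviour.

```erlang
%% Hypothetical sketch of reading the renamed environment variable.
-module(realm_env_sketch).
-export([realm/0]).

%% Returns the realm as a binary, or undefined when MACULA_REALM is unset.
realm() ->
    case os:getenv("MACULA_REALM") of
        false -> undefined;
        Value -> list_to_binary(Value)
    end.
```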
Upcoming in v0.7.0
- Architecture improvement: Separation of macula_connection into macula_peer (high-level mesh API) and macula_connection (low-level QUIC transport)
- See docs/NOMENCLATURE_PROPOSAL_CONNECTION_TO_PEER.md and docs/PEER_CONNECTION_SEPARATION_PLAN.md for details
- Expected timeline: 4-5 weeks after v0.6.0 release
Migration Guide (0.5.0 → 0.6.0)
If you're using Macula in gateway mode or configuring realm multi-tenancy:
Before (0.5.0):
export GATEWAY_REALM=my-app
After (0.6.0):
export MACULA_REALM=my-app
Elixir/Phoenix runtime.exs:
# Before (0.5.0)
System.put_env("GATEWAY_REALM", realm)
# After (0.6.0)
System.put_env("MACULA_REALM", realm)
0.5.0 - 2025-11-14
Added
- Initial public release
- HTTP/3 (QUIC) mesh networking platform
- Gateway mode for accepting incoming connections
- Edge peer mode for mesh participation
- Multi-tenancy via realm isolation
- Pub/Sub messaging with wildcard support
- RPC (request/response) patterns
- Service discovery and advertisement
- mDNS local discovery support
- Process registry via gproc
- Comprehensive documentation
Known Issues
- Gateway mode requires proper TLS certificate configuration
- Certificates must have Subject Alternative Name (SAN) extension
- Docker deployments require proper file ownership (--chown=app:app)