Macula Performance Optimization Guide
Overview
Macula implements several performance optimizations to achieve high-throughput pub/sub messaging over distributed DHT routing. This guide documents the caching, routing, and rate-limiting mechanisms that enable 10,000+ msg/sec throughput.
Key Modules:
- macula_subscriber_cache - Topic→Subscribers caching with TTL
- macula_direct_routing - NodeId→Endpoint direct routing table
- macula_gateway_pubsub_router - Optimized message distribution
Performance Architecture
┌─────────────────────────────────────────────────────────────────────────┐
│ PubSub Message Flow │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ Publisher │
│ │ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ macula:publish() │ │
│ └────────┬─────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ Subscriber Cache Layer │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ macula_subscriber_cache │ │ │
│ │ │ • ETS-backed O(1) lookup │ │ │
│ │ │ • TTL-based expiration (default: 5s) │ │ │
│ │ │ • Rate-limiting (default: 2s between DHT queries) │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ └────────┬───────────────────────────────────────────┬───────────┘ │
│ │ │ │
│ Cache Hit Cache Miss │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────────┐ │
│ │ Use Cached │ │ Query DHT │ │
│ │ Subscribers │ │ (rate-limited) │ │
│ └────────┬────────┘ └──────────┬──────────┘ │
│ │ │ │
│ └──────────────────┬──────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ Direct Routing Layer │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ macula_direct_routing │ │ │
│ │ │ • NodeId → Endpoint ETS cache │ │ │
│ │ │ • TTL-based expiration (default: 5m) │ │ │
│ │ │ • Bypasses DHT for known endpoints │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ └────────┬───────────────────────────────────────────┬───────────┘ │
│ │ │ │
│ Route Hit Route Miss │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────────┐ │
│ │ Direct QUIC │ │ Use Endpoint from │ │
│ │ Connection │ │ DHT Subscriber Info │ │
│ └────────┬────────┘ └──────────┬──────────┘ │
│ │ │ │
│ └──────────────────┬──────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ pubsub_route │ │
│ │ via QUIC │ │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Subscribers │ │
│ └─────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
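Putting the layers together, the optimized publish path can be sketched as below. Only the macula_subscriber_cache and macula_direct_routing calls are from the documented API; dht_find_subscribers/1, send_via_quic/2, and the subscriber-info map shape are illustrative assumptions.

%% Sketch of the optimized publish path. Helper names and the subscriber-info
%% map shape (#{node_id => ..., endpoint => ...}) are illustrative assumptions.
publish(Topic, Msg) ->
    Subs =
        case macula_subscriber_cache:lookup(Topic) of
            {ok, Cached} ->
                Cached;                                   %% cache hit: no DHT round-trip
            {miss, _Key} ->
                case macula_subscriber_cache:should_query_dht(Topic) of
                    true ->
                        macula_subscriber_cache:record_dht_query(Topic),
                        Found = dht_find_subscribers(Topic),         %% assumed DHT helper
                        macula_subscriber_cache:store(Topic, Found),
                        Found;
                    false ->
                        []                                %% rate-limited: best effort
                end
        end,
    lists:foreach(fun(Sub) -> route_to(Sub, Msg) end, Subs).

route_to(#{node_id := NodeId} = Sub, Msg) ->
    Endpoint =
        case macula_direct_routing:lookup(NodeId) of
            {ok, CachedEndpoint} ->
                CachedEndpoint;                           %% route hit: direct QUIC
            _Miss ->
                macula_direct_routing:store_from_subscriber(Sub),    %% warm the table
                maps:get(endpoint, Sub)                   %% endpoint from DHT subscriber info
        end,
    send_via_quic(Endpoint, Msg).                         %% assumed transport helper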
Optimization 1: Subscriber Cache

Problem
DHT lookups add 50-200ms latency per message. For high-frequency topics (10+ msg/sec), this creates unacceptable delays.
Solution
Cache discovered subscribers with TTL-based expiration.
Module: macula_subscriber_cache
%% API
-export([
lookup/1, %% Look up subscribers (returns {ok, List} or {miss, Key})
store/2, %% Store subscribers for topic
invalidate/1, %% Invalidate on topology change
should_query_dht/1, %% Rate-limit check
record_dht_query/1 %% Record query for rate-limiting
]).

Data Flow
┌─────────────────────┐
│ lookup(Topic) │
└──────────┬──────────┘
│
┌──────────▼──────────┐
│ ETS Lookup │
│ O(1) complexity │
└──────────┬──────────┘
│
┌─────────────────┼─────────────────┐
│ │ │
┌─────▼─────┐ ┌─────▼─────┐ ┌─────▼─────┐
│ Found & │ │ Found & │ │ Not Found │
│ Valid │ │ Expired │ │ │
└─────┬─────┘ └─────┬─────┘ └─────┬─────┘
│ │ │
▼ ▼ ▼
{ok, Subs} Delete + {miss} {miss, Key}

Configuration
| Parameter | Default | Description |
|---|---|---|
| ttl_ms | 5000 | Cache entry TTL (5 seconds) |
| min_discovery_interval_ms | 2000 | Rate-limit window (2 seconds) |
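A minimal usage sketch of the cache round-trip, assuming store/2 stores the list as given and that a cold or expired entry reports a miss:

%% Usage sketch (shell); return values beyond the documented API are assumptions.
Topic = <<"sensors/temp">>.
SubInfo = #{node_id => <<"node-a">>, endpoint => <<"https://192.168.1.10:4433">>}.
{miss, _} = macula_subscriber_cache:lookup(Topic).
macula_subscriber_cache:store(Topic, [SubInfo]).
{ok, [SubInfo]} = macula_subscriber_cache:lookup(Topic).
timer:sleep(5100).                 %% exceed the default ttl_ms of 5000
{miss, _} = macula_subscriber_cache:lookup(Topic).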
Performance Impact
| Scenario | Without Cache | With Cache | Improvement |
|---|---|---|---|
| First publish | 50-200ms | 50-200ms | - |
| Repeated publish (same topic) | 50-200ms | <1ms | 50-200x |
| High-frequency (10+ msg/sec) | Blocked | Smooth | Critical |
Optimization 2: Direct Routing Table
Problem
After discovering a subscriber via the DHT, every publish still needs the subscriber's endpoint address; resolving it through the DHT each time adds 10-50ms per message.
Solution
Cache NodeId→Endpoint mappings for direct QUIC connections.
Module: macula_direct_routing
%% API
-export([
lookup/1, %% Look up endpoint for node
store/2, %% Store node→endpoint mapping
store_from_subscriber/1, %% Store from DHT subscriber info
remove/1, %% Remove stale entry
stats/0 %% Get hit/miss statistics
]).

Data Flow
┌─────────────────────┐
│ Route to NodeId │
└──────────┬──────────┘
│
┌──────────▼──────────┐
│ Direct Routing │
│ Table Lookup │
└──────────┬──────────┘
│
┌─────────────────┼─────────────────┐
│ │
┌─────▼─────┐ ┌─────▼─────┐
│ Hit │ │ Miss │
│ (cached) │ │ │
└─────┬─────┘ └─────┬─────┘
│ │
▼ ▼
Direct QUIC to Use endpoint from
cached endpoint subscriber info

Entry Structure
{NodeId, Endpoint, ExpiresAt}
%% Example:
{<<NodeId:256>>, <<"https://192.168.1.10:4433">>, 1700000000000}

Configuration
| Parameter | Default | Description |
|---|---|---|
| ttl_ms | 300000 | Route entry TTL (5 minutes) |
| cleanup_interval_ms | 60000 | Cleanup frequency (1 minute) |
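A usage sketch under the same caveat; the {ok, Endpoint} return shape of lookup/1 is an assumption mirroring the subscriber cache:

%% Usage sketch (shell); return shapes beyond the documented exports are assumptions.
NodeId = <<"node-b">>.
Endpoint = <<"https://192.168.1.20:4433">>.
macula_direct_routing:store(NodeId, Endpoint).          %% stored as {NodeId, Endpoint, Now + ttl_ms}
{ok, Endpoint} = macula_direct_routing:lookup(NodeId).  %% route hit: no DHT lookup
RouteStats = macula_direct_routing:stats().             %% #{hits := ..., misses := ...}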
Performance Impact
| Scenario | Without Direct Routing | With Direct Routing | Improvement |
|---|---|---|---|
| Known subscriber | 10-50ms | <1ms | 10-50x |
| Second message to same node | Full lookup | Direct | Significant |
Optimization 3: Rate-Limited DHT Discovery
Problem
When cache expires during high-frequency publishing, multiple publishes trigger simultaneous DHT queries ("discovery storms").
Solution
Allow only one DHT query per topic within a minimum interval.
Algorithm
┌─────────────────────────────────────────────────────────────────┐
│ Rate-Limiting Flow │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────┐ │
│ │ Cache Miss │ │
│ │ (need DHT query) │ │
│ └────────┬─────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ should_query_dht(Topic) │ │
│ │ │ │
│ │ Now = current_time() │ │
│ │ LastQuery = rate_limit_table[Topic] │ │
│ │ │ │
│ │ if (Now - LastQuery) >= min_discovery_interval: │ │
│ │ return true ──────────────────────────┐ │ │
│ │ else: │ │ │
│ │ return false ──────────────┐ │ │ │
│ │ │ │ │ │
│ └──────────────────────────────────┼───────────┼─────────┘ │
│ │ │ │
│ ┌─────────▼───┐ ┌────▼────────┐ │
│ │ Rate-limited│ │Query DHT │ │
│ │ Skip query │ │Store result │ │
│ │ Use best- │ │in cache │ │
│ │ effort │ │ │ │
│ └─────────────┘ └─────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
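Rendered as Erlang, the check in the diagram might look like the following sketch; ?RATE_TABLE and its {Topic, LastQueryMs} row layout are assumptions:

%% Sketch of the rate-limit check; ?RATE_TABLE and its {Topic, LastQueryMs}
%% layout are assumptions, not the actual internals.
should_query_dht(Topic) ->
    Now = erlang:system_time(millisecond),
    MinInterval = 2000,                       %% min_discovery_interval_ms
    case ets:lookup(?RATE_TABLE, Topic) of
        [{Topic, LastQuery}] when Now - LastQuery < MinInterval ->
            false;                            %% rate-limited: skip this DHT query
        _ ->
            true                              %% never queried, or interval elapsed
    end.

record_dht_query(Topic) ->
    ets:insert(?RATE_TABLE, {Topic, erlang:system_time(millisecond)}).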
Configuration

| Parameter | Default | Description |
|---|---|---|
| min_discovery_interval_ms | 2000 | Minimum time between DHT queries per topic |
Performance Impact
| Scenario | Without Rate-Limiting | With Rate-Limiting | Improvement |
|---|---|---|---|
| 100 publishes in 1 second (cache expired) | 100 DHT queries | 1 DHT query | 100x reduction |
| DHT load during traffic burst | High | Controlled | Critical |
Combined Performance Results
Benchmark: High-Frequency PubSub
Test Setup:
- 1 Publisher
- 3 Subscribers across mesh
- 100ms publish interval (10 msg/sec)
Results:
| Configuration | Latency (p50) | Latency (p99) | DHT Queries/sec |
|---|---|---|---|
| No optimizations | 150ms | 350ms | 10.0 |
| + Subscriber Cache | 2ms | 15ms | 0.2 |
| + Direct Routing | 1ms | 5ms | 0.2 |
| + Rate Limiting | 1ms | 5ms | 0.05 |
Memory Usage
| Cache | Size per Entry | Max Entries | Max Memory |
|---|---|---|---|
| Subscriber Cache | ~1KB | 1000 topics | ~1MB |
| Direct Routing | ~100B | 10000 nodes | ~1MB |
| Rate Limit Table | ~50B | 1000 topics | ~50KB |
Total overhead: ~2.1MB (bounded, with automatic cleanup)
Tuning Guide
Low-Latency Configuration (Games, Real-time)
%% Shorter TTLs for fresher data
{macula_subscriber_cache, #{
ttl_ms => 2000, %% 2 seconds
min_discovery_interval_ms => 1000 %% 1 second
}},
{macula_direct_routing, #{
ttl_ms => 60000 %% 1 minute
}}

High-Throughput Configuration (IoT, Sensors)
%% Longer TTLs for reduced DHT load
{macula_subscriber_cache, #{
ttl_ms => 30000, %% 30 seconds
min_discovery_interval_ms => 10000 %% 10 seconds
}},
{macula_direct_routing, #{
ttl_ms => 600000 %% 10 minutes
}}

Dynamic Topology Configuration (Nodes join/leave frequently)
%% Balance freshness and performance
{macula_subscriber_cache, #{
ttl_ms => 5000, %% 5 seconds (default)
min_discovery_interval_ms => 2000 %% 2 seconds (default)
}},
{macula_direct_routing, #{
ttl_ms => 120000 %% 2 minutes
}}
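Assuming these tuples are application environment entries of a macula app (an assumption; verify the app name and keys against your release), the low-latency profile as a sys.config fragment might look like:

%% sys.config fragment (hypothetical layout; verify against your release).
[
  {macula, [
    {macula_subscriber_cache, #{
      ttl_ms => 2000,
      min_discovery_interval_ms => 1000
    }},
    {macula_direct_routing, #{ttl_ms => 60000}}
  ]}
].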
Monitoring

Cache Statistics
%% Get subscriber cache stats
Stats = macula_subscriber_cache:stats().
%% Returns:
%% #{
%% hits => 1234,
%% misses => 56,
%% hit_rate => 95.6,
%% table_size => 42,
%% rate_limited => 100
%% }
%% Get direct routing stats
RouteStats = macula_direct_routing:stats().
%% Returns:
%% #{
%% hits => 5678,
%% misses => 12,
%% stores => 90,
%% table_size => 15
%% }

Key Metrics to Monitor
| Metric | Target | Action |
|---|---|---|
| Cache Hit Rate | >90% | Increase ttl_ms if below target |
| Rate Limited Count | Stable | None; a steady count is normal behavior |
| Table Size | Bounded | Check for leaks if growing without bound |
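As a sketch, a periodic health check can compare these stats against the targets above; the thresholds mirror the table, and the map shape follows the stats examples earlier in this section:

%% Sketch: flag cache health issues using the stats map shown above.
check_cache_health() ->
    #{hit_rate := HitRate, table_size := Size} = macula_subscriber_cache:stats(),
    case HitRate < 90.0 of
        true ->
            logger:warning("cache hit rate ~p% below 90% target; consider raising ttl_ms",
                           [HitRate]);
        false ->
            ok
    end,
    case Size > 1000 of
        true  -> logger:warning("subscriber cache holds ~p topics; check for leaks", [Size]);
        false -> ok
    end.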
Implementation Details
Thread Safety
Both caches use ETS with {read_concurrency, true} for lock-free reads:
Table = ets:new(?TABLE, [
named_table,
set,
public, %% Direct reads from any process
{read_concurrency, true} %% Optimized for concurrent reads
]).
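Because the table is public with read_concurrency enabled, callers read it directly without a gen_server round-trip. A minimal read path under the {Key, Value, ExpiresAt} layout used elsewhere in this guide might be:

%% Sketch of a lock-free read; assumes the {Key, Value, ExpiresAt} row layout.
lookup(Key) ->
    Now = erlang:system_time(millisecond),
    case ets:lookup(?TABLE, Key) of
        [{Key, Value, ExpiresAt}] when ExpiresAt > Now ->
            {ok, Value};
        _ExpiredOrMissing ->
            {miss, Key}                %% expired rows are removed by periodic cleanup
    end.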
Cleanup Strategy

Automatic periodic cleanup removes expired entries:
handle_info(cleanup, State) ->
Now = erlang:system_time(millisecond),
%% Match spec: delete where ExpiresAt < Now
MatchSpec = [{{'$1', '_', '$2'}, [{'<', '$2', Now}], [true]}],
ets:select_delete(?TABLE, MatchSpec),
%% Schedule next cleanup
erlang:send_after(?CLEANUP_INTERVAL_MS, self(), cleanup),
{noreply, State}.

Future Optimizations (Planned)
- Adaptive TTL - Adjust cache TTL based on topic publish frequency
- Predictive Prefetch - Pre-warm cache for topics with predictable patterns
- Bloom Filter - Fast negative lookup for non-existent topics
- Connection Pooling - Keep QUIC streams warm for hot paths
Last Updated: 2025-11-26 · Macula Version: 0.10.1 · Status: ✅ Production-ready