Overview
This guide covers performance tuning for ZenWebsocket connections. Each parameter affects latency, throughput, memory usage, or reliability; understanding these tradeoffs helps you optimize for your specific use case.
Configuration Parameters
Timeout Settings
| Parameter | Default | Description |
|---|---|---|
| timeout | 5000ms | Connection establishment timeout |
| request_timeout | 30000ms | Timeout for correlated request/response |
| heartbeat_interval | 30000ms | Interval between heartbeat pings |
Tuning guidance:
# Low-latency trading (fast fail, quick detection)
{:ok, client} = Client.connect(url,
timeout: 3000, # Fail fast on connection issues
request_timeout: 5000, # Don't wait long for responses
heartbeat_interval: 10_000 # Detect disconnects quickly
)
# High-latency networks (more tolerance)
{:ok, client} = Client.connect(url,
timeout: 15_000, # Allow for slow networks
request_timeout: 60_000, # Accommodate slow API responses
heartbeat_interval: 60_000 # Reduce overhead
)

Reconnection Settings
| Parameter | Default | Description |
|---|---|---|
| retry_count | 3 | Maximum reconnection attempts |
| retry_delay | 1000ms | Base delay for exponential backoff |
| max_backoff | 30000ms | Maximum delay between attempts |
| reconnect_on_error | true | Enable automatic reconnection |
| restore_subscriptions | true | Restore subscriptions after reconnect |
Exponential backoff formula:
delay = min(retry_delay * 2^attempt, max_backoff)
Attempt 0: 1000ms
Attempt 1: 2000ms
Attempt 2: 4000ms
Attempt 3: 8000ms
Attempt 4: 16000ms
Attempt 5: 32000ms → capped at 30000ms when max_backoff: 30_000

Tuning guidance:
# Production trading (aggressive reconnection)
{:ok, client} = Client.connect(url,
retry_count: 10, # Many attempts before giving up
retry_delay: 500, # Start with short delays
max_backoff: 10_000, # Cap at 10 seconds
reconnect_on_error: true,
restore_subscriptions: true
)
# Adapter-managed reconnection (disable internal)
{:ok, client} = Client.connect(url,
reconnect_on_error: false # Adapter handles all reconnection
)

Latency Monitoring
| Parameter | Default | Description |
|---|---|---|
| latency_buffer_size | 100 | Samples retained for percentile calculations |
The LatencyStats module maintains a circular buffer of request latencies for p50/p99 calculations.
Memory impact: Each latency sample stores a microsecond integer (~8 bytes for the raw value). With Erlang term overhead, actual memory usage is higher; expect ~16-24 bytes per sample in practice. For 100 samples, budget ~2 KB per connection.
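To make the percentile mechanics concrete, here is a minimal sketch of a bounded sample buffer with nearest-rank-style percentiles (Python for illustration; the library's exact percentile method is not specified here):

```python
from collections import deque

def percentile(sorted_samples, p):
    # Nearest-rank-style percentile over an ascending list (illustrative).
    k = round(p / 100 * (len(sorted_samples) - 1))
    return sorted_samples[max(0, min(len(sorted_samples) - 1, k))]

# deque(maxlen=...) behaves as a circular buffer: once full, each append
# evicts the oldest sample, so percentiles reflect the most recent window.
buffer = deque(maxlen=100)
for latency_us in range(1, 201):  # 200 samples arrive; only the newest 100 are retained
    buffer.append(latency_us)

samples = sorted(buffer)          # buffer now holds 101..200
print(len(buffer))                # 100
print(percentile(samples, 50))    # 151
print(percentile(samples, 99))    # 199
```

A larger latency_buffer_size smooths the percentiles at the cost of a longer window and more memory; a smaller one reacts faster but is noisier.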
# High-precision latency tracking
{:ok, client} = Client.connect(url,
latency_buffer_size: 1000 # More samples for smoother percentiles
)
# Memory-constrained environment
{:ok, client} = Client.connect(url,
latency_buffer_size: 25 # Minimal samples
)

Retrieving latency stats:
# Get summary with p50, p99, last sample, count
stats = Client.get_latency_stats(client)
# => %{p50: 45, p99: 120, last: 52, count: 100}

Rate Limiter Tuning
The RateLimiter module implements a token bucket algorithm supporting different exchange patterns.
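As a reference for how the configuration below maps onto token-bucket semantics, here is a minimal sketch (Python for illustration; parameter names mirror the config keys, but this is not the library's implementation):

```python
import time

class TokenBucket:
    # Minimal token bucket (illustrative). Tokens refill in discrete intervals,
    # and a request is admitted only if its cost fits in the current balance.
    def __init__(self, tokens, refill_rate, refill_interval_ms):
        self.capacity = tokens
        self.tokens = tokens
        self.refill_rate = refill_rate
        self.refill_interval = refill_interval_ms / 1000.0
        self.last_refill = time.monotonic()

    def _refill(self):
        elapsed_intervals = int((time.monotonic() - self.last_refill) / self.refill_interval)
        if elapsed_intervals > 0:
            self.tokens = min(self.capacity, self.tokens + elapsed_intervals * self.refill_rate)
            self.last_refill += elapsed_intervals * self.refill_interval

    def try_consume(self, cost=1):
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(tokens=100, refill_rate=10, refill_interval_ms=1000)
print(bucket.try_consume(30))   # True: burst within capacity
print(bucket.tokens)            # 70
```

The key property: the bucket allows bursts up to its capacity, then throttles to the steady-state rate of refill_rate per refill_interval.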
Configuration Options
config = %{
tokens: 100, # Bucket capacity
refill_rate: 10, # Tokens added per interval
refill_interval: 1000, # Interval in milliseconds
max_queue_size: 100, # Maximum queued requests
request_cost: &MyModule.cost_function/1
}
{:ok, limiter} = RateLimiter.init(:my_limiter, config)

Exchange-Specific Cost Functions
Different exchanges use different rate limit models:
# Deribit: Credit-based (methods have different costs)
config = %{
tokens: 10_000, # Deribit gives 10k credits
refill_rate: 1000, # Refills 1000/second
refill_interval: 1000,
request_cost: &RateLimiter.deribit_cost/1
}
# Built-in cost function:
# - public/* methods: 1 credit
# - private/get_* methods: 5 credits
# - private/set_* methods: 10 credits
# - private/buy, private/sell: 15 credits
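The credit schedule above can be expressed as a plain cost function. This sketch is illustrative (Python), not the library's deribit_cost/1, and the 1-credit default for unlisted methods is an assumption:

```python
def deribit_cost(method: str) -> int:
    # Credit cost per method, mirroring the documented schedule (illustrative).
    if method.startswith("public/"):
        return 1
    if method in ("private/buy", "private/sell"):
        return 15
    if method.startswith("private/set_"):
        return 10
    if method.startswith("private/get_"):
        return 5
    return 1  # assumption: unlisted methods default to the cheapest cost

print(deribit_cost("public/ticker"))          # 1
print(deribit_cost("private/get_positions"))  # 5
print(deribit_cost("private/buy"))            # 15
```

Note the ordering: the buy/sell check runs before the private/set_ and private/get_ prefix checks so the most specific rule wins.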
# Binance: Weight-based
config = %{
tokens: 1200, # 1200 weight per minute
refill_rate: 20, # Refill 20 per second
refill_interval: 1000,
request_cost: &RateLimiter.binance_cost/1
}
# Simple fixed-rate (Coinbase, etc.)
config = %{
tokens: 10, # 10 requests
refill_rate: 10, # Full refill
refill_interval: 1000, # Per second
request_cost: &RateLimiter.simple_cost/1
}

Pressure Levels and Backpressure
The rate limiter tracks queue pressure and provides suggested delays:
| Pressure Level | Queue Fill | Suggested Delay |
|---|---|---|
| :none | < 25% | 0ms |
| :low | 25-50% | 1× refill_interval |
| :medium | 50-75% | 2× refill_interval |
| :high | 75%+ | 4× refill_interval |
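The table reduces to a simple fill-ratio calculation. A sketch (Python for illustration; exact boundary handling, e.g. whether exactly 25% counts as :low, is an assumption):

```python
def pressure(queue_size: int, max_queue_size: int, refill_interval_ms: int):
    # Map queue fill ratio to a pressure level and suggested delay, per the table above.
    ratio = queue_size / max_queue_size
    if ratio < 0.25:
        return (":none", 0)
    if ratio < 0.50:
        return (":low", 1 * refill_interval_ms)
    if ratio < 0.75:
        return (":medium", 2 * refill_interval_ms)
    return (":high", 4 * refill_interval_ms)

print(pressure(10, 100, 1000))  # (':none', 0)
print(pressure(60, 100, 1000))  # (':medium', 2000)
print(pressure(90, 100, 1000))  # (':high', 4000)
```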
Using backpressure signals:
{:ok, status} = RateLimiter.status(:my_limiter)
# => %{tokens: 50, queue_size: 60, pressure_level: :medium, suggested_delay_ms: 2000}
if status.suggested_delay_ms > 0 do
Process.sleep(status.suggested_delay_ms)
end

Telemetry Events
ZenWebsocket emits telemetry events for monitoring. Attach handlers for observability.
Available Events
| Event | Measurements | Metadata |
|---|---|---|
| [:zen_websocket, :connection, :upgrade] | connect_time_ms | url |
| [:zen_websocket, :heartbeat, :pong] | rtt_ms | type |
| [:zen_websocket, :rate_limiter, :consume] | tokens_remaining, cost | name |
| [:zen_websocket, :rate_limiter, :refill] | tokens_before, tokens_after, refill_rate | name |
| [:zen_websocket, :rate_limiter, :queue] | queue_size, cost | name |
| [:zen_websocket, :rate_limiter, :queue_full] | queue_size | name |
| [:zen_websocket, :rate_limiter, :pressure] | queue_size, ratio | name, level, previous_level |
| [:zen_websocket, :request_correlator, :track] | count | id, timeout_ms |
| [:zen_websocket, :request_correlator, :resolve] | count, round_trip_ms | id |
| [:zen_websocket, :request_correlator, :timeout] | count | id |
| [:zen_websocket, :request_correlator, :fail_all] | count | id, reason |
| [:zen_websocket, :subscription_manager, :add] | count | channel |
| [:zen_websocket, :subscription_manager, :remove] | count | channel |
| [:zen_websocket, :subscription_manager, :restore] | channel_count | channels |
| [:zen_websocket, :pool, :route] | health, pool_size | selected |
| [:zen_websocket, :pool, :health] | pool_size, avg_health | |
| [:zen_websocket, :pool, :failover] | attempt | failed_pid, reason |
Setting Up Telemetry Handlers
Important: Call setup/0 in your application's start/2 callback to attach handlers before any connections are made:
# lib/my_app/application.ex
def start(_type, _args) do
MyApp.TelemetryHandler.setup()
children = [
# ... your supervision tree
]
Supervisor.start_link(children, strategy: :one_for_one)
end

defmodule MyApp.TelemetryHandler do
require Logger
def setup do
events = [
[:zen_websocket, :rate_limiter, :pressure],
[:zen_websocket, :request_correlator, :resolve],
[:zen_websocket, :request_correlator, :timeout]
]
:telemetry.attach_many(
"my-app-zen-websocket",
events,
&__MODULE__.handle_event/4,
nil
)
end
def handle_event([:zen_websocket, :rate_limiter, :pressure], measurements, metadata, _config) do
Logger.warning("Rate limiter pressure: #{metadata.level}, queue: #{measurements.queue_size}")
end
def handle_event([:zen_websocket, :request_correlator, :resolve], measurements, metadata, _config) do
if measurements.round_trip_ms > 1000 do
Logger.warning("Slow request: #{inspect(metadata.id)} took #{measurements.round_trip_ms}ms")
end
end
def handle_event([:zen_websocket, :request_correlator, :timeout], _measurements, metadata, _config) do
Logger.error("Request timeout: #{inspect(metadata.id)}")
end
end

Memory Characteristics
Per-Connection Baseline
| Component | Memory |
|---|---|
| Client GenServer state | ~1-2 KB |
| Gun connection | ~2-3 KB |
| LatencyStats buffer (100 samples) | ~2 KB |
| SubscriptionManager (10 channels) | ~500 bytes |
| RequestCorrelator (empty) | ~200 bytes |
| Total idle connection | ~6-8 KB |
Variable Memory Components
| Component | Growth Factor |
|---|---|
| RequestCorrelator | ~200 bytes per pending request |
| RateLimiter queue | ~100 bytes per queued request |
| SubscriptionManager | ~50 bytes per subscription |
| LatencyStats | ~8 bytes per raw sample (≈16-24 bytes with term overhead), up to latency_buffer_size samples |
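Combining the baseline with the growth factors gives a quick back-of-envelope estimate. The sketch below (Python, illustrative) uses the raw per-row figures; actual usage runs higher once Erlang term overhead is included:

```python
# Back-of-envelope per-connection memory from the tables above (raw figures).
BASELINE_BYTES = 7 * 1024  # midpoint of the ~6-8 KB idle baseline

def estimate_bytes(pending_requests=0, queued_requests=0, subscriptions=0, latency_samples=0):
    return (BASELINE_BYTES
            + pending_requests * 200   # RequestCorrelator
            + queued_requests * 100    # RateLimiter queue
            + subscriptions * 50       # SubscriptionManager
            + latency_samples * 8)     # LatencyStats (raw values only)

# e.g. 50 pending requests, 25 queued, 100 subscriptions, a full 1000-sample buffer:
total = estimate_bytes(pending_requests=50, queued_requests=25,
                       subscriptions=100, latency_samples=1000)
print(total)                           # 32668 bytes, ~31.9 KiB
```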
Memory Optimization
# Memory-constrained configuration
{:ok, client} = Client.connect(url,
latency_buffer_size: 25, # Smaller latency buffer
request_timeout: 10_000 # Shorter timeout = fewer pending requests
)
# Initialize rate limiter with smaller queue
RateLimiter.init(:my_limiter, %{
tokens: 100,
refill_rate: 10,
refill_interval: 1000,
max_queue_size: 25, # Smaller queue, fail faster
request_cost: &RateLimiter.simple_cost/1
})

Common Tuning Scenarios
High-Frequency Trading
Optimize for lowest latency, fast failure detection:
{:ok, client} = Client.connect(url,
timeout: 2000,
request_timeout: 3000,
heartbeat_interval: 5000,
retry_count: 3,
retry_delay: 100,
max_backoff: 1000,
latency_buffer_size: 500
)

Market Data Collection
Optimize for reliability, handle reconnection gracefully:
{:ok, client} = Client.connect(url,
timeout: 10_000,
request_timeout: 30_000,
heartbeat_interval: 30_000,
retry_count: 20,
retry_delay: 1000,
max_backoff: 60_000,
restore_subscriptions: true
)

Resource-Constrained Environment
Minimize memory and CPU overhead:
{:ok, client} = Client.connect(url,
heartbeat_interval: 60_000, # Less frequent heartbeats
latency_buffer_size: 10, # Minimal latency tracking
retry_count: 3 # Limited retries
)

Debugging Performance Issues
Enable Debug Logging
{:ok, client} = Client.connect(url, debug: true)

This logs detailed connection lifecycle events, including Gun operations, WebSocket upgrades, and message timing.
Check Connection State
# Connection state (atom)
state = Client.get_state(client)
# => :connected | :connecting | :disconnected
# Latency metrics
stats = Client.get_latency_stats(client)
# => %{p50: 45, p99: 120, last: 52, count: 100}
# Heartbeat health
health = Client.get_heartbeat_health(client)
# => %{failure_count: 0, last_heartbeat_at: -576460748, config: :disabled, timer_active: false}
# Note: last_heartbeat_at is System.monotonic_time(:millisecond), not a wall-clock DateTime
# Connection metrics
metrics = Client.get_state_metrics(client)
# => %{subscriptions_size: 12, pending_requests_size: 5, state_memory: 1024, ...}

Monitor Rate Limiter
{:ok, status} = RateLimiter.status(:my_limiter)
IO.inspect(status)
# => %{tokens: 85, queue_size: 30, pressure_level: :low, suggested_delay_ms: 1000}

Summary
| Goal | Key Parameters |
|---|---|
| Lower latency | Reduce timeout, request_timeout, heartbeat_interval |
| Higher reliability | Increase retry_count, max_backoff |
| Less memory | Reduce latency_buffer_size, max_queue_size |
| Better observability | Attach telemetry handlers, enable debug: true |
| Prevent rate limits | Configure appropriate request_cost function, monitor pressure |
Related Guides
- Building Exchange Adapters - Build production adapters with reconnection and state restoration