ZenWebsocket Supervision Strategy
Overview
ZenWebsocket provides optional supervision for WebSocket client connections, ensuring resilience and automatic recovery from failures. This is critical for financial trading systems where connection stability directly impacts order execution and risk management.
Important: As a library, ZenWebsocket does not start any supervisors automatically. You must explicitly add supervision to your application's supervision tree when needed.
Architecture
Your Application Supervisor
├── ZenWebsocket.ClientSupervisor (Optional DynamicSupervisor)
│   ├── Client GenServer 1
│   ├── Client GenServer 2
│   └── Client GenServer N
└── Your other children...
Key Components
1. ClientSupervisor (ZenWebsocket.ClientSupervisor)
- DynamicSupervisor for managing client connections
- Restart strategy: :one_for_one (isolated failures)
- Maximum 10 restarts in 60 seconds (configurable)
- Each client runs independently
2. Client GenServer (ZenWebsocket.Client)
- Manages individual WebSocket connections
- Handles Gun process ownership and message routing
- Integrated heartbeat handling
- Automatic reconnection on network failures
Usage Patterns
Pattern 1: No Supervision (Simple/Testing)
# Direct connection without supervision
{:ok, client} = ZenWebsocket.Client.connect("wss://example.com")
# Use the client
ZenWebsocket.Client.send_message(client, "Hello")
# Clean up when done
ZenWebsocket.Client.close(client)
Pattern 2: Using ClientSupervisor
First, add the supervisor to your application:
defmodule MyApp.Application do
  use Application

  def start(_type, _args) do
    children = [
      # Add the ZenWebsocket supervisor
      ZenWebsocket.ClientSupervisor,
      # Your other children...
    ]

    Supervisor.start_link(children, strategy: :one_for_one)
  end
end
Then create supervised connections:
# Basic supervised connection
{:ok, client} = ZenWebsocket.ClientSupervisor.start_client("wss://example.com")
# With configuration
{:ok, client} = ZenWebsocket.ClientSupervisor.start_client("wss://example.com",
  retry_count: 10,
  heartbeat_config: %{type: :deribit, interval: 30_000}
)
Pattern 3: Direct Client Supervision
Add individual clients directly to your supervision tree:
defmodule MyApp.Application do
  use Application

  def start(_type, _args) do
    children = [
      # Supervise individual clients
      {ZenWebsocket.Client, [
        url: "wss://exchange1.com",
        id: :exchange1_client,
        heartbeat_config: %{type: :deribit, interval: 30_000}
      ]},
      {ZenWebsocket.Client, [
        url: "wss://exchange2.com",
        id: :exchange2_client
      ]},
      # Your other children...
    ]

    Supervisor.start_link(children, strategy: :one_for_one)
  end
end
Restart Behavior
Transient Restart Strategy
- Clients are restarted only if they exit abnormally
- Normal shutdowns (via Client.close/1) don't trigger a restart
- Crashes and connection failures trigger an automatic restart
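The transient behavior described above can be tried out with plain OTP building blocks. The following is a minimal sketch, not ZenWebsocket's implementation: TransientDemo and TransientDemo.Worker are hypothetical names, and the worker is a toy GenServer standing in for a client.

```elixir
defmodule TransientDemo.Worker do
  # Toy stand-in for a client process; it is not part of ZenWebsocket.
  use GenServer, restart: :transient

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg)

  @impl true
  def init(arg), do: {:ok, arg}

  @impl true
  def handle_cast({:stop, reason}, state), do: {:stop, reason, state}
end

defmodule TransientDemo do
  # Starts one worker under a fresh DynamicSupervisor, makes it exit with
  # `reason`, and returns how many children are alive afterwards.
  def active_after_exit(reason) do
    {:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)
    {:ok, pid} = DynamicSupervisor.start_child(sup, {TransientDemo.Worker, :ok})
    ref = Process.monitor(pid)
    GenServer.cast(pid, {:stop, reason})

    receive do
      {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
    after
      1_000 -> raise "worker did not exit"
    end

    # Give the supervisor a moment to handle the exit before asking it.
    Process.sleep(50)
    %{active: active} = DynamicSupervisor.count_children(sup)
    :ok = DynamicSupervisor.stop(sup)
    active
  end
end
```

Calling TransientDemo.active_after_exit(:normal) should leave no live child, while TransientDemo.active_after_exit(:boom) should leave one, because the supervisor starts a replacement. Note that OTP also treats :shutdown and {:shutdown, term} as normal exits for transient children, so a process that must be restarted has to exit with some other reason.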
Failure Scenarios
Network Disconnection
- Client detects connection loss
- Attempts internal reconnection (configurable retries)
- If max retries exceeded, GenServer exits
- Supervisor restarts the client
Process Crash
- Supervisor immediately detects exit
- Starts new client process
- Connection re-established from scratch
Heartbeat Failure
- Client tracks heartbeat failures
- Closes connection after threshold
- Supervisor restarts for fresh connection
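The network disconnection flow above (bounded internal retries, then exit) can be sketched as a small recursive helper. This is an illustration under assumptions, not the library's code: RetrySketch, connect_with_retries/3, and the :max_retries_exceeded reason are hypothetical names, and connect_fun stands in for whatever actually opens the connection.

```elixir
defmodule RetrySketch do
  # Calls `connect_fun` up to `attempts_left` times, sleeping `delay_ms`
  # between failed attempts. Returns {:ok, conn} on the first success or
  # {:error, :max_retries_exceeded} once the attempts are used up.
  def connect_with_retries(connect_fun, attempts_left, delay_ms \\ 0)

  def connect_with_retries(connect_fun, attempts_left, delay_ms) when attempts_left > 0 do
    case connect_fun.() do
      {:ok, conn} ->
        {:ok, conn}

      {:error, _reason} ->
        Process.sleep(delay_ms)
        connect_with_retries(connect_fun, attempts_left - 1, delay_ms)
    end
  end

  def connect_with_retries(_connect_fun, 0, _delay_ms) do
    {:error, :max_retries_exceeded}
  end
end
```

In a GenServer, the error result would typically be turned into {:stop, :max_retries_exceeded, state}. Because that reason is neither :normal nor a :shutdown variant, a transient child exiting this way is restarted by its supervisor.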
Production Considerations
1. Resource Management
- Each supervised client consumes:
- 1 Erlang process (Client GenServer)
- 1 Gun connection process
- Associated memory for state and buffers
2. Restart Limits
- Default: 10 restarts in 60 seconds
- Prevents restart storms
- Adjust based on expected failure patterns
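In OTP terms, a limit of 10 restarts in 60 seconds corresponds to the max_restarts and max_seconds supervisor options. How ZenWebsocket.ClientSupervisor exposes these settings is not shown here, so the sketch below uses a plain DynamicSupervisor with a hypothetical module name, RestartLimitSketch, to show where such a limit lives.

```elixir
defmodule RestartLimitSketch do
  # Hypothetical supervisor illustrating restart limits; this is not
  # ZenWebsocket.ClientSupervisor.
  use DynamicSupervisor

  def start_link(opts \\ []) do
    DynamicSupervisor.start_link(__MODULE__, opts)
  end

  @impl true
  def init(opts) do
    DynamicSupervisor.init(
      strategy: :one_for_one,
      # If more than max_restarts restarts happen within max_seconds,
      # the supervisor itself shuts down instead of restarting forever.
      max_restarts: Keyword.get(opts, :max_restarts, 10),
      max_seconds: Keyword.get(opts, :max_seconds, 60)
    )
  end
end
```

For example, RestartLimitSketch.start_link(max_restarts: 20, max_seconds: 60) tolerates a noisier network before giving up. When the limit is exceeded the supervisor exits, and its own parent decides what happens next.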
3. Monitoring
# List all supervised clients
clients = ZenWebsocket.ClientSupervisor.list_clients()
# Check client health
health = ZenWebsocket.Client.get_heartbeat_health(client)
4. Graceful Shutdown
# Stop a specific client
ZenWebsocket.ClientSupervisor.stop_client(pid)
# Client won't be restarted (normal termination)
Best Practices
Use Supervision for Production
- Always use ClientSupervisor.start_client/2 for production
- Direct connections only for testing/development
Configure Appropriate Timeouts
- Set heartbeat intervals based on exchange requirements
- Configure retry counts for network conditions
Monitor Client Health
- Implement health checks using get_heartbeat_health/1
- Set up alerts for excessive restarts
Handle Restart Events
- Subscriptions may need re-establishment
- Authentication may need renewal
- Order state should be reconciled
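One way to react to restarts is to monitor the client process and rerun setup whenever a replacement appears. The sketch below is a generic OTP pattern under assumptions: RestartAwareOwner, the :lookup option (a function returning the current client pid), and the :on_ready option (a function that would re-authenticate, resubscribe, and reconcile orders) are all hypothetical and not part of ZenWebsocket.

```elixir
defmodule RestartAwareOwner do
  use GenServer

  @retry_ms 100

  # opts (both hypothetical):
  #   :lookup   - zero-arity fun returning the current client pid, or nil
  #   :on_ready - one-arity fun called with the pid of each newly found client
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    state = %{
      lookup: Keyword.fetch!(opts, :lookup),
      on_ready: Keyword.fetch!(opts, :on_ready),
      client: nil
    }

    {:ok, attach(state)}
  end

  @impl true
  def handle_info({:DOWN, _ref, :process, pid, _reason}, %{client: pid} = state) do
    # The monitored client is gone; look for its replacement shortly.
    Process.send_after(self(), :attach, @retry_ms)
    {:noreply, %{state | client: nil}}
  end

  def handle_info(:attach, state), do: {:noreply, attach(state)}
  def handle_info(_other, state), do: {:noreply, state}

  # Monitors the current client and runs the setup callback, or schedules
  # another attempt if no live client can be found yet.
  defp attach(state) do
    pid = state.lookup.()

    if is_pid(pid) and Process.alive?(pid) do
      Process.monitor(pid)
      state.on_ready.(pid)
      %{state | client: pid}
    else
      Process.send_after(self(), :attach, @retry_ms)
      state
    end
  end
end
```

The owner never restarts the client itself; that stays the supervisor's job. It only notices the old pid going down and repeats the setup steps against the new one.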
Example: Production Deribit Connection
defmodule TradingSystem.DeribitConnection do
  use GenServer
  # Logger macros must be required before use
  require Logger

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(opts) do
    # Start supervised connection
    url = "wss://test.deribit.com/ws/api/v2"

    config = [
      heartbeat_config: %{type: :deribit, interval: 30_000},
      retry_count: 10,
      retry_delay: 1000
    ]

    {:ok, client} = ZenWebsocket.ClientSupervisor.start_client(url, config)

    # Create adapter with supervised client
    adapter = %ZenWebsocket.Examples.DeribitAdapter{
      client: client,
      authenticated: false,
      subscriptions: MapSet.new(),
      client_id: opts[:client_id],
      client_secret: opts[:client_secret]
    }

    # Authenticate and subscribe
    {:ok, adapter} = ZenWebsocket.Examples.DeribitAdapter.authenticate(adapter)

    {:ok, adapter} = ZenWebsocket.Examples.DeribitAdapter.subscribe(adapter, [
      "book.BTC-PERPETUAL.raw",
      "trades.BTC-PERPETUAL.raw",
      "user.orders.BTC-PERPETUAL.raw"
    ])

    {:ok, %{adapter: adapter}}
  end

  # Handle reconnection events
  @impl true
  def handle_info({:gun_down, _, _, _, _}, state) do
    # Log disconnection (Logger.warn/1 is deprecated in favor of Logger.warning/1)
    Logger.warning("Deribit connection lost, supervisor will restart")
    {:noreply, state}
  end
end
Supervision Tree Visualization
YourApp.Supervisor
├── ZenWebsocket.Application
│   └── ZenWebsocket.ClientSupervisor
│       ├── Client_1 (Deribit Production)
│       ├── Client_2 (Deribit Test)
│       └── Client_3 (Binance)
└── YourApp.TradingEngine
The supervision strategy ensures that WebSocket connections remain stable and recover automatically from failures, which is critical for 24/7 financial trading operations.