Hackney 2.x Architecture
This document describes the internal architecture of hackney 2.x, including the process-per-connection model, connection pooling, load regulation, and SSL handling.
Overview
Hackney 2.x uses a process-per-connection architecture where each HTTP connection runs in its own gen_statem process. This design provides:
- Clean isolation - Each connection has its own state, no shared mutable state
- Automatic cleanup - Process crashes clean up sockets automatically
- Simple ownership - Socket always owned by connection process
- OTP supervision - Standard supervisor tree for fault tolerance
hackney_sup
├── hackney_manager (connection registry)
├── hackney_conn_sup (connection supervisor)
│ └── hackney_conn [1..N] (connection processes for HTTP/1.1, HTTP/2)
├── hackney_pools_sup (pool supervisor)
│ └── hackney_pool [1..N] (pool processes)
└── hackney_altsvc (Alt-Svc cache for HTTP/3 discovery)
QUIC connections (HTTP/3):
├── hackney_quic.erl (Erlang interface to NIF)
└── hackney_quic NIF (lsquic + BoringSSL, no supervision needed)
    └── QuicConn resources (GC-managed, one per QUIC connection)
Connection Process (hackney_conn)
Each connection is a gen_statem process that manages the following (a minimal skeleton is sketched after this list):
- TCP/SSL socket
- HTTP protocol state (request/response phases)
- Streaming state (chunked encoding, content-length tracking)
- Owner process monitoring
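For orientation, here is a minimal gen_statem skeleton for such a connection process. It covers only connect, owner monitoring, and socket cleanup; the module name conn_sketch, the {connect, Host, Port, TcpOpts} call shape, and the map-based data are illustrative assumptions, not hackney_conn's actual code (the real state names appear in the diagram below).
%% Illustrative sketch only -- not the real hackney_conn module.
-module(conn_sketch).
-behaviour(gen_statem).
-export([start_link/1]).
-export([init/1, callback_mode/0, created/3, connected/3, terminate/3]).

start_link(Owner) ->
    gen_statem:start_link(?MODULE, Owner, []).

callback_mode() -> state_functions.

init(Owner) ->
    %% Monitor the owner: if it crashes, this process stops and the socket is closed.
    MRef = monitor(process, Owner),
    {ok, created, #{owner => Owner, owner_mref => MRef, socket => undefined}}.

created({call, From}, {connect, Host, Port, TcpOpts}, Data) ->
    {ok, Socket} = gen_tcp:connect(Host, Port, TcpOpts),
    {next_state, connected, Data#{socket := Socket}, [{reply, From, ok}]};
created(info, Msg, Data) ->
    handle_common(Msg, Data).

connected({call, From}, close, Data) ->
    {stop_and_reply, normal, [{reply, From, ok}], Data};
connected(info, Msg, Data) ->
    handle_common(Msg, Data).

%% Owner died: stop; terminate/3 closes the socket, so nothing leaks.
handle_common({'DOWN', MRef, process, _Owner, _Reason}, #{owner_mref := MRef} = Data) ->
    {stop, normal, Data};
handle_common(_Other, _Data) ->
    keep_state_and_data.

terminate(_Reason, _State, #{socket := Socket}) when Socket =/= undefined ->
    gen_tcp:close(Socket);
terminate(_Reason, _State, _Data) ->
    ok.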
State Machine
┌─────────────┐
│ created │
└──────┬──────┘
│ connect
▼
┌─────────────┐
┌─────│ connected │─────┐
│ └─────────────┘ │
│ │ │
send_request │ upgrade_to_ssl
│ │ │
▼ │ ▼
┌────────────┐ │ ┌────────────┐
│ on_body │ │ │ connected │ (SSL)
└─────┬──────┘ │ └────────────┘
│ │
finish_send_body │
│ │
▼ │
┌─────────────────┐ │
│ waiting_response│ │
└────────┬────────┘ │
│ │
start_response │
│ │
▼ │
┌────────────┐ │
│ on_status │ │
└─────┬──────┘ │
│ │
▼ │
┌────────────┐ │
│ on_headers │───────┤
└─────┬──────┘ │
│ │
▼ │
┌──────────────┐ │
│ on_resp_body │ │
└───────┬──────┘ │
│ │
│ body done │
└──────────────┘
│
▼
┌────────────┐
│ closing │
└─────┬──────┘
│
▼
[exit]
Owner Monitoring
The connection process monitors its owner (the process that checked out the connection). If the owner crashes, the connection terminates automatically, preventing socket leaks.
%% When connection is checked out
MonitorRef = monitor(process, Owner),
%% If owner dies
{'DOWN', MonitorRef, process, Owner, _} -> terminate
Connection Pool (hackney_pool)
The pool stores only TCP connections for reuse. SSL connections are never pooled; for security reasons they are closed after use.
Why TCP-Only Pooling?
- Security - SSL session state should not be shared across requests
- Simplicity - No need to validate SSL session freshness
- Flexibility - TCP connections can be upgraded to SSL when needed
Pool State
-record(state, {
name, %% Pool name
max_connections, %% Global max (legacy, per-host preferred)
keepalive_timeout, %% Max idle time (default 2000ms, max 2000ms)
prewarm_count, %% Connections to maintain per host (default 4)
available = #{}, %% #{Key => [Pid]} - idle TCP connections
in_use = #{}, %% #{Pid => Key} - checked out connections
pid_monitors = #{}, %% #{Pid => MonitorRef}
activated_hosts %% Hosts with prewarm enabled
}).
Pool Operations
Checkout: Get an available TCP connection or none
hackney_pool:checkout(Host, Port, Transport, Opts)
%% Returns: {ok, PoolInfo, Pid} | {error, no_pool}
Checkin: Return a connection to the pool
hackney_pool:checkin(PoolInfo, Pid)
%% TCP connections are stored, SSL connections are closed
Keepalive Timeout
Idle connections are closed after keepalive_timeout (default and max: 2 seconds); a sketch of the mechanism follows the list. This prevents:
- Stale connections that the server has closed
- Resource accumulation from unused connections
- Issues with server-side connection limits
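As an illustration of the idle-timeout mechanism, a pool process (assumed here to be a gen_server) could arm one timer per idle connection. The helper names, the {idle_timeout, Pid} message, and the set-based bookkeeping are assumptions for the sketch, not hackney_pool's actual implementation.
%% Illustrative sketch only. Arm a timer at checkin; if it fires while the
%% connection is still idle, stop the connection process (which closes its socket).
checkin_idle(ConnPid, KeepaliveTimeout, Idle) ->
    erlang:send_after(KeepaliveTimeout, self(), {idle_timeout, ConnPid}),
    sets:add_element(ConnPid, Idle).

handle_info({idle_timeout, ConnPid}, Idle) ->
    case sets:is_element(ConnPid, Idle) of
        true ->
            %% Still unused since checkin: stop the gen_statem connection process
            gen_statem:stop(ConnPid),
            {noreply, sets:del_element(ConnPid, Idle)};
        false ->
            %% Checked out again in the meantime: keep it alive
            {noreply, Idle}
    end.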
Load Regulation (hackney_load_regulation)
Per-host connection limits prevent overwhelming individual servers. This uses an ETS counting semaphore pattern for lock-free concurrent access.
How It Works
%% ETS table: hackney_host_limits
%% Key: {Host, Port} -> Value: current_count
%% Acquire a slot (blocks with backoff until available or timeout)
acquire(Host, Port, MaxPerHost, Timeout) ->
    Key = {Host, Port},
    Count = ets:update_counter(hackney_host_limits, Key, {2, 1}, {Key, 0}),
    case Count =< MaxPerHost of
        true ->
            ok;
        false ->
            %% Over the limit: undo the increment, then back off and retry
            %% until Timeout expires (backoff_retry/4 is sketched below)
            ets:update_counter(hackney_host_limits, Key, {2, -1}),
            backoff_retry(Host, Port, MaxPerHost, Timeout)
    end.
%% Release a slot (the 0 floor keeps the count from going negative)
release(Host, Port) ->
    ets:update_counter(hackney_host_limits, {Host, Port}, {2, -1, 0, 0}).
Why ETS Counting Semaphore?
- Lock-free - ets:update_counter is atomic
- Per-host isolation - Different hosts don't block each other
- No process bottleneck - No gen_server call for every request
- Backpressure - Requests wait when limit reached
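To make the elided retry step concrete, the backoff referenced in acquire/4 above could look like the sketch below; backoff_retry/4, the 10 ms delay, and the {error, checkout_timeout} return value are assumptions, not hackney's actual internals.
%% Hypothetical helper for the acquire/4 snippet above: sleep briefly and
%% retry until the remaining Timeout budget is spent.
backoff_retry(_Host, _Port, _MaxPerHost, Timeout) when Timeout =< 0 ->
    {error, checkout_timeout};
backoff_retry(Host, Port, MaxPerHost, Timeout) ->
    Delay = 10,
    timer:sleep(Delay),
    acquire(Host, Port, MaxPerHost, Timeout - Delay).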
Configuration
%% Default is 50 concurrent connections per host; raise it for a single request:
hackney:get(URL, [], <<>>, [{max_per_host, 100}]).
%% Per-request timeout for acquiring a slot
hackney:get(URL, [], <<>>, [{checkout_timeout, 5000}]).
SSL Upgrade Strategy
HTTPS requests use TCP connection upgrade rather than direct SSL connections:
1. Get TCP connection (from pool or new)
2. Upgrade to SSL in-place: ssl:connect(Socket, SslOpts)
3. Use SSL connection for request
4. Close connection (SSL connections not pooled)
5. Trigger TCP prewarm for next HTTPS request
Benefits
- Connection reuse - Pooled TCP connections can serve HTTP or HTTPS
- Prewarm works for HTTPS - TCP connections ready to upgrade
- Security - SSL state never shared between requests
Code Flow
%% In hackney.erl
connect_pool(Host, Port, Transport, Opts) ->
%% Always checkout as TCP
case hackney_pool:checkout(Host, Port, hackney_tcp, Opts) of
{ok, PoolInfo, Pid} ->
%% Upgrade if HTTPS
case Transport of
hackney_ssl ->
ok = hackney_conn:upgrade_to_ssl(Pid, SslOpts),
{ok, PoolInfo, Pid};
_ ->
{ok, PoolInfo, Pid}
end;
...
end.
Protocol Selection
Hackney supports three HTTP protocols: HTTP/1.1, HTTP/2, and HTTP/3 (experimental). The protocol selection is controlled via the protocols option.
Default Protocols
By default, hackney uses [http2, http1]:
%% Default behavior - HTTP/2 preferred, HTTP/1.1 fallback
hackney:get("https://example.com/")The default can be changed via application environment:
%% In sys.config or at runtime
application:set_env(hackney, default_protocols, [http2, http1]).
Enabling HTTP/3 (Experimental)
HTTP/3 uses QUIC (UDP transport). To enable HTTP/3:
%% Per-request: opt-in to HTTP/3
hackney:get("https://example.com/", [], <<>>, [
{protocols, [http3, http2, http1]}
]).
%% Application-wide: enable HTTP/3 by default
application:set_env(hackney, default_protocols, [http3, http2, http1]).
Important considerations for HTTP/3:
- Experimental - QUIC support is still maturing
- UDP may be blocked - Corporate firewalls often block UDP
Protocol Priority
Protocols are tried in order. With [http3, http2, http1]:
- If QUIC NIF is available and server supports HTTP/3: use HTTP/3
- Otherwise, ALPN negotiates HTTP/2 or HTTP/1.1 over TLS
- Server chooses the highest protocol it supports
Forcing a Single Protocol
%% Force HTTP/1.1 only (no HTTP/2 negotiation)
hackney:get(URL, [], <<>>, [{protocols, [http1]}]).
%% Force HTTP/2 only
hackney:get(URL, [], <<>>, [{protocols, [http2]}]).
%% HTTP/3 only (will fail if QUIC unavailable or server doesn't support it)
hackney:get(URL, [], <<>>, [{protocols, [http3]}]).
HTTP/3 and QUIC Architecture
HTTP/3 connections use QUIC (UDP-based transport) via a NIF implementation built on lsquic and BoringSSL.
Event-Driven Architecture
Unlike TCP connections which use one process per connection, QUIC uses an event-driven architecture with Erlang's dirty schedulers:
┌─────────────────────────────────────────────────────────────────┐
│ Owner Process (Erlang) │
│ │
│ 1. connect() → Creates QUIC connection, arms enif_select() │
│ │
│ 2. Receives {select, Resource, Ref, ready_input} │
│ └── Socket has data ready │
│ │
│ 3. Calls hackney_quic:process(ConnRef) │
│ └── Runs on dirty I/O scheduler │
│ └── Receives UDP packets │
│ └── Processes lsquic engine │
│ └── Triggers callbacks (headers, data, etc.) │
│ └── Returns next timeout in ms │
│ │
│ 4. Receives {quic, ConnRef, Event} │
│ └── {connected, Info} │
│ └── {stream_headers, StreamId, Headers, Fin} │
│ └── {stream_data, StreamId, Data, Fin} │
│ └── {closed, Reason} │
│ │
│ 5. Schedules timer: erlang:send_after(TimeoutMs, self(), ...) │
│ └── Calls process() again when timer fires │
└─────────────────────────────────────────────────────────────────┘
Why Event-Driven (Not Thread-Per-Connection)?
The QUIC NIF originally used a dedicated I/O thread per connection, but this caused race conditions between thread shutdown and Erlang GC. The event-driven approach eliminates these races (an owner-side loop is sketched after the comparison table):
| Aspect | Thread-per-Connection | Event-Driven (Current) |
|---|---|---|
| Synchronization | Complex (atomics, mutexes) | None needed |
| Race conditions | Possible at shutdown | Eliminated |
| GC interaction | Thread must be joined | Resource auto-cleanup |
| Scheduling | OS thread scheduler | Erlang dirty scheduler |
| Resource usage | One thread per conn | Shared dirty schedulers |
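To make the event-driven flow concrete, here is a minimal owner-side receive loop. The {select, ...} and {quic, ConnRef, ...} message shapes and the process/1 return value come from the description above; the loop structure, the {quic_tick, ConnRef} timer message, and the lack of timer bookkeeping are illustrative simplifications, not hackney's actual code.
%% Illustrative owner-side event loop. ConnRef is the resource returned by
%% hackney_quic:connect/4.
quic_loop(ConnRef) ->
    receive
        {select, _Resource, _Ref, ready_input} ->
            %% UDP data is ready: run the lsquic engine (dirty scheduler)
            tick(ConnRef);
        {quic_tick, ConnRef} ->
            %% lsquic asked to be ticked again after the previous timeout
            tick(ConnRef);
        {quic, ConnRef, {connected, _Info}} ->
            quic_loop(ConnRef);
        {quic, ConnRef, {stream_headers, _StreamId, _Headers, _Fin}} ->
            %% a real caller would hand these to its response handler
            quic_loop(ConnRef);
        {quic, ConnRef, {stream_data, _StreamId, _Data, _Fin}} ->
            quic_loop(ConnRef);
        {quic, ConnRef, {closed, _Reason}} ->
            ok
    end.

tick(ConnRef) ->
    %% process/1 drains the socket, drives lsquic, and returns the next timeout
    TimeoutMs = hackney_quic:process(ConnRef),
    %% Real code would cancel any previously scheduled tick before re-arming.
    erlang:send_after(TimeoutMs, self(), {quic_tick, ConnRef}),
    quic_loop(ConnRef).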
NIF Components
┌─────────────────────────────────────────────────────────────────┐
│ hackney_quic.erl │
│ - connect/4: Start QUIC connection │
│ - process/1: Process pending I/O (dirty NIF) │
│ - open_stream/1: Create new HTTP/3 stream │
│ - send_headers/4: Send HTTP/3 request headers │
│ - send_data/4: Send request body │
│ - close/2: Close connection │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ hackney_quic_nif.c │
│ - nif_connect: Creates QuicConn resource, arms enif_select() │
│ - nif_process: Dirty NIF, calls quic_conn_process() │
│ - nif_open_stream: Opens bidirectional stream │
│ - nif_send_headers: Encodes and sends QPACK headers │
│ - nif_send_data: Sends DATA frames on stream │
│ - nif_close: Initiates graceful shutdown │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ quic_conn.c │
│ QuicConn resource: │
│ - lsquic_engine_t *engine (one engine per connection) │
│ - lsquic_conn_t *conn (QUIC connection handle) │
│ - SSL_CTX *ssl_ctx (BoringSSL TLS context) │
│ - int sockfd (UDP socket) │
│ - QuicStream *streams (linked list of active streams) │
│ - ErlNifMutex *mutex (protects engine access) │
│ │
│ quic_conn_process(): │
│ 1. recvfrom() all pending packets │
│ 2. lsquic_engine_packet_in() feeds to lsquic │
│ 3. lsquic_engine_process_conns() triggers callbacks │
│ 4. lsquic_engine_send_unsent_packets() sends responses │
│ 5. Returns next timeout from lsquic_engine_earliest_adv_tick() │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ lsquic (C library) │
│ - QUIC protocol implementation (RFC 9000) │
│ - HTTP/3 framing (RFC 9114) │
│ - QPACK header compression (RFC 9204) │
│ - Uses BoringSSL for TLS 1.3 │
└─────────────────────────────────────────────────────────────────┘
Connection Lifecycle
Owner Process NIF/lsquic
│ │
│ hackney_quic:connect(...) │
├────────────────────────────────────►│ Create UDP socket
│ │ Create SSL_CTX
│ │ Create lsquic engine
│ │ lsquic_engine_connect()
│ │ lsquic_engine_process_conns()
│ │ enif_select(READ)
│◄────────────────────────────────────┤ {ok, ConnRef}
│ │
│ {select, _, _, ready_input} │
│◄────────────────────────────────────┤ UDP packet received
│ │
│ hackney_quic:process(ConnRef) │
├────────────────────────────────────►│ [dirty scheduler]
│ │ recvfrom() packets
│ │ lsquic_engine_packet_in()
│ │ lsquic_engine_process_conns()
│ │ └─► on_hsk_done callback
│ {quic, ConnRef, {connected, Info}} │ └─► enif_send()
│◄────────────────────────────────────┤
│ │ enif_select(READ)
│◄────────────────────────────────────┤ TimeoutMs
│ │
│ erlang:send_after(TimeoutMs, ...) │
│ │
│ ... (request/response cycle) ... │
│ │
│ hackney_quic:close(ConnRef, ...) │
├────────────────────────────────────►│ lsquic_conn_close()
│ │ enif_select(STOP)
│ {quic, ConnRef, {closed, normal}} │
│◄────────────────────────────────────┤
│ │
│ (ConnRef garbage collected) │
│ │ quic_conn_destroy()
│ │ lsquic_engine_destroy()
│ │ SSL_CTX_free()
│ │ close(sockfd)
│ │
Thread Safety
The QUIC NIF uses minimal synchronization:
| Component | Protection | Reason |
|---|---|---|
| conn->engine | Mutex | lsquic is not thread-safe |
| conn->destroyed | Atomic CAS | Prevent double-free |
| conn->streams | Mutex (via engine) | Modified in callbacks |
| Socket I/O | None | Single caller via dirty scheduler |
The key insight is that process() is always called from the same Erlang process (the owner), and runs on a dirty scheduler. This serializes access naturally.
Resource Cleanup
When the owner process dies or the connection is closed:
- Erlang GC detects no more references to ConnRef
- quic_conn_resource_dtor() is called
- quic_conn_destroy() closes lsquic and frees resources
- No thread joining needed (no I/O thread)
This automatic cleanup prevents resource leaks even if the owner crashes.
HTTP/2 Multiplexing
HTTP/2 connections are handled differently from HTTP/1.1. A single HTTP/2 connection can handle multiple concurrent requests via stream multiplexing.
HTTP/2 Pool Design
The pool maintains a separate map for HTTP/2 connections:
-record(state, {
%% ... existing fields ...
%% HTTP/2 connections: one per host, shared across callers
h2_connections = #{} %% #{Key => Pid}
}).
Key differences from HTTP/1.1 pooling (a checkout sketch follows the table):
| Aspect | HTTP/1.1 | HTTP/2 |
|---|---|---|
| Connections per host | Multiple (pool) | One (shared) |
| Checkout behavior | Exclusive access | Shared access |
| Checkin behavior | Return to pool | Keep in pool |
| Request handling | Sequential | Multiplexed streams |
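As a sketch of the shared-connection lookup, only the h2_connections map shape is taken from the record above; the {Host, Port} key and the function body are illustrative, not the actual pool code.
%% Illustrative: HTTP/2 connections are shared, so "checkout" is just a lookup
%% and the Pid stays registered for other callers.
checkout_h2(Host, Port, #state{h2_connections = H2}) ->
    case maps:find({Host, Port}, H2) of
        {ok, Pid} -> {ok, Pid};   %% reuse the single shared connection
        error -> none             %% caller falls back to the normal TCP flow
    end.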
HTTP/2 Connection Flow
hackney:get("https://api.example.com/data")
│
▼
┌─────────────────────────────────────┐
│ 1. Check for existing HTTP/2 conn │
│ checkout_h2(Host, Port, ...) │
│ → Returns {ok, Pid} or none │
└─────────────────┬───────────────────┘
│
┌─────────┴─────────┐
│ │
{ok, Pid} none
(reuse!) │
│ ▼
│ ┌─────────────────────┐
│ │ 2. Normal TCP flow │
│ │ checkout → new │
│ │ → upgrade SSL │
│ └──────────┬──────────┘
│ │
│ ▼
│ ┌─────────────────────┐
│ │ 3. Check protocol │
│ │ get_protocol() │
│ │ → http2 | http1 │
│ └──────────┬──────────┘
│ │
│ ┌─────┴─────┐
│ │ │
│ http2 http1
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ 4. Register H2 │ │
│ │ register_h2()│ │
│ └────────┬────────┘ │
│ │ │
└──────┬──────┘ │
│ │
▼ │
┌─────────────────────────────────────┐
│ 5. Send request │
│ HTTP/2: assign StreamId │
│ HTTP/1.1: send directly │
└─────────────────────────────────────┘
Stream Multiplexing in hackney_conn
Each hackney_conn process maintains a map of active HTTP/2 streams:
-record(conn_data, {
%% HTTP/2 state machine (from cowlib)
h2_machine :: tuple(),
%% Active streams: #{StreamId => {Caller, State}}
h2_streams = #{} :: #{
pos_integer() => {gen_statem:from(), atom()}
}
}).
When a request arrives:
%% In hackney_conn.erl
do_h2_request(From, Method, Path, Headers, Body, Data) ->
%% 1. Get next stream ID from h2_machine
{ok, StreamId, H2Machine1} = hackney_cow_http2_machine:init_stream(...),
%% 2. Track caller for this stream
Streams = maps:put(StreamId, {From, waiting_response}, Data#conn_data.h2_streams),
%% 3. Send HEADERS frame (and DATA if body present)
HeadersFrame = hackney_cow_http2:headers(StreamId, ...),
Transport:send(Socket, HeadersFrame),
%% 4. Return updated state (caller will receive reply when response arrives)
{keep_state, Data#conn_data{h2_streams = Streams}}.
When a response arrives:
%% Response for StreamId received
handle_h2_frame({headers, StreamId, ...}, Data) ->
%% Lookup caller from h2_streams
{From, _State} = maps:get(StreamId, Data#conn_data.h2_streams),
%% Reply to the correct caller
gen_statem:reply(From, {ok, Status, Headers, Body}),
%% Remove completed stream
Streams = maps:remove(StreamId, Data#conn_data.h2_streams),
{ok, Data#conn_data{h2_streams = Streams}}.
Benefits of HTTP/2 Multiplexing
- Reduced latency - No connection setup for subsequent requests
- Better resource usage - One TCP connection instead of many
- Application-level head-of-line blocking avoided - Responses can arrive out of order (TCP-level head-of-line blocking still applies)
- Server efficiency - Servers prefer fewer connections with more streams
ALPN Protocol Negotiation
HTTP/2 is negotiated during TLS handshake via ALPN:
%% In hackney_ssl.erl
alpn_opts(Opts) ->
Protocols = proplists:get_value(protocols, Opts, [http2, http1]),
AlpnProtos = [proto_to_alpn(P) || P <- Protocols],
[{alpn_advertised_protocols, AlpnProtos}].
proto_to_alpn(http2) -> <<"h2">>;
proto_to_alpn(http1) -> <<"http/1.1">>.
%% After connection
get_negotiated_protocol(SslSocket) ->
case ssl:negotiated_protocol(SslSocket) of
{ok, <<"h2">>} -> http2;
_ -> http1
end.
Connection Prewarm
After first use of a host, the pool maintains warm TCP connections ready for immediate use.
How It Works
- First request to api.example.com:443 completes
- On checkin, pool marks host as "activated"
- Pool creates prewarm_count (default 4) TCP connections
- Next request gets connection immediately (no connect latency)
Configuration
%% Global default
application:set_env(hackney, prewarm_count, 4).
%% Per-pool
hackney_pool:start_pool(mypool, [{prewarm_count, 8}]).
%% Explicit prewarm
hackney_pool:prewarm(default, "api.example.com", 443, 10).
Prewarm for HTTPS
When an SSL connection is checked in (and closed), the pool still triggers TCP prewarm. This ensures TCP connections are ready for the next HTTPS request to upgrade.
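To observe prewarm, the per-host stats described below under Monitoring and Stats can be checked after a first request; the numbers shown in the comment are only an example of what to expect with the default prewarm_count of 4.
%% Illustrative check: after the first request to a host, the pool should
%% report idle (prewarmed) TCP connections for it.
_ = hackney:get("https://api.example.com/data"),
timer:sleep(200),  %% give the asynchronous prewarm a moment to finish
hackney_pool:host_stats(default, "api.example.com", 443).
%% e.g. [{active, 0}, {in_use, 0}, {free, 4}]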
Request Flow
Complete flow for an HTTPS request with pooling:
hackney:get("https://api.example.com/data")
│
▼
┌─────────────────────────────────────┐
│ 1. Load Regulation │
│ acquire("api.example.com", 443, │
│ MaxPerHost, Timeout) │
│ → Blocks if at limit │
└─────────────────┬───────────────────┘
│ ok
▼
┌─────────────────────────────────────┐
│ 2. Pool Checkout │
│ checkout(Host, 443, hackney_tcp) │
│ → Returns Pid or none │
└─────────────────┬───────────────────┘
│
┌─────────┴─────────┐
│ │
{ok, Pid} none
│ │
│ ▼
│ ┌─────────────────┐
│ │ Create new conn │
│ │ hackney_conn_sup│
│ └────────┬────────┘
│ │
└────────┬─────────┘
│
▼
┌─────────────────────────────────────┐
│ 3. SSL Upgrade │
│ upgrade_to_ssl(Pid, SslOpts) │
│ → TCP socket becomes SSL │
└─────────────────┬───────────────────┘
│
▼
┌─────────────────────────────────────┐
│ 4. HTTP Request │
│ send_request(Pid, Method, ...) │
│ recv_response(Pid) │
└─────────────────┬───────────────────┘
│
▼
┌─────────────────────────────────────┐
│ 5. Checkin (async) │
│ Connection closed (SSL) │
│ TCP prewarm triggered │
└─────────────────┬───────────────────┘
│
▼
┌─────────────────────────────────────┐
│ 6. Load Regulation Release │
│ release("api.example.com", 443) │
│ → Slot available for next req │
└─────────────────────────────────────┘
Monitoring and Stats
Pool Stats
hackney_pool:get_stats(PoolName).
%% Returns:
%% [{name, PoolName},
%% {max, MaxConnections},
%% {in_use_count, InUse},
%% {free_count, Free},
%% {queue_count, 0}] %% Always 0, load regulation handles queuing
Per-Host Stats
hackney_pool:host_stats(PoolName, Host, Port).
%% Returns:
%% [{active, N}, %% Currently in use (from load_regulation)
%% {in_use, N}, %% Checked out from pool
%% {free, N}] %% Available in pool
Load Regulation Stats
hackney_load_regulation:current(Host, Port).
%% Returns: integer() - current concurrent connections to host
Advantages of This Architecture
vs. hackney 1.x
| Aspect | 1.x | 2.x |
|---|---|---|
| State storage | ETS tables | Process state |
| Socket ownership | Transferred between processes | Always connection process |
| Error cleanup | Manual via manager | Automatic via process exit |
| SSL pooling | Yes (security risk) | No (TCP only) |
| Connection limits | Global pool size | Per-host limits |
| Prewarm | No | Yes |
vs. Other HTTP Clients
Process isolation: Each connection is independent. A slow response on one connection doesn't block others. A crash in one connection doesn't affect others.
Backpressure: Load regulation naturally applies backpressure when a server is overwhelmed. Requests wait rather than creating unbounded connections.
Resource control: Per-host limits prevent a single slow host from consuming all connections. Different hosts are isolated.
SSL security: SSL connections are never reused, preventing session confusion attacks and ensuring fresh handshakes.
Prewarm efficiency: Frequently-used hosts have warm connections ready, eliminating connection latency for subsequent requests.
Configuration Reference
Pool Options
| Option | Default | Description |
|---|---|---|
| pool_size / max_connections | 50 | Max connections in pool |
| timeout / keepalive_timeout | 2000 | Idle timeout in ms (max 2000ms) |
| prewarm_count | 4 | Connections to maintain per host |
Request Options
| Option | Default | Description |
|---|---|---|
| pool | default | Pool name, or false for no pooling |
| max_per_host | 50 | Max concurrent connections to host |
| checkout_timeout | 8000 | Timeout to acquire connection slot |
| connect_timeout | 8000 | TCP connect timeout |
| recv_timeout | 5000 | Response receive timeout |
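As a combined example, the request options above can be passed per call; the values here simply restate the defaults.
%% All request options from the table above in a single call
hackney:get(URL, [], <<>>, [
    {pool, default},
    {max_per_host, 50},
    {checkout_timeout, 8000},
    {connect_timeout, 8000},
    {recv_timeout, 5000}
]).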
Application Environment
%% In sys.config or application:set_env
{hackney, [
{pool_handler, hackney_pool},
{max_connections, 50},
{timeout, 2000},
{prewarm_count, 4}
]}.