SnmpKit Timeout Documentation
Overview
SnmpKit uses two distinct types of timeouts to handle SNMP operations safely and efficiently:
- PDU Timeout - How long to wait for each individual SNMP packet response
- Task Timeout - Maximum time allowed for entire operations to prevent hangs
PDU Timeout (:timeout parameter)
The :timeout parameter controls how long to wait for each individual SNMP PDU (Protocol Data Unit) response from the target device.
Default Values
- GET/GETBULK operations: 10 seconds (10,000 ms)
- Walk operations: 30 seconds (30,000 ms)
- Table walk operations: 50 seconds (50,000 ms)
Behavior
- Applied to each individual SNMP packet sent to the device
- If a device doesn't respond within this time, that specific PDU times out
- For multi-PDU operations (walks), each PDU gets its own timeout
Examples
# Single PDU with 5-second timeout
SnmpKit.SNMP.get("192.168.1.1", "sysDescr.0", timeout: 5_000)
# Walk where each GETBULK PDU has 15-second timeout
SnmpKit.SNMP.walk("192.168.1.1", "ifTable", timeout: 15_000)Task Timeout (internal protection)
Task timeouts protect against operations hanging indefinitely due to bugs or network issues.
Behavior by Operation Type
GET/GETBULK Operations
- Task timeout:
PDU_timeout + 1000ms - Purpose: Quick safeguard against runaway single-PDU operations
- Example: 10s PDU timeout → 11s task timeout
Walk Operations
- Task timeout: 20 minutes (1,200,000 ms)
- Purpose: Allow large table walks while preventing infinite hangs
- Rationale: Large tables may need 100+ PDUs × 30s each = substantial time
Mixed Operations
- Task timeout: 20 minutes if any walks present, otherwise
PDU_timeout + 1000ms - Purpose: Apply appropriate timeout based on operation mix
Walk Operations: Multi-PDU Behavior
Walk operations retrieve all OIDs under a root by sending multiple GETBULK requests:
Timeline Example
Root OID: "1.3.6.1.2.1.2.2.1" (ifTable)
PDU timeout: 10 seconds
Target has 50 interfaces
PDU 1: Get OIDs 1-30 → 10s timeout → Success
PDU 2: Get OIDs 31-60 → 10s timeout → Success
PDU 3: Get OIDs 61-90 → 10s timeout → Success
...
Total time: ~50-100 seconds (varies by device response time)
Task timeout: 20 minutes (allows operation to complete)Why This Matters
- Before timeout fixes: Operations failed after ~11 seconds regardless of PDU success
- After timeout fixes: Operations can run up to 20 minutes as long as individual PDUs respond
Per-Request Timeout Override
Individual requests can override the global timeout:
Single Target Examples
# Global 10s timeout, but this device gets 30s
SnmpKit.SNMP.walk("slow-device", "ifTable", timeout: 30_000)Multi-Target Examples
# Different timeouts per device
MultiV2.walk_multi([
{"fast-switch", "ifTable", [timeout: 5_000]}, # 5s per PDU
{"slow-router", "ifTable", [timeout: 30_000]}, # 30s per PDU
{"normal-device", "ifTable"} # Uses global timeout
], timeout: 15_000) # Global: 15s per PDUCommon Timeout Scenarios
Scenario 1: Fast Local Network
# Devices respond quickly, use shorter timeouts for faster failure detection
opts = [timeout: 3_000] # 3 secondsScenario 2: Slow WAN Links
# Devices over slow links need more time
opts = [timeout: 30_000] # 30 secondsScenario 3: Large Enterprise Tables
# Large routing/interface tables need longer per-PDU timeouts
opts = [timeout: 45_000] # 45 seconds per PDU
# Task timeout automatically allows up to 20 minutes totalScenario 4: Mixed Network Performance
# Different devices have different performance characteristics
MultiV2.walk_multi([
{"core-switch-1", "ifTable", [timeout: 10_000]}, # Fast core
{"wan-router-1", "bgpRouteTable", [timeout: 60_000]}, # Slow WAN + large table
{"access-switch-1", "ifTable", [timeout: 5_000]} # Fast access
], timeout: 15_000) # Default for devices without specific timeoutTimeout Error Types
PDU Timeout Errors
{:error, :timeout} # Individual PDU timed outTask Timeout Errors (should be rare after fixes)
{:error, {:task_failed, :timeout}} # Task killed by internal timeoutNetwork Errors
{:error, {:network_error, :hostname_resolution_failed}}
{:error, {:network_error, :connection_refused}}Troubleshooting Timeout Issues
Problem: Operations failing with :timeout
Solution: Increase PDU timeout for slow devices/networks
# Instead of default 10s
SnmpKit.SNMP.walk(target, oid, timeout: 30_000)Problem: Large table walks failing
Check: Are you getting {:task_failed, :timeout} or :timeout?
{:task_failed, :timeout}: Internal task timeout (should be fixed):timeout: Individual PDU timeout (increase PDU timeout)
Problem: Mixed performance in multi-target operations
Solution: Use per-request timeout overrides
MultiV2.walk_multi([
{"fast-device", oid, [timeout: 5_000]},
{"slow-device", oid, [timeout: 30_000]}
])API Reference Summary
Core Functions
SnmpKit.SNMP.get(target, oid, opts)-:timeout= PDU timeoutSnmpKit.SNMP.walk(target, oid, opts)-:timeout= per-PDU timeoutSnmpKit.SNMP.get_bulk(target, oid, opts)-:timeout= PDU timeout
Multi-Target Functions
MultiV2.get_multi(targets, opts)- Global PDU timeoutMultiV2.walk_multi(targets, opts)- Global per-PDU timeout- Per-request:
{target, oid, [timeout: ms]}- Override for specific request
Timeout Parameters
- Global:
timeout: millisecondsin function options - Per-request:
timeout: millisecondsin individual request options - Validation: Non-positive values fall back to global timeout
- Task protection: Automatic, no user configuration needed
Migration Notes
From Older Versions
If upgrading from versions with timeout issues:
- ✅ Existing timeout parameters work unchanged
- ✅ Walk operations no longer fail prematurely
- ✅ Per-request timeouts now work as documented
- ℹ️ No API changes required
Best Practices
- Start with defaults - Work for most scenarios
- Measure actual response times - Set timeouts based on real network behavior
- Use per-request overrides - For mixed-performance environments
- Monitor for timeout errors - Adjust timeouts based on operational data