PcapFileEx Usage Rules for LLMs
View SourceThis guide helps AI coding assistants generate correct, performant PcapFileEx code.
Critical Decision Trees
1. Always Use Auto-Detection
✅ ALWAYS use these (auto-detect PCAP/PCAPNG):
❌ AVOID unless you're CERTAIN of file format:
PcapFileEx.Pcap.open/1(PCAP only - fails on PCAPNG)PcapFileEx.PcapNg.open/1(PCAPNG only - fails on PCAP)
Why: File extensions lie. A .pcap file might be PCAPNG format. Auto-detection prevents "wrong magic number" errors.
2. Choose the Right Access Pattern
# Small files (<100MB) - load all into memory
{:ok, packets} = PcapFileEx.read_all("small.pcap")
# Large files (>100MB) - stream lazily
PcapFileEx.stream!("large.pcap")
|> Stream.filter(fn p -> :http in p.protocols end)
|> Enum.take(100)
# Need pre-filtering (10-100x faster for selective queries)
{:ok, reader} = PcapFileEx.open("huge.pcap")
:ok = PcapFileEx.Pcap.set_filter(reader, [
PreFilter.protocol("tcp"),
PreFilter.port_dest(80)
])
packets = PcapFileEx.Stream.from_reader!(reader) |> Enum.take(10)
PcapFileEx.Pcap.close(reader)3. Resource Management
✅ Automatic (recommended) - no manual cleanup:
PcapFileEx.stream!("file.pcap") |> Enum.to_list()✅ Manual - MUST close when done:
{:ok, reader} = PcapFileEx.open("file.pcap")
try do
# Use reader
packets = PcapFileEx.Stream.from_reader!(reader) |> Enum.to_list()
after
PcapFileEx.Pcap.close(reader) # or PcapNg.close(reader)
end4. Filtering Strategy Selection
| File Size | Query Type | Use This | Why |
|---|---|---|---|
| >100MB | Simple (IP/port/protocol) | PreFilter | 10-100x faster (Rust-side) |
| Any | Complex application logic | Filter/DisplayFilter | Flexible (Elixir-side) |
| Any | Wireshark-style expressions | DisplayFilter | Familiar syntax |
5. Writer Format Selection
| Input Format | Output Needed | Use This | Why |
|---|---|---|---|
| Any | Auto-detect | PcapFileEx.write/3 | Detects from extension (.pcap vs .pcapng) |
| PCAP | PCAP | PcapFileEx.PcapWriter | Direct PCAP → PCAP (fastest) |
| PCAPNG | PCAPNG | PcapFileEx.PcapNgWriter | Preserves interfaces |
| Any | Specific format | PcapFileEx.copy/3 with format: | Explicit conversion |
| Small (<1000 pkts) | Any | write!/3 batch | Simplest API |
| Large (>1GB) | Any | Streaming writer | O(1) memory, manual open/write/close |
Common Mistakes to Avoid
❌ Mistake 1: Wrong Format Detection
# DON'T: Assume format from extension
{:ok, reader} = PcapFileEx.Pcap.open("capture.pcap")
# Fails with "wrong magic number" if file is actually PCAPNG!
# DO: Use auto-detection
{:ok, reader} = PcapFileEx.open("capture.pcap")❌ Mistake 2: Forgetting to Close Readers
# DON'T: Open without closing (resource leak!)
{:ok, reader} = PcapFileEx.open("file.pcap")
packets = PcapFileEx.Stream.from_reader!(reader) |> Enum.to_list()
# Reader never closed!
# DO: Use streaming (auto-closes)
packets = PcapFileEx.stream!("file.pcap") |> Enum.to_list()
# OR: Explicitly close
{:ok, reader} = PcapFileEx.open("file.pcap")
try do
packets = PcapFileEx.Stream.from_reader!(reader) |> Enum.to_list()
after
PcapFileEx.Pcap.close(reader)
end❌ Mistake 3: Using Slow Filtering on Large Files
# DON'T: Filter 10GB file in Elixir (VERY SLOW!)
PcapFileEx.stream!("huge_10gb.pcap")
|> Stream.filter(fn p -> :tcp in p.protocols and p.dst.port == 80 end)
|> Enum.take(10)
# DO: Use PreFilter (10-100x faster - Rust-side filtering)
{:ok, reader} = PcapFileEx.open("huge_10gb.pcap")
:ok = PcapFileEx.Pcap.set_filter(reader, [
PreFilter.protocol("tcp"),
PreFilter.port_dest(80)
])
packets = PcapFileEx.Stream.from_reader!(reader) |> Enum.take(10)
PcapFileEx.Pcap.close(reader)❌ Mistake 4: Double-Decoding HTTP Bodies
# DON'T: Manually decode already-decoded body
http = PcapFileEx.Packet.decode_http!(packet)
data = Jason.decode!(http.body) # http.decoded_body already has this!
# DO: Use automatic decoding
http = PcapFileEx.Packet.decode_http!(packet)
IO.inspect(http.decoded_body) # Already parsed JSON/ETF/form data❌ Mistake 5: Loading Huge Files Into Memory
# DON'T: Load 10GB file into memory
{:ok, packets} = PcapFileEx.read_all("huge_10gb.pcap")
stats = PcapFileEx.Stats.compute_from_packets(packets)
# DO: Stream with constant memory
{:ok, stats} = PcapFileEx.Stats.compute_streaming("huge_10gb.pcap")❌ Mistake 6: Accessing PCAPNG Fields on PCAP Files
# DON'T: Assume PCAPNG-specific fields exist
packet.interface_id # nil for PCAP files!
packet.interface # nil for PCAP files!
# DO: Check format or guard
if packet.interface do
IO.puts("Interface: #{packet.interface.name}")
end❌ Mistake 7: Disabling Decode in Wrong Place
# DON'T: Try to disable decoding per-packet
packet = PcapFileEx.Pcap.next_packet(reader, decode: false) # NO SUCH OPTION!
# DO: Disable at stream/read_all level
packets = PcapFileEx.stream!("file.pcap", decode: false) |> Enum.to_list()
{:ok, packets} = PcapFileEx.read_all("file.pcap", decode: false)❌ Mistake 8: Wrong Writer for Format Conversion
# DON'T: Use format-specific writer for conversion (complex and error-prone)
{:ok, packets} = PcapFileEx.read_all("input.pcap")
{:ok, header} = PcapFileEx.get_header("input.pcap")
# Then manually create PCAPNG interfaces, assign interface_id, etc... (lots of code!)
# DO: Use copy/3 for format conversion (simple and handles all details)
PcapFileEx.copy("input.pcap", "output.pcapng", format: :pcapng)❌ Mistake 9: Loading Huge Files for Filtering
# DON'T: Load entire 10GB file into memory to filter (OOM crash!)
{:ok, all_packets} = PcapFileEx.read_all("huge_10gb.pcap")
filtered = Enum.filter(all_packets, fn p -> :http in p.protocols end)
{:ok, header} = PcapFileEx.get_header("huge_10gb.pcap")
PcapFileEx.write!("filtered.pcap", header, filtered)
# DO: Use export_filtered (streaming - constant memory)
PcapFileEx.export_filtered!(
"huge_10gb.pcap",
"filtered.pcap",
fn p -> :http in p.protocols end
)❌ Mistake 10: Forgetting interface_id for PCAPNG
# DON'T: Write PCAP packets directly to PCAPNG (missing interface_id!)
{:ok, packets} = PcapFileEx.read_all("input.pcap")
# Packets have interface_id == nil, but PCAPNG writer requires it!
PcapFileEx.PcapNgWriter.write_all("output.pcapng", interfaces, packets) # FAILS!
# DO: Use high-level API that handles conversion automatically
PcapFileEx.copy("input.pcap", "output.pcapng", format: :pcapng)
# OR: Manually assign interface_id if using low-level API
packets_with_interface = Enum.map(packets, &%{&1 | interface_id: 0})
PcapFileEx.PcapNgWriter.write_all("output.pcapng", interfaces, packets_with_interface)Essential Patterns
Pattern 1: Basic File Reading
# Auto-detect and read all (small files)
{:ok, packets} = PcapFileEx.read_all("capture.pcap")
# Stream for large files
PcapFileEx.stream!("large.pcap")
|> Stream.filter(fn packet -> byte_size(packet.data) > 1000 end)
|> Enum.take(100)
# Manual control with explicit close
{:ok, reader} = PcapFileEx.open("capture.pcap")
{:ok, header} = PcapFileEx.Pcap.get_header(reader)
{:ok, packet} = PcapFileEx.Pcap.next_packet(reader)
PcapFileEx.Pcap.close(reader)Pattern 2: Performance Optimization with PreFilter
Use PreFilter when:
- File is large (>100MB)
- You need only a small subset of packets
- Filter criteria are simple (IP, port, protocol)
# Find first 10 HTTPS packets in 10GB file
{:ok, reader} = PcapFileEx.open("huge.pcap")
:ok = PcapFileEx.Pcap.set_filter(reader, [
PreFilter.protocol("tcp"),
PreFilter.port_dest(443)
])
packets = PcapFileEx.Stream.from_reader!(reader) |> Enum.take(10)
PcapFileEx.Pcap.close(reader)
# Multiple criteria with OR
:ok = PcapFileEx.Pcap.set_filter(reader, [
PreFilter.protocol("tcp"),
PreFilter.any([
PreFilter.port_dest(80),
PreFilter.port_dest(443),
PreFilter.port_dest(8080)
])
])
# IP range filtering
:ok = PcapFileEx.Pcap.set_filter(reader, [
PreFilter.ip_source_cidr("192.168.0.0/16")
])Pattern 3: Elixir-Side Filtering
Use Filter when:
- File is small (<100MB)
- Need complex application logic
- Need to check decoded payloads
PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.by_protocol(:http)
|> PcapFileEx.Filter.by_size(100..1500)
|> PcapFileEx.Filter.by_time_range(start_time, end_time)
|> PcapFileEx.Filter.matching(fn p ->
# Custom logic
:http in p.protocols and String.contains?(p.decoded[:http].path || "", "/api/")
end)
|> Enum.to_list()Pattern 4: DisplayFilter (Wireshark-Style)
# Compile once, reuse multiple times
{:ok, filter} = PcapFileEx.DisplayFilter.compile("tcp.dstport == 80 && ip.src == 192.168.1.1")
packets = PcapFileEx.stream!("file.pcap")
|> PcapFileEx.DisplayFilter.run(filter)
|> Enum.to_list()
# Or inline (compiles on each use)
PcapFileEx.stream!("file.pcap")
|> PcapFileEx.DisplayFilter.filter("http.request.method == \"GET\"")
|> Enum.to_list()Pattern 5: HTTP Decoding
# Packets are automatically decoded (decode: true is default)
{:ok, packets} = PcapFileEx.read_all("capture.pcap")
packet = hd(packets)
# Check what protocols were detected
packet.protocols # [:ether, :ipv4, :tcp, :http]
# Access decoded HTTP
if :http in packet.protocols do
http = packet.decoded[:http] # or use decode_http!/1
IO.inspect(http.method)
IO.inspect(http.decoded_body) # Auto-decoded JSON/ETF/form
end
# TCP reassembly for fragmented HTTP
PcapFileEx.TCP.stream_http_messages("capture.pcapng", types: [:request])
|> Enum.each(fn msg ->
IO.puts("#{msg.http.method} #{msg.http.path}")
IO.inspect(msg.http.decoded_body)
end)Pattern 6: Statistics Computation
# Small files - eager computation
{:ok, stats} = PcapFileEx.Stats.compute("small.pcap")
IO.puts("Total packets: #{stats.total_packets}")
IO.puts("Total bytes: #{stats.total_bytes}")
IO.inspect(stats.protocols) # %{tcp: 100, udp: 50, ...}
# Large files - streaming (constant memory)
{:ok, stats} = PcapFileEx.Stats.compute_streaming("huge.pcap")
# With filtering
tcp_stats = PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.by_protocol(:tcp)
|> PcapFileEx.Stats.compute_from_stream()Pattern 7: Hosts Mapping
Map IP addresses to human-readable hostnames across all API entry points:
# Define hosts mapping (IP string => hostname string)
hosts = %{
"172.25.0.4" => "api-gateway",
"172.65.251.78" => "client-service",
"10.0.0.1" => "database"
}
# Apply to stream/read_all
{:ok, packets} = PcapFileEx.read_all("capture.pcap", hosts_map: hosts)
{:ok, stream} = PcapFileEx.stream("capture.pcap", hosts_map: hosts)
# Apply to HTTP/2 analysis
{:ok, complete, _} = PcapFileEx.HTTP2.analyze("capture.pcap", hosts_map: hosts)
# Endpoints now show hostnames when available
packet = hd(packets)
IO.puts("#{packet.src}") # "client-service:39604"
IO.puts("#{packet.dst}") # "api-gateway:9091"
# Use Endpoint struct directly
alias PcapFileEx.Endpoint
endpoint = Endpoint.new("172.25.0.4", 9091)
endpoint = Endpoint.with_hosts(endpoint, hosts)
IO.puts("#{endpoint}") # "api-gateway:9091"
# Create from IP tuple (useful for HTTP/2)
endpoint = Endpoint.from_tuple({{172, 25, 0, 4}, 9091}, hosts)Key points:
- Hosts map uses IP strings as keys (not tuples)
- Uses
:inet.ntoa/1for consistent IP formatting - All entry points support
:hosts_mapoption - Falls back to IP when hostname not in map
Pattern 8: Raw Packet Processing (No Decoding)
When to disable decoding:
- Only need packet counts or sizes
- Only need raw bytes
- Maximum performance
# Count packets without decoding overhead
packet_count = PcapFileEx.stream!("large.pcap", decode: false)
|> Enum.count()
# Sum packet sizes
total_bytes = PcapFileEx.stream!("large.pcap", decode: false)
|> Stream.map(&byte_size(&1.data))
|> Enum.sum()
# Find largest packet
largest = PcapFileEx.stream!("large.pcap", decode: false)
|> Enum.max_by(&byte_size(&1.data))Protocol Detection vs Decoding
Important distinction:
packet.protocols- List of detected protocols (auto-populated)packet.protocol- Highest layer protocol detectedpacket.decoded- Map of decoded application payloads (auto-populated if decode: true)
packet.protocols # [:ether, :ipv4, :tcp, :http]
packet.protocol # :http
packet.decoded # %{http: %PcapFileEx.HTTP{...}}
# Check before accessing
if :http in packet.protocols do
http = packet.decoded[:http]
# or use helper
http = PcapFileEx.Packet.decode_http!(packet)
endCustom Protocol Decoders
Register custom application-layer protocol decoders using DecoderRegistry (v0.5.0+):
# Register a custom decoder with context passing (new API)
PcapFileEx.DecoderRegistry.register(%{
protocol: :my_protocol,
matcher: fn layers, payload ->
if my_protocol?(layers) do
# Return context to decoder (thread-safe, efficient)
{:match, extract_context(layers)}
else
false
end
end,
decoder: fn context, payload ->
# Use context from matcher (no double-decode!)
{:ok, decode_with_context(payload, context)}
end,
fields: [...]
})
# Now packets are automatically decoded
packet = PcapFileEx.Packet.decode_registered(packet)
# => {:ok, {:my_protocol, decoded_data}}See the PcapFileEx.DecoderRegistry module documentation for complete patterns.
When to Use Each Module
PcapFileEx (Main API - Use This!)
- Auto-detects PCAP vs PCAPNG format
open/1,read_all/1,stream/1- Use this unless you have specific reason not to
PcapFileEx.Pcap
- PCAP-specific operations
- Only use if you're certain file is PCAP format
- Supports both microsecond and nanosecond precision
PcapFileEx.PcapNg
- PCAPNG-specific operations
- Only use if you're certain file is PCAPNG format
- Provides interface metadata
PcapFileEx.Stream
- Lazy streaming with automatic resource cleanup
stream/1for direct file accessfrom_reader/1for manual reader (must close!)
PcapFileEx.Filter
- Elixir-side filtering (post-decode)
- Flexible, supports complex logic
- Use for small files or complex queries
PcapFileEx.PreFilter
- Rust-side filtering (pre-decode)
- 10-100x faster than Filter
- Use for large files with simple criteria
PcapFileEx.DisplayFilter
- Wireshark-style filter expressions
- Familiar syntax for network engineers
- Supports field-based queries
PcapFileEx.Stats
- Statistics computation
compute/1for small filescompute_streaming/1for large files
PcapFileEx.HTTP
- HTTP message decoding
- Automatic body parsing (JSON/ETF/form)
- Request/response extraction
PcapFileEx.TCP
- TCP stream reassembly
stream_http_messages/2for fragmented HTTP- Handles out-of-order packets
PcapFileEx.HTTP2
- HTTP/2 cleartext (h2c) stream reconstruction
analyze/2for PCAP file analysis with options::port- Filter to specific TCP port:decode_content- Auto-decode bodies based on Content-Type (default: true)
- Returns complete and incomplete exchanges with
decoded_bodyfield - Automatic body decoding: JSON, text, multipart/*, binary fallback
- Supports mid-connection captures (with limitations)
- Cleartext only - no TLS/h2 support
- See
PcapFileEx.HTTP2module documentation for content decoding patterns
PcapFileEx.DecoderRegistry
- Register custom application-layer protocol decoders
- Extend protocol support beyond built-in HTTP
- Use new context-passing API (v0.5.0+) for thread-safety and performance
- Matchers return
{:match, context}instead of booleans - Decoders receive
(context, payload)for clean data flow - See
PcapFileEx.DecoderRegistrymodule documentation for complete guide
Security Considerations
ETF Decoding
When HTTP body contains Erlang Term Format (ETF):
# Auto-decoded with :safe flag (prevents code execution)
http = PcapFileEx.Packet.decode_http!(packet)
http.decoded_body # Safe ETF decode or nil if invalidNEVER manually decode ETF from untrusted sources without :safe flag!
Input Validation
Always validate packets from untrusted sources:
# Check file validity before processing
case PcapFileEx.Validator.validate_file("untrusted.pcap") do
{:ok, :pcap} -> # Safe to process
{:ok, :pcapng} -> # Safe to process
{:error, reason} -> # Invalid file
endPerformance Guidelines
File Size Thresholds
- < 10MB: Use
read_all/1(fastest, loads all into memory) - 10-100MB: Use
stream/1(balanced) - > 100MB: Use
stream/1+ PreFilter if selective - > 1GB: Always use streaming + PreFilter
PreFilter Performance
PreFilter is 10-100x faster than Elixir-side filtering for simple criteria:
# Benchmark: Finding 10 packets in 10GB file
# Elixir Filter: ~120 seconds
# PreFilter: ~1.2 seconds (100x faster!)Use PreFilter when:
- ✅ File > 100MB
- ✅ Need small subset of packets
- ✅ Criteria are simple (IP/port/protocol)
Use Elixir Filter when:
- ✅ File < 100MB
- ✅ Need complex application logic
- ✅ Need to check decoded payloads
Timestamp Precision (v0.2.0+)
Understanding Timestamp Fields
Each packet has two timestamp fields:
timestamp(DateTime) - Microsecond precision (6 decimal places)- Use for: Display, logging, general time queries
- Backward compatible with existing code
timestamp_precise(Timestamp) - Nanosecond precision (9 decimal places)- Use for: Sorting, merging multiple files, precise timing analysis
- Required for nanosecond-resolution PCAP files (common on Linux)
Common Use Cases
✅ Merging Packets from Multiple Files
# Merge packets from multiple captures in chronological order
# Using PcapFileEx.Merge (v0.3.0+) - memory-efficient streaming merge
files = ["capture1.pcapng", "capture2.pcapng", "capture3.pcapng"]
# Memory-efficient streaming merge (O(N files) memory)
{:ok, stream} = PcapFileEx.Merge.stream(files)
packets = Enum.to_list(stream)
# With source tracking to identify packet origins
{:ok, stream} = PcapFileEx.Merge.stream(files, annotate_source: true)
packets = Enum.take(stream, 100) # Each item is {packet, metadata}
# Legacy approach (loads all files into memory - not recommended for large files)
all_packets =
files
|> Enum.flat_map(fn file ->
{:ok, packets} = PcapFileEx.read_all(file)
packets
end)
|> Enum.sort_by(& &1.timestamp_precise, PcapFileEx.Timestamp)
# See usage-rules/merging.md for complete merge patterns and best practices✅ Calculating Precise Time Differences
{:ok, packets} = PcapFileEx.read_all("capture.pcapng")
[first, second | _] = packets
# Get difference in nanoseconds
diff_ns = PcapFileEx.Timestamp.diff(second.timestamp_precise, first.timestamp_precise)
IO.puts("Time between packets: #{diff_ns} nanoseconds")
# Convert to other units
diff_us = div(diff_ns, 1000) # microseconds
diff_ms = div(diff_ns, 1_000_000) # milliseconds✅ Filtering by Precise Time Range
# Find packets within a specific nanosecond-precision window
start_ts = PcapFileEx.Timestamp.new(1731065049, 735000000)
end_ts = PcapFileEx.Timestamp.new(1731065049, 736000000)
packets_in_window =
PcapFileEx.stream!("capture.pcapng")
|> Stream.filter(fn p ->
PcapFileEx.Timestamp.compare(p.timestamp_precise, start_ts) != :lt and
PcapFileEx.Timestamp.compare(p.timestamp_precise, end_ts) != :gt
end)
|> Enum.to_list()When to Use Which Field
| Use Case | Use timestamp | Use timestamp_precise |
|---|---|---|
| Display to users | ✅ | ❌ |
| Simple time filters | ✅ | ❌ |
| Sorting packets | ❌ | ✅ |
| Merging files | ❌ | ✅ |
| Sub-microsecond timing | ❌ | ✅ |
| Nanosecond analysis | ❌ | ✅ |
Timestamp API Reference
alias PcapFileEx.Timestamp
# Create timestamp
ts = Timestamp.new(secs, nanos)
# Convert to total nanoseconds
total_ns = Timestamp.to_unix_nanos(ts)
# Convert to DateTime (loses nanosecond precision)
dt = Timestamp.to_datetime(ts)
# Compare timestamps
Timestamp.compare(ts1, ts2) # => :lt | :eq | :gt
# Calculate difference
diff_ns = Timestamp.diff(ts1, ts2) # => integer (nanoseconds)❌ Common Mistake: Using DateTime for Sorting
# DON'T: Use DateTime for sorting (loses nanosecond precision!)
packets
|> Enum.sort_by(& &1.timestamp)
# DO: Use Timestamp for accurate sorting
packets
|> Enum.sort_by(& &1.timestamp_precise, PcapFileEx.Timestamp)Backward Compatibility
Existing code continues to work unchanged:
# All of this still works!
packet.timestamp.year # => 2024
packet.timestamp.month # => 11
DateTime.compare(packet.timestamp, some_datetime) # => :ltRelated Documentation
- Performance Guide - Detailed performance optimization
- Filtering Guide - Complete filtering reference
- HTTP Guide - HTTP/1.x decoding patterns
- HTTP/2 Guide - HTTP/2 cleartext (h2c) analysis patterns
- Traffic Flows Guide - Unified traffic flow analysis by protocol
- Format Guide - PCAP vs PCAPNG differences
- Examples - Complete working examples
PcapFileEx.HTTP2- HTTP/2 cleartext (h2c) analysisPcapFileEx.DecoderRegistry- Custom protocol decoders with context passingPcapFileEx.Merge- Multi-file chronological merge patternsPcapFileEx.PcapWriter/PcapFileEx.PcapNgWriter- Creating and exporting PCAP files