Complete Filtering Guide

PcapFileEx provides three different filtering systems. This guide explains when and how to use each one.

Filtering Systems Overview

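Filter Type   | Where                     | Performance           | Flexibility     | Best For
--------------|---------------------------|-----------------------|-----------------|------------------------------
PreFilter     | Rust-side (pre-decode)    | ⚡⚡⚡ Fastest (10-100x) | Simple criteria | Large files, selective queries
Filter        | Elixir-side (post-decode) | ⚡ Standard            | Very flexible   | Complex logic, small files
DisplayFilter | Elixir-side (post-decode) | ⚡ Standard            | Wireshark-style | Familiar syntax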

Decision Tree: Which Filter to Use?

Is the file > 100MB?
├─ YES: Is the query selective (<10% of packets)?
│   ├─ YES: Are the criteria simple (IP/port/protocol)?
│   │   ├─ YES: Use PreFilter
│   │   └─ NO: Use Filter/DisplayFilter
│   └─ NO: Use Filter/DisplayFilter
└─ NO: Which style do you prefer?
    ├─ Wireshark-style preferred: Use DisplayFilter
    ├─ Function-based preferred: Use Filter
    └─ Simple criteria: Use PreFilter (small benefit)
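
The decision tree can also be written as a small function. This is only a sketch: the `FilterChoice` module, its arguments, and the returned atoms are illustrative names and not part of PcapFileEx. The thresholds are the ones from the tree, and for small files the sketch collapses the style preference into a single "simple criteria or not" flag.

```elixir
defmodule FilterChoice do
  @moduledoc "Hypothetical helper mirroring the decision tree; not part of PcapFileEx."

  @large_file_bytes 100 * 1024 * 1024

  # size_bytes:  file size in bytes
  # selectivity: expected fraction of matching packets (0.0..1.0)
  # simple?:     true when the criteria are only IP/port/protocol
  def choose(size_bytes, selectivity, simple?) when size_bytes > @large_file_bytes do
    if selectivity < 0.10 and simple?, do: :pre_filter, else: :filter_or_display_filter
  end

  # Small files: any system works; PreFilter brings only a small benefit.
  def choose(_size_bytes, _selectivity, true), do: :pre_filter
  def choose(_size_bytes, _selectivity, false), do: :filter_or_display_filter
end

# 500MB file, about 1% of packets wanted, port-only criteria
FilterChoice.choose(500 * 1024 * 1024, 0.01, true)
```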

PreFilter (Rust-Side Filtering)

Overview

  • Location: Rust native code
  • Timing: Before packet decode
  • Performance: 10-100x faster than Elixir filtering
  • Limitation: Only simple criteria (IP, port, protocol)

When to Use PreFilter

Use PreFilter when:

  • The file is large (>100MB)
  • You need a small subset of packets (<10%)
  • The criteria are simple (IP, port, protocol)
  • You terminate early (take/find)

Don't use PreFilter when:

  • The file is small (<10MB), where the overhead is not worth it
  • You need most packets (>50%)
  • You need complex application logic
  • You need to check decoded payloads

Available PreFilter Functions

Protocol Filtering

# PreFilter is assumed to be aliased in these examples, e.g. alias PcapFileEx.PreFilter

# Single protocol
PreFilter.protocol("tcp")
PreFilter.protocol("udp")
PreFilter.protocol("icmp")
PreFilter.protocol("http")

# Multiple protocols (OR)
PreFilter.any([
  PreFilter.protocol("tcp"),
  PreFilter.protocol("udp")
])

Port Filtering

# Destination port
PreFilter.port_dest(80)
PreFilter.port_dest(443)

# Source port
PreFilter.port_source(8080)

# Either source or destination
PreFilter.port(443)

# Multiple ports (OR)
PreFilter.any([
  PreFilter.port_dest(80),
  PreFilter.port_dest(443),
  PreFilter.port_dest(8080)
])

IP Address Filtering

# Source IP (exact)
PreFilter.ip_source("192.168.1.1")

# Destination IP (exact)
PreFilter.ip_dest("10.0.0.1")

# Either source or destination
PreFilter.ip("192.168.1.1")

# CIDR range
PreFilter.ip_source_cidr("192.168.0.0/16")
PreFilter.ip_dest_cidr("10.0.0.0/8")

Combining Filters

# AND semantics (all must match)
PreFilter.all([
  PreFilter.protocol("tcp"),
  PreFilter.port_dest(80)
])
# Packet must be TCP AND destination port 80

# OR semantics (any can match)
PreFilter.any([
  PreFilter.port_dest(80),
  PreFilter.port_dest(443)
])
# Packet can have destination port 80 OR 443

# Nested combinations
PreFilter.all([
  PreFilter.protocol("tcp"),
  PreFilter.any([
    PreFilter.port_dest(80),
    PreFilter.port_dest(443),
    PreFilter.port_dest(8080)
  ])
])
# TCP packets to ports 80, 443, or 8080
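
To make the AND/OR nesting concrete, the sketch below models the same semantics with plain Elixir predicates over maps. It is not how PreFilter is implemented (that work happens on the Rust side); the `Semantics` module and the `%{protocol: ..., dst_port: ...}` map shape are stand-ins invented for illustration.

```elixir
defmodule Semantics do
  # all/1: every predicate must return true (AND)
  def all(preds), do: fn pkt -> Enum.all?(preds, fn pred -> pred.(pkt) end) end

  # any/1: at least one predicate must return true (OR)
  def any(preds), do: fn pkt -> Enum.any?(preds, fn pred -> pred.(pkt) end) end

  # Leaf predicates over the hypothetical map shape
  def protocol(name), do: fn pkt -> pkt.protocol == name end
  def port_dest(port), do: fn pkt -> pkt.dst_port == port end
end

# Same shape as the nested combination: TCP AND (port 80 OR 443 OR 8080)
web_tcp =
  Semantics.all([
    Semantics.protocol("tcp"),
    Semantics.any([
      Semantics.port_dest(80),
      Semantics.port_dest(443),
      Semantics.port_dest(8080)
    ])
  ])

web_tcp.(%{protocol: "tcp", dst_port: 443})  # true
web_tcp.(%{protocol: "udp", dst_port: 443})  # false: fails the protocol check
web_tcp.(%{protocol: "tcp", dst_port: 22})   # false: no port alternative matches
```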

PreFilter Examples

# Example 1: Find HTTPS traffic
{:ok, reader} = PcapFileEx.open("capture.pcap")
:ok = PcapFileEx.Pcap.set_filter(reader, [
  PreFilter.protocol("tcp"),
  PreFilter.port_dest(443)
])
packets = PcapFileEx.Stream.from_reader!(reader) |> Enum.take(100)
PcapFileEx.Pcap.close(reader)

# Example 2: Internal network traffic
{:ok, reader} = PcapFileEx.open("capture.pcap")
:ok = PcapFileEx.Pcap.set_filter(reader, [
  PreFilter.ip_source_cidr("10.0.0.0/8")
])
packets = PcapFileEx.Stream.from_reader!(reader) |> Enum.to_list()
PcapFileEx.Pcap.close(reader)

# Example 3: Web traffic (HTTP or HTTPS)
{:ok, reader} = PcapFileEx.open("capture.pcap")
:ok = PcapFileEx.Pcap.set_filter(reader, [
  PreFilter.protocol("tcp"),
  PreFilter.any([
    PreFilter.port_dest(80),
    PreFilter.port_dest(443)
  ])
])
packets = PcapFileEx.Stream.from_reader!(reader) |> Enum.to_list()
PcapFileEx.Pcap.close(reader)

# Example 4: Clearing filter
{:ok, reader} = PcapFileEx.open("capture.pcap")
:ok = PcapFileEx.Pcap.set_filter(reader, [PreFilter.protocol("tcp")])
tcp_packets = PcapFileEx.Stream.from_reader!(reader) |> Enum.take(100)

:ok = PcapFileEx.Pcap.clear_filter(reader)  # Back to all packets
all_packets = PcapFileEx.Stream.from_reader!(reader) |> Enum.take(100)
PcapFileEx.Pcap.close(reader)

Filter (Elixir-Side Filtering)

Overview

  • Location: Elixir code
  • Timing: After packet decode
  • Performance: Standard
  • Flexibility: Very flexible, full Elixir logic

Available Filter Functions

Protocol Filtering

# Filter by single protocol
PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.by_protocol(:tcp)
|> Enum.to_list()

# Filter by multiple protocols
PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.by_protocol([:tcp, :udp])
|> Enum.to_list()

Size Filtering

# Exact size
PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.by_size(1500)
|> Enum.to_list()

# Size range
PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.by_size(100..1500)
|> Enum.to_list()

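# Minimum size (Elixir has no open-ended range such as 1000.., so use a predicate)
PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.matching(fn p -> byte_size(p.data) >= 1000 end)
|> Enum.to_list()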

Time Range Filtering

start_time = ~U[2025-01-01 00:00:00Z]
end_time = ~U[2025-01-02 00:00:00Z]

PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.by_time_range(start_time, end_time)
|> Enum.to_list()

Endpoint Filtering

# By source endpoint
endpoint = %PcapFileEx.Endpoint{ip: "192.168.1.1", port: 8080}
PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.by_source(endpoint)
|> Enum.to_list()

# By destination endpoint
endpoint = %PcapFileEx.Endpoint{ip: "10.0.0.1", port: 80}
PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.by_destination(endpoint)
|> Enum.to_list()

# By either source or destination
endpoint = %PcapFileEx.Endpoint{ip: "192.168.1.1", port: nil}
PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.by_endpoint(endpoint)
|> Enum.to_list()

Custom Matching

# Custom predicate function
PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.matching(fn packet ->
  # Any custom logic
  :http in packet.protocols and
  byte_size(packet.data) > 1000 and
  packet.timestamp.hour >= 9 and
  packet.timestamp.hour <= 17
end)
|> Enum.to_list()

Chaining Filters

# Combine multiple filters
PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.by_protocol(:tcp)
|> PcapFileEx.Filter.by_size(100..1500)
|> PcapFileEx.Filter.by_time_range(start_time, end_time)
|> PcapFileEx.Filter.matching(fn p ->
  p.dst.port in [80, 443, 8080]
end)
|> Enum.to_list()

Filter Examples

# Example 1: Large HTTP packets
PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.by_protocol(:http)
|> PcapFileEx.Filter.matching(fn p -> byte_size(p.data) >= 1000 end)
|> Enum.to_list()

# Example 2: Traffic to specific server during business hours
server = %PcapFileEx.Endpoint{ip: "10.0.0.1", port: nil}
PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.by_destination(server)
|> PcapFileEx.Filter.matching(fn p ->
  p.timestamp.hour >= 9 and p.timestamp.hour <= 17
end)
|> Enum.to_list()

# Example 3: Complex application logic
PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.Filter.matching(fn packet ->
  cond do
    :http in packet.protocols ->
      http = packet.decoded[:http]
      http.method == "POST" and String.contains?(http.path || "", "/api/")

    :tcp in packet.protocols ->
      packet.dst.port in [80, 443, 8080]

    true ->
      false
  end
end)
|> Enum.to_list()

DisplayFilter (Wireshark-Style)

Overview

  • Location: Elixir code
  • Timing: After packet decode
  • Syntax: Wireshark-style expressions
  • Best for: Users familiar with Wireshark

Supported Operators

Comparison Operators

==    Equal
!=    Not equal
>     Greater than
<     Less than
>=    Greater than or equal
<=    Less than or equal

Logical Operators

&&    AND
||    OR
!     NOT

Field Types

String fields:   "value" or 'value'
Numeric fields:  123, 456.78
IP addresses:    192.168.1.1
Boolean:         true, false

Available Fields

IP Layer

ip.src          Source IP address
ip.dst          Destination IP address
ip.version      IP version (4 or 6)

TCP Layer

tcp.srcport     Source port
tcp.dstport     Destination port
tcp.flags.syn   SYN flag
tcp.flags.ack   ACK flag
tcp.flags.fin   FIN flag
tcp.flags.rst   RST flag

UDP Layer

udp.srcport     Source port
udp.dstport     Destination port

HTTP Layer

http.request.method      HTTP method (GET, POST, etc.)
http.request.uri         Request URI/path
http.request.version     HTTP version
http.response.code       Response status code
http.host                Host header

Packet Metadata

frame.len        Packet length (bytes)
frame.time       Packet timestamp

DisplayFilter Examples

# Example 1: Simple inline filter
packets = PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.DisplayFilter.filter("tcp.dstport == 80")
|> Enum.to_list()

# Example 2: Compiled filter (reuse)
{:ok, filter} = PcapFileEx.DisplayFilter.compile("ip.src == 192.168.1.1 && tcp.dstport == 443")
packets = PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.DisplayFilter.run(filter)
|> Enum.to_list()

# Example 3: HTTP GET requests
packets = PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.DisplayFilter.filter("http.request.method == \"GET\"")
|> Enum.to_list()

# Example 4: Complex expression
packets = PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.DisplayFilter.filter("""
  (ip.src == 192.168.1.1 || ip.dst == 192.168.1.1) &&
  (tcp.dstport == 80 || tcp.dstport == 443) &&
  frame.len > 1000
""")
|> Enum.to_list()

# Example 5: HTTP responses with errors
packets = PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.DisplayFilter.filter("http.response.code >= 400")
|> Enum.to_list()

# Example 6: SYN packets
packets = PcapFileEx.stream!("capture.pcap")
|> PcapFileEx.DisplayFilter.filter("tcp.flags.syn == true && tcp.flags.ack == false")
|> Enum.to_list()

Comparing the Three Approaches

Same Query, Three Ways

Find all HTTPS traffic from 192.168.1.0/24:

Method 1: PreFilter (Fastest for large files)

{:ok, reader} = PcapFileEx.open("large.pcap")
:ok = PcapFileEx.Pcap.set_filter(reader, [
  PreFilter.protocol("tcp"),
  PreFilter.port_dest(443),
  PreFilter.ip_source_cidr("192.168.1.0/24")
])
packets = PcapFileEx.Stream.from_reader!(reader) |> Enum.to_list()
PcapFileEx.Pcap.close(reader)

Method 2: Filter (Most flexible)

# ip_in_cidr?/2 is a user-defined helper that checks CIDR membership
packets = PcapFileEx.stream!("large.pcap")
|> PcapFileEx.Filter.by_protocol(:tcp)
|> PcapFileEx.Filter.matching(fn p ->
  p.dst.port == 443 and ip_in_cidr?(p.src.ip, "192.168.1.0/24")
end)
|> Enum.to_list()
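
Method 2 calls `ip_in_cidr?/2`, which the caller has to supply. One possible IPv4-only version is sketched here. It assumes addresses arrive as dotted-quad strings (as in the `Endpoint` examples); the `CidrHelper` module name is hypothetical, and malformed input simply yields `false`.

```elixir
defmodule CidrHelper do
  import Bitwise

  # True when the IPv4 address string `ip` lies inside `cidr` ("a.b.c.d/len").
  def ip_in_cidr?(ip, cidr) when is_binary(ip) and is_binary(cidr) do
    with [network, len_str] <- String.split(cidr, "/"),
         {prefix_len, ""} when prefix_len in 0..32 <- Integer.parse(len_str),
         {:ok, ip_int} <- to_int(ip),
         {:ok, net_int} <- to_int(network) do
      # Keep the top `prefix_len` bits, then compare the network parts.
      mask = bsl(0xFFFFFFFF, 32 - prefix_len) &&& 0xFFFFFFFF
      (ip_int &&& mask) == (net_int &&& mask)
    else
      _ -> false
    end
  end

  # Converts "a.b.c.d" into a 32-bit integer using Erlang's strict IPv4 parser.
  defp to_int(address) do
    case :inet.parse_ipv4strict_address(String.to_charlist(address)) do
      {:ok, {a, b, c, d}} -> {:ok, bsl(a, 24) + bsl(b, 16) + bsl(c, 8) + d}
      {:error, _} -> :error
    end
  end
end

CidrHelper.ip_in_cidr?("192.168.1.42", "192.168.1.0/24")  # true
CidrHelper.ip_in_cidr?("192.168.2.1", "192.168.1.0/24")   # false
```

With `import CidrHelper` in scope, the unqualified call in Method 2 works as written.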

Method 3: DisplayFilter (Wireshark syntax)

packets = PcapFileEx.stream!("large.pcap")
|> PcapFileEx.DisplayFilter.filter("""
  tcp.dstport == 443 &&
  ip.src >= 192.168.1.0 &&
  ip.src <= 192.168.1.255
""")
|> Enum.to_list()

Advanced Filtering Patterns

Pattern 1: Two-Stage Filtering

Combine PreFilter (fast) with Elixir Filter (flexible):

# Stage 1: PreFilter eliminates ~90% of packets (fast)
{:ok, reader} = PcapFileEx.open("huge.pcap")
:ok = PcapFileEx.Pcap.set_filter(reader, [
  PreFilter.protocol("tcp"),
  PreFilter.port_dest(80)
])

# Stage 2: Elixir Filter for complex logic (on remaining 10%)
packets = PcapFileEx.Stream.from_reader!(reader)
|> Stream.filter(fn p ->
  :http in p.protocols and
  p.decoded[:http].method == "POST" and
  String.contains?(p.decoded[:http].path || "", "/api/users")
end)
|> Enum.to_list()

PcapFileEx.Pcap.close(reader)

Pattern 2: Conditional Filtering

# Different filters based on packet type
packets = PcapFileEx.stream!("capture.pcap")
|> Stream.filter(fn packet ->
  cond do
    :http in packet.protocols ->
      http = packet.decoded[:http]
      http.method in ["POST", "PUT", "DELETE"]

    :dns in packet.protocols ->
      # DNS query packets
      true

    :tcp in packet.protocols ->
      packet.dst.port in [22, 3389]  # SSH or RDP

    true ->
      false
  end
end)
|> Enum.to_list()

Pattern 3: Stateful Filtering

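# Track TCP connections, filter by connection state.
# The state map travels in the accumulator: rebinding a variable inside
# the anonymous function would not be visible to later iterations.
{packets, _connections} =
  PcapFileEx.stream!("capture.pcap")
  |> Enum.reduce({[], %{}}, fn packet, {acc, connections} ->
    if :tcp in packet.protocols do
      conn_key = {packet.src, packet.dst}

      # Update connection state
      # ... stateful logic ...

      # Filter based on state (should_include?/2 is a user-defined helper)
      if should_include?(packet, connections[conn_key]) do
        {[packet | acc], connections}
      else
        {acc, connections}
      end
    else
      {acc, connections}
    end
  end)

packets = Enum.reverse(packets)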
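
As one concrete instance of the stateful logic left open above, the sketch below keeps only packets that belong to connections whose opening SYN was seen earlier in the capture. Plain maps stand in for decoded packets, and the field names (`src`, `dst`, `syn`, `ack`) and the `SynTracker` module are assumptions made for this example only.

```elixir
defmodule SynTracker do
  # Keeps packets from connections whose initial SYN appeared earlier in the list.
  def established_only(packets) do
    {kept, _seen} =
      Enum.reduce(packets, {[], MapSet.new()}, fn packet, {acc, seen} ->
        key = conn_key(packet)

        # Update connection state: a SYN without ACK opens a connection.
        seen = if packet.syn and not packet.ack, do: MapSet.put(seen, key), else: seen

        # Filter based on state.
        if MapSet.member?(seen, key), do: {[packet | acc], seen}, else: {acc, seen}
      end)

    Enum.reverse(kept)
  end

  # Direction-independent key, so both sides of a flow share one entry.
  defp conn_key(%{src: src, dst: dst}), do: Enum.sort([src, dst]) |> List.to_tuple()
end

sample_packets = [
  %{src: {"10.0.0.1", 5000}, dst: {"10.0.0.2", 80}, syn: true, ack: false},
  %{src: {"10.0.0.2", 80}, dst: {"10.0.0.1", 5000}, syn: true, ack: true},
  # Mid-stream packet from a connection whose SYN was never captured
  %{src: {"10.0.0.3", 6000}, dst: {"10.0.0.2", 80}, syn: false, ack: true}
]

kept = SynTracker.established_only(sample_packets)
length(kept)  # 2
```

Because the state lives in the accumulator, the whole capture is consumed eagerly; for very large files, `Stream.transform/3` offers a lazy way to thread the same state.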

Pattern 4: Sampling

# Keep every Nth packet
packets = PcapFileEx.stream!("huge.pcap")
|> Stream.with_index()
|> Stream.filter(fn {_packet, index} -> rem(index, 100) == 0 end)
|> Stream.map(fn {packet, _index} -> packet end)
|> Enum.to_list()

# Random sampling (10%)
packets = PcapFileEx.stream!("huge.pcap")
|> Stream.filter(fn _packet -> :rand.uniform() < 0.1 end)
|> Enum.to_list()
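
Two small variations may be useful here. `Stream.take_every/2` from the standard library expresses every-Nth sampling directly, and seeding `:rand` makes a random sample repeatable across runs. The sketch uses a plain range in place of a packet stream, and the seed values are arbitrary.

```elixir
# Every 100th element, starting with the first
# (equivalent to the with_index / rem(index, 100) == 0 version).
every_nth = 1..1000 |> Stream.take_every(100) |> Enum.to_list()

# Seeding the process-local generator makes the "random" subset reproducible.
sample = fn ->
  :rand.seed(:exsss, {1, 2, 3})
  1..1000 |> Stream.filter(fn _ -> :rand.uniform() < 0.1 end) |> Enum.to_list()
end

first_run = sample.()
second_run = sample.()
first_run == second_run  # true: same seed, same sample
```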

Filter Performance Comparison

Benchmark: 10GB file, 50M packets, find 100 TCP:443 packets

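Method        | Time | Memory | Notes
--------------|------|--------|--------------------------------
PreFilter     | 1.2s | 50MB   | Fastest, Rust-side
Filter        | 120s | 50MB   | 100x slower, Elixir-side
DisplayFilter | 125s | 50MB   | Similar to Filter
Two-stage     | 5s   | 50MB   | PreFilter + complex Elixir logic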
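
Figures like these depend heavily on hardware and on the contents of the capture, so it is worth measuring on your own files. A minimal timing sketch using `:timer.tc/1` from the Erlang standard library follows. The `Timing` module is a hypothetical helper, and the range pipeline merely stands in for a real filtering pipeline.

```elixir
defmodule Timing do
  # Runs `fun` and returns {elapsed_milliseconds, result}.
  def measure(fun) when is_function(fun, 0) do
    {micros, result} = :timer.tc(fun)
    {micros / 1000, result}
  end
end

{ms, matches} =
  Timing.measure(fn ->
    # Stand-in workload: lazily filter, then stop after 100 matches.
    1..1_000_000
    |> Stream.filter(fn n -> rem(n, 1000) == 0 end)
    |> Enum.take(100)
  end)

IO.puts("found #{length(matches)} matches in #{ms} ms")
```

Wrapping each of the three filtering approaches in `Timing.measure/1` on the same file gives a like-for-like comparison.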

Common Filtering Mistakes

❌ Mistake 1: Wrong Filter Choice for Large Files

# DON'T: Use Elixir filter on 10GB file for simple query
PcapFileEx.stream!("10gb.pcap")
|> Stream.filter(fn p -> :tcp in p.protocols and p.dst.port == 443 end)
|> Enum.take(10)  # Takes 2 minutes!

# DO: Use PreFilter
{:ok, r} = PcapFileEx.open("10gb.pcap")
:ok = PcapFileEx.Pcap.set_filter(r, [
  PreFilter.protocol("tcp"),
  PreFilter.port_dest(443)
])
packets = PcapFileEx.Stream.from_reader!(r) |> Enum.take(10)  # Takes 1 second!
PcapFileEx.Pcap.close(r)

❌ Mistake 2: Forgetting to Close Reader

# DON'T: Forget to close
{:ok, r} = PcapFileEx.open("file.pcap")
:ok = PcapFileEx.Pcap.set_filter(r, [...])
packets = PcapFileEx.Stream.from_reader!(r) |> Enum.to_list()
# Missing close!

# DO: Always close
{:ok, r} = PcapFileEx.open("file.pcap")
try do
  :ok = PcapFileEx.Pcap.set_filter(r, [...])
  packets = PcapFileEx.Stream.from_reader!(r) |> Enum.to_list()
after
  PcapFileEx.Pcap.close(r)
end

❌ Mistake 3: Using PreFilter for Broad Queries

# DON'T: PreFilter that matches most packets (overhead not worth it)
{:ok, r} = PcapFileEx.open("file.pcap")
:ok = PcapFileEx.Pcap.set_filter(r, [
  PreFilter.any([  # Matches 90% of packets!
    PreFilter.protocol("tcp"),
    PreFilter.protocol("udp")
  ])
])

# DO: Use Elixir filter or no filter at all
packets = PcapFileEx.stream!("file.pcap")
|> Stream.filter(fn p -> p.protocol in [:tcp, :udp] end)
|> Enum.to_list()

Summary: Filter Selection Guide

Use PreFilter when:

  • ✅ File > 100MB
  • ✅ Selective query (<10% of packets)
  • ✅ Simple criteria (IP/port/protocol)
  • ✅ Need maximum performance

Use Filter when:

  • ✅ Complex application logic
  • ✅ Need to check decoded payloads
  • ✅ Flexible predicate functions
  • ✅ File < 100MB

Use DisplayFilter when:

  • ✅ Familiar with Wireshark syntax
  • ✅ Want readable filter expressions
  • ✅ Field-based queries
  • ✅ Network engineer background