LlmGuard.Detectors.DataLeakage.PIIScanner (LlmGuard v0.3.1)

Scans text for Personally Identifiable Information (PII).

Detects various types of PII including:

  • Email addresses
  • Phone numbers (US and international)
  • Social Security Numbers (SSN)
  • Credit card numbers (with Luhn validation)
  • IP addresses (IPv4 and IPv6)
  • URLs with potentially sensitive paths
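
The Luhn validation mentioned for credit card numbers can be sketched as follows. This is a minimal illustration of the standard Luhn checksum, not LlmGuard's actual implementation; the module name `LuhnSketch` and the choice to ignore spaces and dashes are assumptions made for the example.

```elixir
defmodule LuhnSketch do
  @moduledoc "Illustrative Luhn checksum (hypothetical module, not part of LlmGuard)."

  # Returns true when the digits in `number` pass the Luhn checksum.
  # Spaces and dashes are stripped first; any other non-digit makes the input invalid.
  @spec valid?(String.t()) :: boolean()
  def valid?(number) when is_binary(number) do
    cleaned = String.replace(number, ~r/[\s-]/, "")

    if cleaned != "" and String.match?(cleaned, ~r/^\d+$/) do
      sum =
        cleaned
        |> String.to_charlist()
        |> Enum.reverse()
        |> Enum.map(&(&1 - ?0))
        |> Enum.with_index()
        |> Enum.map(fn
          # Every second digit, counting from the right, is doubled;
          # 9 is subtracted when the doubled value exceeds 9.
          {d, i} when rem(i, 2) == 1 -> if d * 2 > 9, do: d * 2 - 9, else: d * 2
          {d, _i} -> d
        end)
        |> Enum.sum()

      rem(sum, 10) == 0
    else
      false
    end
  end
end

LuhnSketch.valid?("4111 1111 1111 1111")
# => true
LuhnSketch.valid?("4111 1111 1111 1112")
# => false
```

A checksum like this lets a scanner discard 13 to 19 digit sequences that merely look like card numbers, which is presumably why the detector applies it.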

Performance

  • Latency: < 5 ms for typical text (< 1000 chars)
  • Accuracy: 99% precision, 97% recall

Examples

iex> text = "Contact: john@example.com or call 555-1234"
iex> entities = LlmGuard.Detectors.DataLeakage.PIIScanner.scan(text)
iex> length(entities)
2

iex> text = "Email: user@example.com"
iex> entities = LlmGuard.Detectors.DataLeakage.PIIScanner.scan(text)
iex> hd(entities).type
:email

Summary

Functions

contains_pii?(text)
Quick check if text contains any PII.

scan(text)
Scans text for all types of PII.

scan_by_type(text, type)
Scans for a specific type of PII only.

Types

pii_entity()

@type pii_entity() :: %{
  type: pii_type(),
  value: String.t(),
  confidence: float(),
  start_pos: non_neg_integer(),
  end_pos: non_neg_integer()
}

pii_type()

@type pii_type() :: :email | :phone | :ssn | :credit_card | :ip_address | :url
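
Because every entity carries a :type and a :confidence, scan results are easy to summarize. The sketch below filters by confidence and counts entities per type. The module `EntitySummarySketch`, the 0.9 threshold, and the hand-written sample entities (including their confidence values) are all hypothetical; only the field names come from the types above.

```elixir
defmodule EntitySummarySketch do
  # Keeps entities at or above `min_confidence` and counts them per type.
  # `entities` is any list of maps with :type and :confidence keys.
  @spec summarize([map()], float()) :: %{optional(atom()) => non_neg_integer()}
  def summarize(entities, min_confidence \\ 0.9) do
    entities
    |> Enum.filter(&(&1.confidence >= min_confidence))
    |> Enum.frequencies_by(& &1.type)
  end
end

# Made-up entities standing in for real scanner output.
entities = [
  %{type: :email, value: "a@example.com", confidence: 0.95},
  %{type: :email, value: "b@example.com", confidence: 0.97},
  %{type: :phone, value: "555-1234", confidence: 0.60}
]

EntitySummarySketch.summarize(entities)
# => %{email: 2}
```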

Functions

contains_pii?(text)

@spec contains_pii?(String.t()) :: boolean()

Quick check if text contains any PII.

Returns a boolean without detailed entity information.

Examples

iex> PIIScanner.contains_pii?("Contact: user@example.com")
true

iex> PIIScanner.contains_pii?("Just normal text")
false

scan(text)

@spec scan(String.t()) :: [pii_entity()]

Scans text for all types of PII.

Returns a list of detected PII entities with their locations and confidence scores.

Parameters

  • text - Text to scan for PII

Returns

List of PII entities, each containing:

  • :type - Type of PII (:email, :phone, :ssn, etc.)
  • :value - The detected PII value
  • :confidence - Detection confidence (0.0-1.0)
  • :start_pos - Starting position in text
  • :end_pos - Ending position in text

Examples

iex> PIIScanner.scan("Email: test@example.com")
[%{type: :email, value: "test@example.com", confidence: 0.95, ...}]
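
One common use of the returned entities is redaction. The following is a minimal sketch, assuming entities shaped like the result above; `RedactSketch` and the `[EMAIL]`-style placeholders are hypothetical and not part of LlmGuard. It replaces by :value rather than by :start_pos/:end_pos, since this page does not say whether the positions are byte or character offsets or whether :end_pos is inclusive.

```elixir
defmodule RedactSketch do
  # Replaces every occurrence of each detected value with a placeholder
  # derived from its type, e.g. :email becomes "[EMAIL]".
  @spec redact(String.t(), [map()]) :: String.t()
  def redact(text, entities) do
    Enum.reduce(entities, text, fn %{type: type, value: value}, acc ->
      placeholder = "[" <> (type |> Atom.to_string() |> String.upcase()) <> "]"
      String.replace(acc, value, placeholder)
    end)
  end
end

# Hand-written entity standing in for scanner output; only the fields
# the sketch uses are included.
entities = [%{type: :email, value: "test@example.com"}]

RedactSketch.redact("Email: test@example.com", entities)
# => "Email: [EMAIL]"
```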

scan_by_type(text, type)

@spec scan_by_type(String.t(), pii_type()) :: [pii_entity()]

Scans for a specific type of PII only.

More efficient than a full scan when you only need to check for one type.

Examples

iex> PIIScanner.scan_by_type("Email: test@example.com", :email)
[%{type: :email, ...}]