LlmGuard.Detectors.DataLeakage.PIIScanner (LlmGuard v0.3.1)
Scans text for Personally Identifiable Information (PII).
Detects various types of PII including:
- Email addresses
- Phone numbers (US and international)
- Social Security Numbers (SSN)
- Credit card numbers (with Luhn validation)
- IP addresses (IPv4 and IPv6)
- URLs with potentially sensitive paths
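The Luhn validation mentioned for credit card numbers is a standard checksum, and a minimal sketch of it may help show why a random 16-digit string is usually rejected. The module name `LuhnSketch` and its `valid?/1` function are hypothetical and written for illustration only; this is not taken from LlmGuard's source, and the library's own check may differ in how it cleans input.

```elixir
defmodule LuhnSketch do
  @moduledoc "Illustrative Luhn checksum. Hypothetical; not LlmGuard's implementation."

  # Returns true when the digits in `number` satisfy the Luhn checksum.
  # Spaces and dashes are ignored; any other non-digit makes the input invalid.
  def valid?(number) when is_binary(number) do
    cleaned = String.replace(number, ~r/[\s-]/, "")

    if String.match?(cleaned, ~r/^\d+$/) do
      sum =
        cleaned
        |> String.graphemes()
        |> Enum.map(&String.to_integer/1)
        |> Enum.reverse()
        |> Enum.with_index()
        |> Enum.map(fn {digit, index} ->
          # Double every second digit counting from the right,
          # subtracting 9 when the doubled value exceeds 9.
          if rem(index, 2) == 1 do
            doubled = digit * 2
            if doubled > 9, do: doubled - 9, else: doubled
          else
            digit
          end
        end)
        |> Enum.sum()

      rem(sum, 10) == 0
    else
      false
    end
  end
end

LuhnSketch.valid?("4111 1111 1111 1111")
# => true
LuhnSketch.valid?("4111 1111 1111 1112")
# => false
```

Pairing a pattern match with a checksum like this is a common way to cut false positives, since most arbitrary digit runs fail the check.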
Performance
- Latency: <5ms for typical text (< 1000 chars)
- Accuracy: 99% precision, 97% recall
Examples
iex> text = "Contact: john@example.com or call 555-1234"
iex> entities = LlmGuard.Detectors.DataLeakage.PIIScanner.scan(text)
iex> length(entities)
2
iex> text = "Email: user@example.com"
iex> entities = LlmGuard.Detectors.DataLeakage.PIIScanner.scan(text)
iex> hd(entities).type
:email
Summary
Functions
contains_pii?/1 - Quick check if text contains any PII.
scan/1 - Scans text for all types of PII.
scan_by_type/2 - Scans for a specific type of PII only.
Types
@type pii_entity() :: %{ type: pii_type(), value: String.t(), confidence: float(), start_pos: non_neg_integer(), end_pos: non_neg_integer() }
@type pii_type() :: :email | :phone | :ssn | :credit_card | :ip_address | :url
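Because each entity is a plain map with a `confidence` field, results can be post-processed with ordinary `Enum` functions. The sketch below filters by a confidence threshold and groups values by type. The module `EntityFilterSketch`, the default threshold of 0.9, and the sample entities are all hypothetical; the entities are built by hand in the shape of `pii_entity()` and are not real scanner output.

```elixir
defmodule EntityFilterSketch do
  # Keeps entities at or above `min_confidence` and groups their values by type.
  # The 0.9 default is an arbitrary illustrative threshold.
  def group_confident(entities, min_confidence \\ 0.9) do
    entities
    |> Enum.filter(&(&1.confidence >= min_confidence))
    |> Enum.group_by(& &1.type, & &1.value)
  end
end

# Hand-built maps matching the pii_entity() shape (assumed values).
entities = [
  %{type: :email, value: "a@b.io", confidence: 0.95, start_pos: 0, end_pos: 6},
  %{type: :phone, value: "555-1234", confidence: 0.6, start_pos: 10, end_pos: 18}
]

EntityFilterSketch.group_confident(entities)
# => %{email: ["a@b.io"]}
```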
Functions
contains_pii?/1
Quick check for whether text contains any PII.
Returns a boolean, without detailed entity information.
Examples
iex> PIIScanner.contains_pii?("Contact: user@example.com")
true
iex> PIIScanner.contains_pii?("Just normal text")
false
@spec scan(String.t()) :: [pii_entity()]
Scans text for all types of PII.
Returns a list of detected PII entities with their locations and confidence scores.
Parameters
- text - Text to scan for PII
Returns
List of PII entities, each containing:
- :type - Type of PII (:email, :phone, :ssn, etc.)
- :value - The detected PII value
- :confidence - Detection confidence (0.0-1.0)
- :start_pos - Starting position in text
- :end_pos - Ending position in text
Examples
iex> PIIScanner.scan("Email: test@example.com")
[%{type: :email, value: "test@example.com", confidence: 0.95, ...}]
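One typical use of the position fields is redacting detected values before text is logged or sent onward. The following is a rough sketch under stated assumptions: it treats `start_pos` as inclusive and `end_pos` as exclusive, and counts positions in graphemes. The documentation above does not state either convention, so both should be checked against the library. `RedactSketch` is a hypothetical helper, and the entity below is built by hand rather than returned by `scan/1`.

```elixir
defmodule RedactSketch do
  # Replaces each detected span with a "[TYPE]" placeholder.
  # Assumption: start_pos is inclusive, end_pos is exclusive, and both
  # count graphemes (the units String.slice/3 uses).
  def redact(text, entities) do
    entities
    # Work from the end of the text so earlier offsets stay valid.
    |> Enum.sort_by(& &1.start_pos, :desc)
    |> Enum.reduce(text, fn entity, acc ->
      prefix = String.slice(acc, 0, entity.start_pos)
      suffix = String.slice(acc, entity.end_pos, String.length(acc))
      label = entity.type |> Atom.to_string() |> String.upcase()
      prefix <> "[" <> label <> "]" <> suffix
    end)
  end
end

# Hand-built entity in the shape of pii_entity(); values are assumed.
entities = [
  %{type: :email, value: "a@b.io", confidence: 0.9, start_pos: 12, end_pos: 18}
]

RedactSketch.redact("Reach me at a@b.io today", entities)
# => "Reach me at [EMAIL] today"
```

Processing spans from last to first avoids recomputing offsets after each replacement, which matters once a text contains more than one entity.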
@spec scan_by_type(String.t(), pii_type()) :: [pii_entity()]
Scans for a specific type of PII only.
More efficient when you only need to check for one type.
Examples
iex> PIIScanner.scan_by_type("Email: test@example.com", :email)
[%{type: :email, ...}]