Anvil.PII.Pseudonym (Anvil v0.1.1)

Labeler pseudonymization for privacy-preserving exports.

This module generates stable pseudonyms for labelers that:

  • Are consistent within a tenant (same labeler always gets same pseudonym)
  • Are unlinkable across tenants (different pseudonym per tenant)
  • Cannot be reversed to recover the original external_id
  • Are suitable for publication in research datasets

Security Properties

  • Uses HMAC-SHA256, a keyed hash, so pseudonyms cannot be recomputed without the secret
  • Requires a secret key configured at the application level
  • Includes tenant_id in the hashed input to prevent cross-tenant linking
  • Truncates the hex-encoded hash to 16 characters (64 bits) for readability
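
A minimal sketch of how these properties can fit together, not Anvil's actual implementation. The module name PseudonymSketch, the explicit secret argument, and the "tenant_id:external_id" message layout are assumptions made for illustration; the real module reads its secret from application configuration.

```elixir
defmodule PseudonymSketch do
  # Hypothetical module: illustrates the listed properties only.
  @prefix "labeler_"
  @hex_length 16

  # The secret is passed explicitly here (assumption) to keep the sketch
  # self-contained.
  def generate(secret, external_id, tenant_id \\ "default")
      when is_binary(secret) and is_binary(external_id) and is_binary(tenant_id) do
    # Assumed message layout: tenant first, so the same external_id
    # hashes differently under each tenant.
    message = tenant_id <> ":" <> external_id

    hex =
      :crypto.mac(:hmac, :sha256, secret, message)
      |> Base.encode16(case: :lower)
      |> binary_part(0, @hex_length)

    @prefix <> hex
  end
end
```

With a fixed secret, repeated calls for the same pair return the same string, while changing only the tenant yields a different one. A plain ":" separator is ambiguous if identifiers may themselves contain ":"; a length-prefixed encoding would avoid that.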

Examples

iex> Anvil.PII.Pseudonym.generate("user123", "tenant456")
"labeler_a1b2c3d4e5f6a7b8"

iex> Anvil.PII.Pseudonym.generate("user123", "tenant456")
"labeler_a1b2c3d4e5f6a7b8"  # Same result (stable)

iex> Anvil.PII.Pseudonym.generate("user123", "tenant789")
"labeler_9f8e7d6c5b4a3f2e"  # Different result (tenant-specific)

Summary

Functions

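ensure_pseudonym(labeler)

Returns the pseudonym for a labeler, generating one if not present.

export_identifier(labeler)

Returns the labeler identifier to use in exports.

generate(external_id, tenant_id \\ "default")

Generates a stable pseudonym for a labeler.

rotate_secret(new_secret)

Rotates the pseudonym secret and regenerates all pseudonyms.

valid_format?(pseudonym)

Validates that a string is a valid pseudonym format.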

Functions

ensure_pseudonym(labeler)

@spec ensure_pseudonym(Anvil.Schema.Labeler.t()) ::
  {:ok, String.t()} | {:error, term()}

Returns the pseudonym for a labeler, generating one if not present.

This is the main entry point for ensuring labelers have pseudonyms. It will update the labeler record if a pseudonym needs to be generated.

Parameters

  • labeler - The Anvil.Schema.Labeler.t() struct whose pseudonym is returned or generated

Returns

{:ok, pseudonym} or {:error, reason}

Examples

iex> labeler = %Labeler{external_id: "user123", tenant_id: "tenant1", pseudonym: nil}
iex> Anvil.PII.Pseudonym.ensure_pseudonym(labeler)
{:ok, "labeler_a1b2c3d4e5f6a7b8"}
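
The return-or-generate behaviour described above can be sketched as follows. This is an illustration under assumptions, not the library's code: EnsureSketch, its nested Labeler struct, and the injected generate function are hypothetical, and the database update is replaced by returning the updated struct.

```elixir
defmodule EnsureSketch do
  defmodule Labeler do
    # Hypothetical stand-in for Anvil.Schema.Labeler.
    defstruct [:external_id, :tenant_id, :pseudonym]
  end

  # A pseudonym is already present: return it and leave the record alone.
  def ensure(%Labeler{pseudonym: pseudonym} = labeler, _generate)
      when is_binary(pseudonym) do
    {:ok, pseudonym, labeler}
  end

  # No pseudonym yet: generate one and return the updated record.
  # A real implementation would persist the change at this point.
  def ensure(%Labeler{pseudonym: nil} = labeler, generate)
      when is_function(generate, 2) do
    pseudonym = generate.(labeler.external_id, labeler.tenant_id)
    {:ok, pseudonym, %Labeler{labeler | pseudonym: pseudonym}}
  end
end
```

Passing the generator as an argument keeps the sketch independent of any secret or configuration; calling ensure/2 twice on the result returns the same pseudonym both times.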

export_identifier(labeler)

@spec export_identifier(Anvil.Schema.Labeler.t()) ::
  {:ok, String.t()} | {:error, term()}

Returns the labeler identifier to use in exports.

Always returns the pseudonym, never the external_id or internal UUID. Generates a pseudonym if one doesn't exist.

Examples

iex> labeler = %Labeler{pseudonym: "labeler_abc123"}
iex> Anvil.PII.Pseudonym.export_identifier(labeler)
{:ok, "labeler_abc123"}

generate(external_id, tenant_id \\ "default")

@spec generate(String.t(), String.t() | nil) :: String.t()

Generates a stable pseudonym for a labeler.

Parameters

  • external_id - The labeler's external identifier (e.g., OIDC sub claim)
  • tenant_id - The tenant ID (optional, defaults to "default")

Returns

A pseudonym string in the format "labeler_XXXXXXXXXXXXXXXX" where X is a hex digit.

Examples

iex> Anvil.PII.Pseudonym.generate("user@example.com", "acme-corp")
"labeler_7a3b9f2c1e4d8a6b"

rotate_secret(new_secret)

@spec rotate_secret(String.t()) :: {:ok, non_neg_integer()} | {:error, term()}

Rotates the pseudonym secret and regenerates all pseudonyms.

WARNING: This breaks the linkage with previous exports. Only use when required for security purposes (e.g., secret compromise).

This function is not run automatically; call it manually. It updates every labeler record.

Parameters

  • new_secret - The new secret to use for pseudonym generation

Returns

{:ok, count} where count is the number of labelers updated, or {:error, reason}
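
To illustrate why rotation breaks linkage with earlier exports, here is a hedged in-memory sketch. RotationSketch, the plain-map labelers, and the "tenant_id:external_id" message layout are assumptions for illustration; the real function operates on persisted labeler records and its internals are not shown in this documentation.

```elixir
defmodule RotationSketch do
  # Hypothetical: labelers are plain maps with :external_id, :tenant_id
  # and :pseudonym keys instead of database records.
  def rotate(labelers, new_secret)
      when is_list(labelers) and is_binary(new_secret) do
    updated =
      Enum.map(labelers, fn labeler ->
        new_pseudonym = pseudonym(new_secret, labeler.external_id, labeler.tenant_id)
        %{labeler | pseudonym: new_pseudonym}
      end)

    # Mirrors the documented {:ok, count} shape, plus the updated list.
    {:ok, length(updated), updated}
  end

  # Assumed derivation: HMAC-SHA256 over "tenant_id:external_id",
  # hex-encoded and truncated to 16 characters.
  defp pseudonym(secret, external_id, tenant_id) do
    hex =
      :crypto.mac(:hmac, :sha256, secret, tenant_id <> ":" <> external_id)
      |> Base.encode16(case: :lower)
      |> binary_part(0, 16)

    "labeler_" <> hex
  end
end
```

Because every pseudonym depends on the secret, the values produced after rotation share nothing with the old ones, which is why datasets exported before and after a rotation can no longer be joined on the labeler identifier.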

valid_format?(pseudonym)

@spec valid_format?(String.t()) :: boolean()

Validates that a string is a valid pseudonym format.

Examples

iex> Anvil.PII.Pseudonym.valid_format?("labeler_a1b2c3d4e5f6a7b8")
true

iex> Anvil.PII.Pseudonym.valid_format?("invalid")
false

iex> Anvil.PII.Pseudonym.valid_format?("labeler_short")
false
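
A format check of this kind might look like the sketch below. It assumes the format stated for generate/2 ("labeler_" followed by 16 hex digits) and additionally assumes lowercase hex; FormatSketch is a hypothetical module, not the library's implementation.

```elixir
defmodule FormatSketch do
  # Hypothetical validator: "labeler_" followed by exactly 16 lowercase
  # hex digits. \A and \z anchor the whole string, so a trailing newline
  # is rejected as well.
  def valid_format?(value) when is_binary(value) do
    Regex.match?(~r/\Alabeler_[0-9a-f]{16}\z/, value)
  end

  # Anything that is not a binary is never a valid pseudonym.
  def valid_format?(_other), do: false
end
```

Under these assumptions, "labeler_7a3b9f2c1e4d8a6b" passes, while "invalid" and "labeler_short" are rejected.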