Kanta.PoFiles.Services.StaleDetection (kanta v0.5.1)
Service for detecting stale translation messages and finding potential replacements.
A message is "stale" when it exists in the database but is missing from ALL locale PO files. This service identifies these stale messages system-wide and uses fuzzy matching (Jaro distance algorithm) to suggest active messages that could replace them.
Stale Detection Strategy
Uses a system-wide approach that treats messages globally rather than per-locale:
- Extracts all message keys from PO files across all locales
- Compares database messages against this global set of active keys
- Messages not found in ANY locale's PO files are marked as stale
- Fuzzy matching finds similar active messages within the same domain/context
This global approach ensures cross-locale consistency and simplifies migration when message keys change across the entire application.
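The strategy above amounts to a set difference between database messages and the union of keys found in every locale. The following is a minimal sketch of that idea, not Kanta's implementation: the `StaleSketch` module, the `{domain, context, msgid}` key shape, and the sample data are all assumptions made for illustration.

```elixir
defmodule StaleSketch do
  # Hypothetical helper. Keys are assumed to be {domain, context, msgid} tuples.
  def stale_ids(db_messages, po_keys_by_locale) do
    # Union the keys from every locale into one global set of active keys.
    active_keys =
      po_keys_by_locale
      |> Map.values()
      |> Enum.reduce(MapSet.new(), &MapSet.union/2)

    # A message is stale when its key is absent from the global set,
    # i.e. it appears in no locale's PO files at all.
    db_messages
    |> Enum.reject(fn msg -> MapSet.member?(active_keys, msg.key) end)
    |> MapSet.new(fn msg -> msg.id end)
  end
end

# Stand-in for keys extracted from PO files, grouped by locale.
po_keys_by_locale = %{
  "en" => MapSet.new([{"default", nil, "Hello"}, {"default", nil, "Goodbye"}]),
  "fr" => MapSet.new([{"default", nil, "Hello"}])
}

# Stand-in for messages stored in the database.
db_messages = [
  %{id: 1, key: {"default", nil, "Hello"}},
  %{id: 2, key: {"default", nil, "Goodbye"}},
  %{id: 3, key: {"default", nil, "Helo"}}
]

stale = StaleSketch.stale_ids(db_messages, po_keys_by_locale)
```

Message 2 is present only in the "en" locale, yet it is not stale, because a single locale is enough to keep a key active. Only message 3, which appears in no locale, ends up in the result.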
Fuzzy Matching
When stale messages are detected, the service automatically searches for similar active messages, scoring each candidate with String.jaro_distance/2 (0.0 = no similarity, 1.0 = identical). Candidates are restricted to the stale message's own domain and context so that suggestions stay relevant, and a candidate counts as a viable replacement only when its score clears the threshold (default: 0.8).
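As a rough illustration of scoped fuzzy matching, the sketch below filters candidates by domain and context, scores them with `String.jaro_distance/2`, and keeps the best one that reaches the threshold. The `FuzzySketch` module, the map shapes, and the inclusive `>=` comparison are assumptions. The service's actual selection logic may differ.

```elixir
defmodule FuzzySketch do
  # Hypothetical helper, not Kanta's implementation.
  # Returns {message, score} for the best candidate, or nil when none qualifies.
  def best_match(stale, active_messages, threshold \\ 0.8) do
    active_messages
    # Only compare against messages in the same domain and context.
    |> Enum.filter(fn msg ->
      msg.domain == stale.domain and msg.context == stale.context
    end)
    # Score every remaining candidate against the stale msgid.
    |> Enum.map(fn msg -> {msg, String.jaro_distance(stale.msgid, msg.msgid)} end)
    # Assumption: the threshold is treated as inclusive here.
    |> Enum.filter(fn {_msg, score} -> score >= threshold end)
    |> case do
      [] -> nil
      scored -> Enum.max_by(scored, fn {_msg, score} -> score end)
    end
  end
end

stale = %{id: 3, domain: "default", context: nil, msgid: "Sign in to your account"}

active = [
  %{id: 10, domain: "default", context: nil, msgid: "Sign in to your account."},
  %{id: 11, domain: "errors", context: nil, msgid: "Sign in to your account"},
  %{id: 12, domain: "default", context: nil, msgid: "Delete"}
]

best = FuzzySketch.best_match(stale, active)
```

Message 11 has an identical msgid but lives in a different domain, so it is never scored. Message 10 wins because it shares the domain and context and differs by a single trailing character.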
Usage
# Detect stale messages with default settings
StaleDetection.call()
# Use custom PO file path and threshold
StaleDetection.call(base_path: "/path/to/gettext", fuzzy_threshold: 0.9)
Returns
Returns {:ok, %Result{}} containing:
- stale_message_ids - MapSet of stale message IDs
- fuzzy_matches_map - Map of stale_message_id => %FuzzyMatch{}
- stale_count - Total number of stale messages
- mergeable_count - Number of stale messages with fuzzy matches above threshold
Each FuzzyMatch struct contains:
- stale_message_id - ID of the stale message
- matched_message_id - ID of the active message that matches
- matched_msgid - The msgid string of the matched message
- similarity_score - Jaro distance score (0.0-1.0)
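One way a caller might consume these fields is to separate stale messages that have a suggested replacement from those that do not. The sketch below uses plain maps as stand-ins for %Result{} and %FuzzyMatch{} so that it runs on its own. The field names follow the lists above, while the IDs and the score are illustrative values, not real output.

```elixir
# Plain maps standing in for %Result{} and %FuzzyMatch{} (illustrative data only).
result = %{
  stale_message_ids: MapSet.new([3, 4]),
  fuzzy_matches_map: %{
    3 => %{
      stale_message_id: 3,
      matched_message_id: 1,
      matched_msgid: "Hello",
      similarity_score: 0.93
    }
  },
  stale_count: 2,
  mergeable_count: 1
}

# Split stale IDs into those with a fuzzy match and those without one.
{mergeable, unmatched} =
  result.stale_message_ids
  |> Enum.sort()
  |> Enum.split_with(fn id -> Map.has_key?(result.fuzzy_matches_map, id) end)

# Pair each mergeable stale ID with the active message suggested for it.
suggestions =
  Enum.map(mergeable, fn id ->
    match = Map.fetch!(result.fuzzy_matches_map, id)
    {id, match.matched_message_id}
  end)
```

Stale messages in `unmatched` have no candidate above the threshold, so they would need manual review or removal rather than a merge.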
Functions
Identifies stale translation messages system-wide.
A message is considered "stale" if it doesn't exist in ANY locale's PO files.
Options
- :base_path - Base directory to search for PO files (optional)
- :fuzzy_threshold - Similarity threshold 0.0-1.0 (default: 0.8)
Returns
{:ok, %Result{}} - A struct containing:
- :stale_message_ids - MapSet of message IDs that are stale
- :fuzzy_matches_map - Map of stale_message_id => %FuzzyMatch{}
- :stale_count - Total number of stale messages
- :mergeable_count - Number of stale messages with fuzzy matches above threshold