Puid.Chars (puid v2.7.0)
Pre-defined character sets for use when creating Puid modules.
Example
iex> defmodule(AlphanumId, do: use(Puid, chars: :alphanum))
Pre-defined Chars
:alpha
Upper/lower case alphabet
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
bits per character: 5.7
:alpha_lower
Lower case alphabet
abcdefghijklmnopqrstuvwxyz
bits per character: 4.7
:alpha_upper
Upper case alphabet
ABCDEFGHIJKLMNOPQRSTUVWXYZ
bits per character: 4.7
:alphanum
Upper/lower case alphabet and numbers
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
bits per character: 5.95
:alphanum_lower
Lower case alphabet and numbers
abcdefghijklmnopqrstuvwxyz0123456789
bits per character: 5.17
:alphanum_upper
Upper case alphabet and numbers
ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
bits per character: 5.17
:base16
RFC 4648 base16 character set
0123456789ABCDEF
bits per character: 4
:base32
RFC 4648 base32 character set
ABCDEFGHIJKLMNOPQRSTUVWXYZ234567
bits per character: 5
:base32_hex
RFC 4648 base32 extended hex character set with lowercase letters
0123456789abcdefghijklmnopqrstuv
bits per character: 5
:base32_hex_upper
RFC 4648 base32 extended hex character set
0123456789ABCDEFGHIJKLMNOPQRSTUV
bits per character: 5
:base36
Case-insensitive alphanumeric (lowercase)
0123456789abcdefghijklmnopqrstuvwxyz
bits per character: 5.17
:base36_upper
Case-insensitive alphanumeric (uppercase)
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
bits per character: 5.17
:base45
QR code alphanumeric mode (ISO/IEC 18004:2015)
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:
bits per character: 5.49
:base58
Bitcoin Base58 alphabet (no 0, O, I, l)
123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz
bits per character: 5.86
:base62
Alphanumeric characters (alias for :alphanum)
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
bits per character: 5.95
:base85
ASCII85/Ascii85 encoding (Adobe, btoa)
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu
bits per character: 6.41
:bech32
Bitcoin SegWit address encoding (no 1, b, i, o)
023456789acdefghjklmnpqrstuvwxyz
bits per character: 5
:boolean
Boolean/binary representation
TF
bits per character: 1
:crockford32
0123456789ABCDEFGHJKMNPQRSTVWXYZ
bits per character: 5
:decimal
Decimal digits
0123456789
bits per character: 3.32
:dna
DNA nucleotide bases
ACGT
bits per character: 2
:geohash
Geohash encoding alphabet (base32 variant excluding 'a', 'i', 'l', 'o')
0123456789bcdefghjkmnpqrstuvwxyz
bits per character: 5
:hex
Lowercase hexadecimal
0123456789abcdef
bits per character: 4
:hex_upper
Uppercase hexadecimal
0123456789ABCDEF
bits per character: 4
:safe_ascii
ASCII characters from ?! to ?~, minus backslash, backtick, single-quote and double-quote
`!#$%&()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_abcdefghijklmnopqrstuvwxyz{|}~`
bits per character: 6.49
:safe32
Strings that don't look like English words and are easier to parse visually
2346789bdfghjmnpqrtBDFGHJLMNPQRT
 - remove all upper and lower case vowels (including y)
 - remove all numbers that look like letters
 - remove all letters that look like numbers
 - remove all letters that have poor distinction between upper and lower case values
bits per character: 5
:safe64
RFC 4648 file system and URL safe character set
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_
bits per character: 6
:symbol
:safe_ascii characters not in :alphanum
`!#$%&()*+,-./:;<=>?@[]^_{|}~`
bits per character: 4.81
:url_safe
RFC 3986 unreserved characters (URL safe without percent-encoding)
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~
bits per character: 6.04
:word_safe32
Strings that don't look like English words
23456789CFGHJMPQRVWXcfghjmpqrvwx
Origin unknown
bits per character: 5
:z_base32
Zooko's human-oriented base32 (easier to read/transcribe)
ybndrfg8ejkmcpqxot1uwisza345h769
bits per character: 5
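The bits-per-character figures listed for each set are consistent with the base-2 logarithm of the number of characters in the set. The sketch below recomputes a few of them; `BitsPerChar` is a hypothetical helper written for illustration and is not part of Puid.

```elixir
defmodule BitsPerChar do
  # Hypothetical helper (not part of Puid): entropy bits carried by one
  # character drawn uniformly from `chars`, i.e. log2 of the set size.
  def bits(chars) when is_binary(chars), do: bits(String.to_charlist(chars))
  def bits(chars) when is_list(chars), do: :math.log2(length(chars))
end

# 4 characters (:dna)
IO.inspect(BitsPerChar.bits("ACGT"))
# => 2.0

# 16 characters (:hex)
IO.inspect(BitsPerChar.bits("0123456789abcdef"))
# => 4.0

# 10 characters (:decimal), rounded as in the listing
IO.inspect(Float.round(BitsPerChar.bits("0123456789"), 2))
# => 3.32
```

Sets whose size is a power of 2 give a whole number of bits; all others give a fractional value.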
Summary
Types
Chars can be designated by a pre-defined atom, a binary or a charlist
Character encoding scheme. :ascii encoding uses cross-product character pairs.
Functions
charlist/1: charlist for a pre-defined Puid.Chars, a String.t() or a charlist.
charlist!/1: Same as charlist/1, but returns the charlist directly or raises a Puid.Error.
metrics/1: Calculate entropy metrics for a character set.
predefined/0: List of predefined charsets discovered from compiled module.
Types
Functions
@spec charlist(puid_chars()) :: {:ok, charlist()} | {:error, String.t()}
charlist for a pre-defined Puid.Chars, a String.t() or a charlist.
The characters for either String.t() or charlist types must be unique, number more than one, and include no invalid ASCII characters.
Example
iex> Puid.Chars.charlist(:safe32)
{:ok, ~c"2346789bdfghjmnpqrtBDFGHJLMNPQRT"}
iex> Puid.Chars.charlist("dingosky")
{:ok, ~c"dingosky"}
iex> Puid.Chars.charlist("unique")
{:error, "Characters not unique"}
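To make the three rules concrete, here is a minimal sketch of a validator that returns the same tagged-tuple shape. It is not Puid's implementation: the module name `CharsCheck` is hypothetical, only the "Characters not unique" message comes from the example above, and "valid" is assumed to mean printable ASCII (code points 32 through 126), which may differ from what Puid actually accepts.

```elixir
defmodule CharsCheck do
  # Hypothetical validator, NOT Puid's code. It mirrors the documented rules:
  # more than one character, all unique, no invalid characters.
  # Assumption: "valid" means printable ASCII, code points 32..126.
  def validate(chars) when is_binary(chars), do: validate(String.to_charlist(chars))

  def validate(chars) when is_list(chars) do
    cond do
      # Message text below is made up for this sketch.
      length(chars) < 2 -> {:error, "Need at least 2 characters"}
      # Message text taken from the charlist/1 example.
      length(Enum.uniq(chars)) != length(chars) -> {:error, "Characters not unique"}
      # Message text below is made up for this sketch.
      not Enum.all?(chars, &(&1 in 32..126)) -> {:error, "Invalid character"}
      true -> {:ok, chars}
    end
  end
end

# Callers can branch on the tagged tuple, as they would with charlist/1.
case CharsCheck.validate("unique") do
  {:ok, chars} -> IO.inspect(chars)
  {:error, reason} -> IO.puts("rejected: " <> reason)
end
```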
@spec charlist!(puid_chars()) :: charlist()
Same as charlist/1, but returns the charlist directly or raises a Puid.Error.
Example
iex> Puid.Chars.charlist!(:safe32)
~c"2346789bdfghjmnpqrtBDFGHJLMNPQRT"
iex> Puid.Chars.charlist!("dingosky")
~c"dingosky"
Raises Puid.Error if the characters are not unique, are too few, or include invalid characters.
@spec metrics(puid_chars()) :: %{
  avg_bits: float(),
  bit_shifts: [{non_neg_integer(), pos_integer()}, ...],
  ere: float(),
  ete: float()
}
Calculate entropy metrics for a character set.
Return Value
Returns a map with the following keys:
:avg_bits - Average bits consumed per character
:bit_shifts - Bit shift rules used for character generation
:ere - Entropy representation efficiency (0 < ERE ≤ 1.0), measures how efficiently the characters represent entropy in their string form
:ete - Entropy transform efficiency (0 < ETE ≤ 1.0), measures how efficiently random bits are transformed into characters during generation
Examples
iex> Puid.Chars.metrics(:safe64)
%{
  avg_bits: 6.0,
  bit_shifts: [{63, 6}],
  ere: 0.75,
  ete: 1.0
}
iex> Puid.Chars.metrics(:alpha)
%{
  avg_bits: 6.769230769230769,
  bit_shifts: [{51, 6}, {55, 4}, {63, 3}],
  ere: 0.7125549647676365,
  ete: 0.8421104129072068
}
Details
ERE: Entropy representation efficiency (0 < ERE ≤ 1.0), measures how efficiently ID characters represent entropy in their string form. In the examples above it equals the bits per character divided by 8, the number of bits an ASCII character occupies in a string (6 / 8 = 0.75 for :safe64).
ETE: Entropy transform efficiency (0 < ETE ≤ 1.0). Character sets with a power-of-2 number of characters have ETE = 1.0 since bit slicing always creates a proper index into the characters list. Other character sets discard some bits due to bit slicing that creates an out-of-bounds index. Puid uses an algorithm which minimizes the number of bits discarded.
avg_bits: Theoretical average bits consumed per character.
bit_shifts: Bit shift values used to determine how many bits are discarded during bit slicing.
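The numbers in the metrics examples can be re-derived from the bit shift rules alone. The sketch below is an illustration, not Puid's code, and rests on two assumptions inferred from the example output: each `{max, bits}` rule means "a sliced value up to `max` (and above the previous rule's `max`) consumes `bits` bits", and ERE is the bits per character divided by the 8 bits an ASCII character occupies. `MetricsSketch` is a hypothetical name.

```elixir
defmodule MetricsSketch do
  # Hypothetical re-derivation, NOT Puid's implementation.
  # n_chars:    number of characters in the set
  # bit_shifts: list of {max_value, bits_consumed} rules, in ascending order
  def metrics(n_chars, bit_shifts) do
    # Sum bits consumed over every possible sliced value: each rule covers
    # the values from the previous rule's max + 1 up to its own max.
    {total_bits, _last_max} =
      Enum.reduce(bit_shifts, {0, -1}, fn {max, bits}, {sum, prev_max} ->
        {sum + (max - prev_max) * bits, max}
      end)

    bits_per_char = :math.log2(n_chars)

    # Only n_chars of the sliced values yield a character, so the expected
    # bits consumed per generated character is total_bits / n_chars.
    avg_bits = total_bits / n_chars

    %{
      avg_bits: avg_bits,
      # Assumption: one ASCII character occupies 8 bits in the string.
      ere: bits_per_char / 8,
      ete: bits_per_char / avg_bits
    }
  end
end

# :alpha has 52 characters; rules copied from the metrics example.
IO.inspect(MetricsSketch.metrics(52, [{51, 6}, {55, 4}, {63, 3}]))

# :safe64 has 64 characters, a power of 2, so no bits are discarded.
IO.inspect(MetricsSketch.metrics(64, [{63, 6}]))
```

For :alpha this works out to (52 × 6 + 4 × 4 + 8 × 3) / 52 = 352 / 52 ≈ 6.77 bits consumed per character, against log2(52) ≈ 5.70 bits of entropy carried, which gives the ETE of about 0.84 shown in the example.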
@spec predefined() :: [atom()]
List of predefined charsets discovered from compiled module.