Puid (puid v2.3.2)
Simple, fast, flexible and efficient generation of probably unique identifiers (puid, aka random strings) of intuitively specified entropy using pre-defined or custom characters.
Overview
Puid provides fast and efficient generation of random IDs. For the purposes of Puid, a random ID is considered a random string used in a context of uniqueness, that is, random IDs are a bunch of random strings that are hopefully unique.
Random string generation can be thought of as a transformation of some random source of entropy into a string representation of randomness. A general purpose random string library used for random IDs should therefore provide user specification for each of the following three key aspects:
Entropy source
What source of randomness is being transformed? Puid allows easy specification of the function used for source randomness.
ID characters
What characters are used in the ID? Puid provides 16 pre-defined character sets and also allows custom characters, including Unicode.
ID randomness
What is the resulting “randomness” of the IDs? Note this isn't necessarily the same as the randomness of the entropy source. Puid allows explicit specification of ID randomness in an intuitive manner.
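The transformation described above can be sketched with the standard library alone. This is a minimal illustration, not Puid's implementation: the module name RandomIdSketch and its generate/2 function are hypothetical, and the sketch is restricted to a 64-character set so that each character carries exactly 6 bits.

```elixir
defmodule RandomIdSketch do
  # Hypothetical illustration, not part of Puid.
  @chars "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"

  # rand_bytes: entropy source of the form (non_neg_integer) -> binary
  # bits: minimum ID randomness, in bits
  def generate(rand_bytes \\ &:crypto.strong_rand_bytes/1, bits \\ 128) do
    # 64 characters carry 6 bits each, so round up to whole characters.
    len = ceil(bits / 6)
    n_bits = len * 6
    # Request enough whole bytes to cover n_bits.
    n_bytes = ceil(n_bits / 8)
    <<used::bitstring-size(n_bits), _rest::bitstring>> = rand_bytes.(n_bytes)

    # Map each 6-bit slice of the entropy to one character.
    for <<index::size(6) <- used>>, into: "" do
      <<:binary.at(@chars, index)>>
    end
  end
end

# Default source and 128 bits: a 22-character ID (132 bits of entropy).
IO.puts(RandomIdSketch.generate())

# A fixed, non-random source makes the mapping visible.
IO.puts(RandomIdSketch.generate(fn n -> :binary.copy(<<0>>, n) end, 12))
```

All three aspects appear as explicit inputs here: the entropy source is a function argument, the characters are the module attribute, and the ID randomness is the bits argument.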
Examples
Creating a random ID generator using Puid is as simple as:
iex> defmodule(RandId, do: use(Puid))
iex> RandId.generate()
"8nGA2UaIfaawX-Og61go5A"
Options allow easy and complete control of ID generation.
Entropy Source
Puid uses :crypto.strong_rand_bytes/1 as the default entropy source. The rand_bytes option can be used to specify any function of the form (non_neg_integer) -> binary as the source:
iex> defmodule(PrngPuid, do: use(Puid, rand_bytes: &:rand.bytes/1))
iex> PrngPuid.generate()
"bIkrSeU6Yr8_1WHGvO0H3M"
ID Characters
By default, Puid uses the RFC 4648 file system & URL safe characters. The chars option can be used to specify any of 16 pre-defined character sets or custom characters, including Unicode:
iex> defmodule(HexPuid, do: use(Puid, chars: :hex))
iex> HexPuid.generate()
"13fb81e35cb89e5daa5649802ad4bbbd"
iex> defmodule(DingoskyPuid, do: use(Puid, chars: "dingosky"))
iex> DingoskyPuid.generate()
"yiidgidnygkgydkodggysonydodndsnkgksgonisnko"
iex> defmodule(DingoskyUnicodePuid, do: use(Puid, chars: "dîñgø$kyDÎÑGØßK¥", total: 2.5e6, risk: 1.0e15))
iex> DingoskyUnicodePuid.generate()
"øßK$ggKñø$dyGîñdyØøØÎîk"
ID Randomness
Generated IDs have 128-bit entropy by default. Puid provides a simple, intuitive way to specify ID randomness by declaring a total number of possible IDs with a specified risk of a repeat in that many IDs:
To generate up to 10 million random IDs with 1 in a quadrillion chance of repeat:
iex> defmodule(MyPuid, do: use(Puid, total: 10.0e6, risk: 1.0e15))
iex> MyPuid.generate()
"T0bFZadxBYVKs5lA"
The bits option can be used to directly specify an amount of ID randomness:
iex> defmodule(Token, do: use(Puid, bits: 256, chars: :hex_upper))
iex> Token.generate()
"6E908C2A1AA7BF101E7041338D43B87266AFA73734F423B6C3C3A17599F40F2A"
Module API
Module functions:
- generate/0: Generate a random puid
- total/1: total puids which can be generated at a specified risk
- risk/1: risk of generating total puids
- encode/1: Encode bytes into a puid
- decode/1: Decode a puid into bytes
- info/0: Module information
The total/1 and risk/1 functions provide approximations to the risk of a repeat in some total number of generated puids. The mathematical approximations used purposely overestimate risk and underestimate total.
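The relationship between entropy bits, total and risk can be sketched with the plain birthday approximation, in which the probability of a repeat among total IDs is roughly total^2 / 2^(bits + 1). The module RepeatRiskSketch and its functions are hypothetical and are not Puid's implementation; because Puid's own approximations are deliberately conservative, its total/1 and risk/1 results may differ from what this sketch returns.

```elixir
defmodule RepeatRiskSketch do
  # Hypothetical helper, not part of Puid. Plain birthday approximation:
  # P(repeat) is about total^2 / 2^(bits + 1), with P written as 1 / risk.

  # Bits of entropy needed for `total` IDs at a 1-in-`risk` chance of a repeat.
  def bits(total, risk), do: 2 * :math.log2(total) + :math.log2(risk) - 1

  # Approximate number of IDs that can be generated at a 1-in-`risk` chance.
  def total_at(bits, risk), do: :math.sqrt(:math.pow(2, bits + 1) / risk)

  # Approximate "1 in N" risk of a repeat after generating `total` IDs.
  def risk_at(bits, total), do: :math.pow(2, bits + 1) / (total * total)
end

# total: 10.0e6 and risk: 1.0e15 need roughly 95.3 bits, which rounds up
# to 16 characters when each character carries 6 bits.
bits = RepeatRiskSketch.bits(10.0e6, 1.0e15)
IO.puts("#{Float.round(bits, 1)} bits, #{ceil(bits / 6)} characters")

IO.inspect(RepeatRiskSketch.total_at(132, 1.0e6))
IO.inspect(RepeatRiskSketch.risk_at(132, 1.0e12))
```

The 16-character result is consistent with the length of the MyPuid example ID generated with total: 10.0e6 and risk: 1.0e15.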
The encode/1 and decode/1 functions convert puids to and from bits to facilitate binary data storage, e.g. as an Ecto type. Note that for efficiency Puid operates at a bit level, so decode/1 of a puid produces representative bytes such that encode/1 of those bytes produces the same puid. The bytes are the puid-specific bitstring, padded with 0 bits up to the next byte boundary.
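The padding behaviour can be illustrated with a small bit-packing sketch for the default 64-character set, where each character maps to a 6-bit index. The module BitPackSketch is hypothetical and is not Puid's encode/1 or decode/1; in particular, its encode/2 takes the ID length as an explicit second argument, which is an assumption made to keep the sketch self-contained.

```elixir
defmodule BitPackSketch do
  # Hypothetical helper, not part of Puid.
  @chars "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"

  # Pack each character's 6-bit index, then pad with 0 bits to a byte boundary.
  def decode(id) do
    bits =
      id
      |> String.to_charlist()
      |> Enum.reduce(<<>>, fn char, acc ->
        <<acc::bitstring, index_of(char)::size(6)>>
      end)

    pad = rem(8 - rem(bit_size(bits), 8), 8)
    <<bits::bitstring, 0::size(pad)>>
  end

  # Drop the padding, then map each 6-bit slice back to a character.
  def encode(bytes, len) do
    n_bits = len * 6
    <<bits::bitstring-size(n_bits), _pad::bitstring>> = bytes

    for <<index::size(6) <- bits>>, into: "" do
      <<:binary.at(@chars, index)>>
    end
  end

  # Position of an ASCII character within @chars.
  defp index_of(char) do
    {pos, 1} = :binary.match(@chars, <<char>>)
    pos
  end
end

bytes = BitPackSketch.decode("CSWEPL3AiethdYFlCbSaVC")
IO.inspect(bytes)
IO.puts(BitPackSketch.encode(bytes, 22))
```

A 22-character ID holds 132 bits, so four 0 bits are appended to fill 17 bytes. Applied to the SafeId example ID in this document, the sketch produces the same 17 bytes that the example shows for decode/1.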
The info/0 function returns a Puid.Info structure consisting of:
- source characters
- name of pre-defined Puid.Chars or :custom
- entropy bits per character
- total entropy bits
  - may be larger than the specified bits since it is a multiple of the entropy bits per character
- entropy representation efficiency
  - ratio of the puid entropy to the bits required for puid string representation
- entropy source function
- puid string length
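How these fields relate to one another can be sketched as follows. The module InfoSketch and its describe/2 function are hypothetical and do not build a Puid.Info struct. The sketch also assumes that the "bits required for puid string representation" means the UTF-8 size of an ID, estimated from the average bytes per character of the character set.

```elixir
defmodule InfoSketch do
  # Hypothetical helper, not part of Puid.
  def describe(chars, requested_bits) do
    # String.length/1 counts graphemes, so a multi-byte character counts once.
    n_chars = String.length(chars)
    bits_per_char = :math.log2(n_chars)

    # Round up to whole characters, so total entropy can exceed the request.
    len = ceil(requested_bits / bits_per_char)
    entropy_bits = len * bits_per_char

    # Assumption: string representation size is estimated as UTF-8 bytes.
    avg_bytes_per_char = byte_size(chars) / n_chars
    ere = entropy_bits / (len * avg_bytes_per_char * 8)

    %{
      entropy_bits_per_char: bits_per_char,
      entropy_bits: entropy_bits,
      length: len,
      ere: ere
    }
  end
end

safe64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"
IO.inspect(InfoSketch.describe(safe64, 128))
```

For the 64-character set and a request of 128 bits, this gives 6 bits per character, a length of 22, 132 total entropy bits and an efficiency of 0.75.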
Example
iex> defmodule(SafeId, do: use(Puid))
iex> SafeId.generate()
"CSWEPL3AiethdYFlCbSaVC"
iex> SafeId.total(1_000_000)
104350568690606000
iex> SafeId.risk(1.0e12)
9007199254740992
iex> SafeId.decode("CSWEPL3AiethdYFlCbSaVC")
<<9, 37, 132, 60, 189, 192, 137, 235, 97, 117, 129, 101, 9, 180, 154, 84, 32>>
iex> SafeId.encode(<<9, 37, 132, 60, 189, 192, 137, 235, 97, 117, 129, 101, 9, 180, 154, 84, 32>>)
"CSWEPL3AiethdYFlCbSaVC"
iex> SafeId.info()
%Puid.Info{
  characters: "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_",
  char_set: :safe64,
  entropy_bits: 132.0,
  entropy_bits_per_char: 6.0,
  ere: 0.75,
  length: 22,
  rand_bytes: &:crypto.strong_rand_bytes/1
}
Summary
Types
@type t() :: binary()