Multibase
This library is an Elixir implementation of Multibase.
This implementation provides an Elixir-centric interface to Multibase and several helper functions for making the process as painless as possible. Further, it aggregates a collection of clean, pragmatic, and reasonably fast (in Elixir terms) encoders and decoders.
Multibase provides a simple way of encoding data by tagging it with the given encoding method. This allows encoding and decoding safely and accurately within a known set of encodings. Multibase ensures that the encoding type is always known, human readable, and that data can be transparently encoded and decoded in a consistent way.
Multibase is especially relevant when sending data over a network or interacting with other programs. Instead of assuming or manually negotiating a convention for the data, the encoding is transmitted in-band. This makes changes quick, safe, and predictable should requirements such as the encoding type or its settings change. While not foolproof, Multibase offers a lightweight interface that eases debugging, testing, and multi-format handling.
From the Multibase README:
Multibase is a protocol for distinguishing base encodings and other simple string encodings, and for ensuring full compatibility with program interfaces. It answers the question:
Given data d encoded into string s, how can I tell what base d is encoded with?
Base encodings exist because transports have restrictions, use special in-band sequences, or must be human-friendly. When systems chose a base to use, it is not always clear which base to use, as there are many tradeoffs in the decision. Multibase is here to save programs and programmers from worrying about which encoding is best. It solves the biggest problem: a program can use multibase to take input or produce output in whichever base is desired. The important part is that the value is self-describing, letting other programs elsewhere know what encoding it is using.
Multibase prefixes data with a given base encoding identifier (a varint). The format is as follows:
<varint-base-encoding-code><base-encoded-data>
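The format above can be sketched with nothing but Elixir's standard `Base` module. This is a minimal illustration, not this library's implementation: the `PrefixSketch` module and its `tag/2` and `split/1` functions are hypothetical names, and only two encodings from the table below are covered.

```elixir
defmodule PrefixSketch do
  # Hypothetical module: shows the <prefix><base-encoded-data> layout only.

  # Tag data by prepending the prefix code of the chosen encoding.
  def tag(data, :base16_lower), do: "f" <> Base.encode16(data, case: :lower)
  def tag(data, :base64), do: "m" <> Base.encode64(data, padding: false)

  # Split a tagged string into its one-character prefix and the payload.
  def split(<<prefix::binary-size(1), payload::binary>>), do: {prefix, payload}
end

tagged = PrefixSketch.tag("hi", :base16_lower)
IO.inspect(tagged)                      # "f6869"
IO.inspect(PrefixSketch.split(tagged))  # {"f", "6869"}
```

Because the prefix travels with the payload, a receiver only needs to inspect the first character to know which decoder applies.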
Supported Encodings
The following table lists the currently supported Multibase encodings. All encodings (22) are currently supported by this library. Be aware that this list can and probably will be updated in the Multibase spec.
Each encoding has an accompanying prefix code. For encodings listed with two codes, the upper-case code signifies upper-case encoding/decoding, and the lower-case code signifies lower-case encoding/decoding.
encoding | code | name | encoding ids |
---|---|---|---|
identity | 0x00 | 8-bit binary (encoder and decoder keeps data unmodified) | :identity |
base1 | 1 | unary tends to be 11111 | :base1 |
base2 | 0 | binary has 1 and 0 | :base2 |
base8 | 7 | highest char in octal | :base8 |
base10 | 9 | highest char in decimal | :base10 |
base16 | F,f | highest char in hex | :base16_upper , :base16_lower |
base32hex | V,v | rfc4648 no padding - highest char | :base32_hex_upper , :base32_hex_lower |
base32hexpad | T,t | rfc4648 with padding | :base32_hex_pad_upper , :base32_hex_pad_lower |
base32 | B,b | rfc4648 no padding | :base32_upper , :base32_lower |
base32pad | C,c | rfc4648 with padding | :base32_pad_upper , :base32_pad_lower |
base32z | h | z-base-32 - used by Tahoe-LAFS - highest letter | :base32_z |
base58flickr | Z | highest letter | :base58_flickr |
base58btc | z | highest letter | :base58_btc |
base64 | m | rfc4648 no padding | :base64 |
base64pad | M | rfc4648 with padding - MIME encoding | :base64_pad |
base64url | u | rfc4648 no padding | :base64_url |
base64urlpad | U | rfc4648 with padding | :base64_url_pad |
Additional encodings can be added as necessary via a small update to a declarative map in the Multibase module.
Why?
Human-friendly encoding across multiple bases
- Simpler debugging and auditing of encoded data
Single consistent interface for encoding and decoding popular encodings in typical real-world Bases
- Same functions for encoding and decoding all bases
- Same return types for all functions regardless of encoding
Collection of benchmarked, reasonably fast, and pragmatic encoders and decoders behind a consistent interface
- Manually rolled encodings where benefits exist
- Avoids generic “BaseXXX” style encoding that can be error-prone, inefficient, or not a good fit for every base
- Decode data without mental juggling of the encoding type from elsewhere in the code
- Work with technologies that use or support Multibase such as IPFS, CID, etc.
Simplified encoding and decoding interfaces that remove the need to pass options and other parameters to get the correct encoding
- Ex: No worries about forgetting to pass a padding or case parameter.
Data-driven approach for Bases
- Simple map update to add a new Base at compile time
- Easy to write adapters
- Does not force you to even use the same encoder and decoder modules
- Explicit
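The data-driven idea can be sketched as a table that maps each encoding id to a prefix and to the functions that do the work. Everything here is an assumption for illustration: the `CodecTableSketch` module, the shape of its `@codecs` map, and the use of the standard `Base` module are hypothetical and do not reflect the actual map inside this library.

```elixir
defmodule CodecTableSketch do
  # Hypothetical table: encoding id => {prefix, module, encode fun, decode fun, opts}.
  # Supporting another base would be a one-line addition to this map.
  @codecs %{
    base16_lower: {"f", Base, :encode16, :decode16, [case: :lower]},
    base16_upper: {"F", Base, :encode16, :decode16, [case: :upper]},
    base64_pad: {"M", Base, :encode64, :decode64, []}
  }

  # Look up the codec for the id, encode, and prepend the prefix.
  def encode(data, id) do
    case Map.fetch(@codecs, id) do
      {:ok, {prefix, mod, enc, _dec, opts}} ->
        {:ok, prefix <> apply(mod, enc, [data, opts])}

      :error ->
        {:error, :unsupported_encoding}
    end
  end

  # List the encoding ids the table currently knows about.
  def encodings, do: Map.keys(@codecs)
end

IO.inspect(CodecTableSketch.encode("hi", :base16_upper))  # {:ok, "F6869"}
IO.inspect(CodecTableSketch.encode("hi", :all_your_bases)) # {:error, :unsupported_encoding}
```

Storing module and function names rather than anonymous functions keeps the table a plain compile-time value, and nothing forces the encode and decode entries to come from the same module.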
Usage
Full API Documentation can be found at https://hexdocs.pm/multibase/.
First, let’s audit the current version to see what kind of encodings are available:
Multibase.encodings()
[:identity, :base1, :base2, :base8, :base10, :base16_upper, :base16_lower,
:base32_hex_upper, :base32_hex_lower, :base32_hex_pad_upper,
:base32_hex_pad_lower, :base32_upper, :base32_lower, :base32_pad_upper,
:base32_pad_lower, :base32_z, :base58_flickr, :base58_btc, :base64,
:base64_pad, :base64_url, :base64_url_pad]
There are 22 encodings. Each is represented by a unique encoding_id atom.
The Multibase module encapsulates the main API. It typically provides 2 versions of most functions. The first form returns {:ok, result} on success and :error or {:error, reason} on failure. The !-suffixed functions raise exceptions instead, typically if a bad encoding_id is passed or another error is encountered.
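Since the non-raising functions can return three different shapes, calling code may want to collapse them into two before matching. The helper below is a hypothetical sketch, not part of this library: the `ResultSketch` module, its `normalize/1` function, and the `:unspecified` reason are all made-up names.

```elixir
defmodule ResultSketch do
  # Hypothetical helper: collapse {:ok, value}, :error, and {:error, reason}
  # into just {:ok, value} and {:error, reason}.
  def normalize({:ok, value}), do: {:ok, value}
  def normalize({:error, reason}), do: {:error, reason}
  def normalize(:error), do: {:error, :unspecified}
end

# Stand-ins for the kinds of results shown in this README.
IO.inspect(ResultSketch.normalize({:ok, "some data"}))
IO.inspect(ResultSketch.normalize({:error, :unsupported_encoding}))
IO.inspect(ResultSketch.normalize(:error))
```

A result from the tuple-returning functions could be piped through such a helper so that a single `case` with two clauses handles every outcome.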
Encoding data using a variety of different encodings:
# Let's consider some data to encode. We start with a simple Elixir binary.
data = "I can be encoded many ways, but I am unique"
# We call `encode!/2`, passing the data and an atom that identifies the encoding
Multibase.encode!(data, :base16_lower)
"f492063616e20626520656e636f646564206d616e7920776179732c20627574204920616d20756e69717565"
Multibase.encode!(data, :base8)
"71111006154133420142312201453346155731062544100665413347444035660571346260403047256410044440302664403526715134272545"
Multibase.encode!(data, :base32_hex_upper)
"V94G66OBE41H6A835DPHMUP35CGG6QOBEF4G7EOBPECM20OJLEGG4I831DKG7ARJ9E5QMA"
# We can also call a pattern matching friendly version
Multibase.encode(data, :base58_btc)
{:ok, "z6PS9nHyn6kM1ECybTAjN4iAmtekMSSjXbisXp5xTBsmcLsRsYY85Z1Ko1vL"}
# If we pass bad data, that's handled for us too
# Let's pass an encoding that clearly does not exist
Multibase.encode(data, :all_your_bases)
{:error, :unsupported_encoding}
# Let's again do the same, but using the `!` version
Multibase.encode!(data, :all_your_bases)
# ** (ArgumentError) Unsupported encoding - no encodings for encoding id: :all_your_bases
Decoding data is simple with Multibase. We can skip passing the encoding because it is already carried in the data's prefix; data without a valid prefix is not Multibase-encoded.
# Let's decode the data we encoded above
Multibase.decode!("f492063616e20626520656e636f646564206d616e7920776179732c20627574204920616d20756e69717565")
"I can be encoded many ways, but I am unique"
# Again we have 2 versions of the function
Multibase.decode("z6PS9nHyn6kM1ECybTAjN4iAmtekMSSjXbisXp5xTBsmcLsRsYY85Z1Ko1vL")
{:ok, "I can be encoded many ways, but I am unique"}
# Suppose we want to know what encoding was used to encode as part of the decoding process
Multibase.codec_decode!("V94G66OBE41H6A835DPHMUP35CGG6QOBEF4G7EOBPECM20OJLEGG4I831DKG7ARJ9E5QMA")
{"I can be encoded many ways, but I am unique", :base32_hex_upper}
Multibase.codec_decode("71111006154133420142312201453346155731062544100665413347444035660571346260403047256410044440302664403526715134272545")
{:ok, {"I can be encoded many ways, but I am unique", :base8}}
# error handling works as expected
Multibase.decode("~$#%@$%gibberish")
:error
# Let's tamper with some base58 data by appending extra characters, including `0`, which is not in the base58 alphabet.
# The given decoder will bubble up that this data is no good
Multibase.decode("z6PS9nHyn6kM1ECybTAjN4iAmtekMSSjXbisXp5xTBsmcLsRsYY85Z1Ko1vLTAMPERED0")
:error
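The decoding behaviour shown above (dispatch on the prefix, report which encoding was found, and bubble up decoder errors) can be sketched with the standard `Base` module. This is an illustration under stated assumptions, not this library's code: `CodecDecodeSketch` and its private `wrap/2` helper are hypothetical, and only three prefixes are handled.

```elixir
defmodule CodecDecodeSketch do
  # Hypothetical: pick a decoder from the prefix and return the decoded data
  # together with the encoding id that the prefix identified.
  def codec_decode("f" <> rest), do: wrap(Base.decode16(rest, case: :lower), :base16_lower)
  def codec_decode("F" <> rest), do: wrap(Base.decode16(rest, case: :upper), :base16_upper)
  def codec_decode("M" <> rest), do: wrap(Base.decode64(rest), :base64_pad)
  # Unknown prefix: the input is not something this sketch understands.
  def codec_decode(_other), do: :error

  # Pass decoder failures through unchanged; tag successes with the id.
  defp wrap({:ok, data}, id), do: {:ok, {data, id}}
  defp wrap(:error, _id), do: :error
end

IO.inspect(CodecDecodeSketch.codec_decode("f6869"))  # {:ok, {"hi", :base16_lower}}
IO.inspect(CodecDecodeSketch.codec_decode("f68zz"))  # :error
IO.inspect(CodecDecodeSketch.codec_decode("~junk"))  # :error
```

The second call fails inside the hex decoder because `z` is not a hex digit, while the third fails earlier because no clause matches the prefix.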
Suppose we are lazy and just want to query what’s available as Multibase grows, or we want to encode using several encodings.
We can easily query the encodings list:
Multibase.encodings_for!(:base32)
[:base32_hex_pad_upper, :base32_hex_pad_lower, :base32_upper, :base32_lower,
:base32_pad_upper, :base32_pad_lower, :base32_z]
# and the reverse
Multibase.encoding_family!(:base32_pad_upper)
:base32
# Or we want to know what prefix to expect, perhaps for testing, debugging, auditing, pattern matching, etc.
Multibase.prefix!(:base32_pad_lower)
"c"
Multibase.prefix(:identity)
{:ok, <<0>>}
We can also prefix already encoded data. This might be useful if you want to just use Multibase as an adapter or are doing encoding out-of-band. It’s much easier and safer to just encode with Multibase but nonetheless this capability is available should you need it.
# Suppose somewhere else we do this
b58_flickr_encoded_data = B58.encode58(data, alphabet: :flickr)
"6or9MhYM6Km1ecYAsaJn4HaLTDKmrrJwAHSwP5XsbSLBkSqSxx85y1jN1Vk"
# As long as we pick the encoding id that matches how the data was actually encoded, the result is valid Multibase
# As you might expect, this puts some burden on the code so we should prefer to use `encode/2` or `encode!/2`
Multibase.multibase(b58_flickr_encoded_data, :base58_flickr)
{:ok, "Z6or9MhYM6Km1ecYAsaJn4HaLTDKmrrJwAHSwP5XsbSLBkSqSxx85y1jN1Vk"}
# There's an exception raising version too
Multibase.multibase!(b58_flickr_encoded_data, :base58_flickr)
"Z6or9MhYM6Km1ecYAsaJn4HaLTDKmrrJwAHSwP5XsbSLBkSqSxx85y1jN1Vk"
Installation
Multibase is available via Hex. The package can be installed by adding multibase to your list of dependencies in mix.exs:
def deps do
  [
    {:multibase, "~> 0.0.1"}
  ]
end
API Documentation can be found at https://hexdocs.pm/multibase/.