Xray (Xray v1.2.0)
Xray offers utility functions for inspecting string binaries, their code points, and their base2 representations.
This package grew out of my own study of Elixir strings and binaries. It's unlikely you would actually use it as a dependency, but I offer it up for public use in the hope that it may be educational.
Functions
Reveals the integer codepoint for the given single character; when run with the default options, this is equivalent to the question-mark operator, e.g. ?x, but this function works with variables (whereas the question mark only evaluates literal characters).
Options:
:as_hex (boolean) default: false
When true, returns the hexadecimal representation of the codepoint number. The hexadecimal representation is useful when looking up documentation, e.g. on Wikipedia or on websites like codepoints.net.
Examples
iex> Xray.codepoint("ä")
228
iex> Xray.codepoint("ä", as_hex: true)
"00E4"
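The same lookup can be approximated with plain Elixir, which may help show what happens when the character lives in a variable. This is only a minimal sketch, not Xray's actual implementation: the module name CodepointSketch and its functions are hypothetical, and the 4-digit zero padding is an assumption chosen to match the "00E4" output above.

```elixir
# Hypothetical helper module for illustration; not part of Xray.
defmodule CodepointSketch do
  # Pattern-match a binary holding exactly one UTF-8 encoded character
  # and return its integer code point.
  def codepoint(<<cp::utf8>>), do: cp

  # Format a code point as uppercase hexadecimal, left-padded with zeros
  # to at least 4 digits (assumed padding, to mirror the example above).
  def to_hex(cp) when is_integer(cp) do
    cp
    |> Integer.to_string(16)
    |> String.pad_leading(4, "0")
  end
end

char = "\u00E4"                          # "ä", held in a variable
cp = CodepointSketch.codepoint(char)
IO.inspect(cp)                           # 228
IO.inspect(CodepointSketch.to_hex(cp))   # "00E4"
```

The ?x syntax is resolved when the source is parsed, which is why it cannot take a variable; the binary pattern match above runs at runtime instead.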
Given a string binary, this returns a list of the codepoints that represent each of the characters in the string. This is what you might expect String.codepoints/1 to return, but instead of returning a list of the component characters, this function returns the numbers (which is what code points are).
Note that this function returns a string: if it returned a list, Elixir would usually attempt to format that list as human-readable text, which defeats the purpose of the inspection.
This function offers output similar to what IO.inspect/2 prints when the :charlists option is set to :as_lists.
Options
:as_hex (see codepoint/2)
Examples
iex> Xray.codepoints("cät")
"99, 228, 116"
Compare this to inspecting a single-quoted charlist:
iex> IO.inspect('cät', charlists: :as_lists)
[99, 228, 116]
But IO.inspect will send output to STDOUT.
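To make the comparison concrete, here is a rough sketch of how a comma-separated string of code points could be built with standard library calls. It is not Xray's implementation; CodepointsSketch is a hypothetical name, and only the default (decimal) output is covered.

```elixir
# Hypothetical helper module for illustration; not part of Xray.
defmodule CodepointsSketch do
  # Convert the string to a list of integer code points, then join the
  # integers into a single string so nothing can be rendered as text.
  def codepoints(string) when is_binary(string) do
    string
    |> String.to_charlist()
    |> Enum.map_join(", ", &Integer.to_string/1)
  end
end

IO.puts(CodepointsSketch.codepoints("c\u00E4t"))
# 99, 228, 116

# Kernel.inspect/2 returns a string instead of printing; the :charlists
# option forces the list to be rendered as integers.
IO.puts(inspect(String.to_charlist("c\u00E4t"), charlists: :as_lists))
# [99, 228, 116]
```

Returning a string keeps the result stable regardless of how the calling shell decides to display lists of integers.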
This function prints a report on the provided input string. This may not work especially well when the input contains non-printable characters (YMMV).
For each character in the string, the following information is shown:
- code point as a decimal, e.g. 228
- code point in its Elixir Unicode representation, e.g. \u00E4
- a link to a page containing more information about this Unicode code point
- count of the number of bytes required to represent this code point using UTF-8 encoding
- an inspection of the UTF-8 binaries, e.g. <<195, 164>>
- a Base2 representation (i.e. 1's and 0's) of the encoded code point
The Base2 representation (what we would be tempted to call the "binary" representation) highlights control bits in red to help show how UTF-8 identifies how many bytes are required to encode each character.
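The byte-level items in the list above can be reproduced with a few lines of plain Elixir. The following is a sketch under stated assumptions, not how Xray itself is written: Base2Sketch and its functions are hypothetical names, the input is assumed to be valid UTF-8, and no coloring of control bits is attempted.

```elixir
# Hypothetical helper module for illustration; not part of Xray.
defmodule Base2Sketch do
  # Render every byte of the binary as an 8-digit string of 1's and 0's.
  def bits(binary) when is_binary(binary) do
    for <<byte <- binary>> do
      byte |> Integer.to_string(2) |> String.pad_leading(8, "0")
    end
  end

  # Read the leading control bits of the first byte to find out how many
  # bytes UTF-8 uses for this character (assumes valid UTF-8 input).
  def sequence_length(<<0::1, _::bitstring>>), do: 1
  def sequence_length(<<0b110::3, _::bitstring>>), do: 2
  def sequence_length(<<0b1110::4, _::bitstring>>), do: 3
  def sequence_length(<<0b11110::5, _::bitstring>>), do: 4
end

char = "\u00E4"                                  # "ä"
IO.inspect(byte_size(char))                      # 2
IO.puts(inspect(char, binaries: :as_binaries))   # <<195, 164>>
IO.inspect(Base2Sketch.bits(char))               # ["11000011", "10100100"]
IO.inspect(Base2Sketch.sequence_length(char))    # 2
```

In the first byte, 11000011, the leading 110 announces a two-byte sequence; in the second byte, 10100100, the leading 10 marks a continuation byte. A single-byte character such as c (01100011) starts with 0.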
Examples
iex> Xray.inspect("cät")
======================================================
Input String: cät
Character Count: 3
Byte Count: 4
Is valid? true
Is printable? true
======================================================
c Codepoint: 99 (\u0063) https://codepoints.net/U+0063
Script(s): latin
Byte Count: 1
UTF-8: <<99>>
Base2: 01100011
ä Codepoint: 228 (\u00E4) https://codepoints.net/U+00E4
Script(s): latin
Byte Count: 2
UTF-8: <<195, 164>>
Base2: 11000011 10100100
t Codepoint: 116 (\u0074) https://codepoints.net/U+0074
Script(s): latin
Byte Count: 1
UTF-8: <<116>>
Base2: 01110100