Comparison

Compared to many other libraries in this space, this one focuses on measuring the real width of text when printed to terminal emulators. It does not try to guess what font shaping (especially ligatures) would do, and cares less about how things look in other monospace contexts, such as a modern rendering pipeline (e.g. a browser) using a monospace font. For those cases, I encourage you to use your font rendering library directly, or find a way to lay out things without needing to know the size of your text.

Unicode version

The data files are taken from the Unicode 16 release (Sep 2024).

Algorithm

Components (codepoints, control sequences and grapheme clusters) are classified into 4 categories:

Control characters and ANSI escape sequences are ignored and reported as zero-width, except:

For the remaining string, 3 different modes are supported: wcwidth, mode_2027 and mode2027_ext.

Default mode (wcwidth)

By default, this library tries to match the output of the wcwidth function from glibc. This function is still used by the majority of terminal emulators.

In this mode, the width of a string is the sum of the widths of its individual code points. (Extended) grapheme cluster boundaries are ignored.
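The summing behaviour described above can be illustrated with a minimal Python sketch. It is not this library's implementation: `codepoint_width` and `wcwidth_style_width` are hypothetical names, and the per-codepoint rule is a rough stand-in built on Python's `unicodedata` module rather than the library's own data files.

```python
import unicodedata


def codepoint_width(ch: str) -> int:
    """Toy per-codepoint width (an assumption, not the library's tables)."""
    # Combining marks and format characters (e.g. ZWJ) take no column.
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide and Fullwidth characters take two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def wcwidth_style_width(text: str) -> int:
    """Sum the widths of individual code points, ignoring cluster boundaries."""
    return sum(codepoint_width(ch) for ch in text)


family = "\U0001F468\u200d\U0001F469\u200d\U0001F467"  # man, ZWJ, woman, ZWJ, girl
print(wcwidth_style_width("abc"))      # 3
print(wcwidth_style_width("e\u0301"))  # 1 (the combining accent adds nothing)
print(wcwidth_style_width(family))     # 6 (three wide emoji, two zero-width joiners)
```

Because grapheme cluster boundaries are ignored, the emoji family counts as three separate wide code points under this rule.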

(Extended) Mode 2027

Mode 2027 is a proposed mode for terminal emulators that applications can request. When active and supported, the terminal emulator is supposed to follow the Terminal Unicode Core spec. This mode is supported by some terminals, and is even the default and only behaviour in some others.

None of the terminals I tested implement the proposed spec fully, and the exact behaviour is subject to ongoing discussion in the freedesktop.org terminal working group. Support for this mode should therefore be considered best-effort at the moment, especially when support for Brahmic, Arabic, and some East Asian scripts is required. If you just want your emoji family to be 2 columns wide, this mode works well enough right now.
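To show why measuring by cluster gives the emoji family a width of 2, here is a deliberately simplified Python sketch. It rests on two assumptions that come neither from this library nor from the proposed spec: clusters are formed by a toy rule (combining marks and ZWJ attach to the previous code point, and whatever follows a ZWJ is joined), and a cluster is as wide as its widest code point. All function names are hypothetical.

```python
import unicodedata

ZWJ = "\u200d"


def codepoint_width(ch: str) -> int:
    """Toy per-codepoint width (an assumption, not the library's tables)."""
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def toy_clusters(text: str) -> list[str]:
    """Very rough cluster split; real segmentation follows UAX #29."""
    clusters: list[str] = []
    for ch in text:
        extends = ch == ZWJ or unicodedata.category(ch) in ("Mn", "Me", "Mc")
        joined = bool(clusters) and clusters[-1].endswith(ZWJ)
        if clusters and (extends or joined):
            clusters[-1] += ch  # stays in the current cluster
        else:
            clusters.append(ch)  # starts a new cluster
    return clusters


def cluster_mode_width(text: str) -> int:
    """Assumed rule: each cluster is as wide as its widest code point."""
    return sum(max(codepoint_width(c) for c in cl) for cl in toy_clusters(text))


family = "\U0001F468\u200d\U0001F469\u200d\U0001F467"  # man, ZWJ, woman, ZWJ, girl
print(len(toy_clusters(family)))      # 1
print(cluster_mode_width(family))     # 2
print(cluster_mode_width("e\u0301x")) # 2 (two clusters, one column each)
```

The same family sequence measures 6 columns when the widths of its code points are simply summed, which is the difference this mode is meant to address.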

This library does the following:

Additionally, in the extended mode, the following rules apply:

This rule makes the width reported by this library more stable across Erlang and Node.js (Unicode version differences, see above), and matches the behaviour of actual terminals with mode 2027 support more closely. Starting with Unicode 15.1, some sequences that occupy multiple columns are segmented into single grapheme clusters.

Width of a single codepoint

The following codepoints are classified as zero-width:

The following codepoints are classified as wide:

The following codepoints are classified as ambiguous:

All other codepoints are narrow.

Testing

I currently test against VTE (mainly gnome-terminal), Windows Terminal, Kitty, and foot for mode 2027 support. I also test on Contour since they originally proposed the mode 2027 spec; however, they use a custom Unicode library that I don’t trust fully.

When reporting a mismatch, please include the terminal you used and the escaped codepoints. GitLab/Discord sometimes like to strip certain modifiers.
