str
Unicode-aware string utilities for Gleam
Production-ready Gleam library providing Unicode-aware string operations with a focus on grapheme-cluster correctness, pragmatic ASCII transliteration, and URL-friendly slug generation.
✨ Features
| Category | Highlights |
| 🎯 Grapheme-Aware | All operations correctly handle Unicode grapheme clusters (emoji, ZWJ sequences, combining marks) |
| 🔤 Case Conversions | snake_case, camelCase, kebab-case, PascalCase, Title Case, capitalize |
| Slug Generation | Configurable slugify with token limits, custom separators, and Unicode preservation |
| Search & Replace | index_of, last_index_of, replace_first, replace_last, contains_any/all |
| ✅ Validation | is_uppercase, is_lowercase, is_title_case, is_ascii, is_hex, is_numeric, is_alpha |
| 🛡️ Escaping | escape_html, unescape_html, escape_regex |
| Similarity | Levenshtein distance, percentage similarity, hamming_distance |
| 🧩 Splitting | splitn, partition, rpartition, chunk, lines, words |
| Padding | pad_left, pad_right, center, fill |
| Zero Dependencies | Pure Gleam implementation with no OTP requirement |
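To make the grapheme-aware claim concrete, here is a minimal sketch that uses only the Gleam standard library, not str itself. The helper name measure is hypothetical; it compares three ways of measuring the same string and shows why byte or codepoint counts are the wrong unit for user-visible text.

```gleam
import gleam/list
import gleam/string

/// Hypothetical helper (not part of str): returns
/// #(grapheme clusters, codepoints, bytes) for a string.
pub fn measure(text: String) -> #(Int, Int, Int) {
  #(
    string.length(text),
    list.length(string.to_utf_codepoints(text)),
    string.byte_size(text),
  )
}

pub fn main() {
  // "e" followed by U+0301 (combining acute accent) renders as one character,
  // but is two codepoints and three UTF-8 bytes.
  measure("e\u{0301}")
  // → #(1, 2, 3)
}
```

An operation that indexes by bytes or codepoints could split this string between the letter and its accent; indexing by grapheme cluster cannot.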
📦 Installation
gleam add str
Quick Start
import str/core
import str/extra
pub fn main() {
  // 🎯 Grapheme-safe truncation preserves emoji
  let text = "Hello 👩‍👩‍👧‍👦 World"
  core.truncate(text, 10, "...")
  // → "Hello 👩‍👩‍👧‍👦..."

  // ASCII transliteration and slugification
  extra.slugify("Crème Brûlée – Recipe 2025!")
  // → "creme-brulee-recipe-2025"

  // 🔤 Case conversions
  extra.to_camel_case("hello world") // → "helloWorld"
  extra.to_snake_case("Hello World") // → "hello_world"
  core.capitalize("hELLO wORLD") // → "Hello world"

  // Grapheme-aware search
  core.index_of("👨‍👩‍👧‍👦 family test", "family")
  // → Ok(2) - counts grapheme clusters, not bytes!

  // String similarity
  core.similarity("hello", "hallo")
  // → 0.8 (80% similar)

  // 🛡️ HTML escaping
  core.escape_html("<script>alert('xss')</script>")
  // → "&lt;script&gt;alert(&#39;xss&#39;)&lt;/script&gt;"
}
API Reference
🔤 Case & Capitalization
| Function | Example | Result |
| capitalize(text) | "hELLO wORLD" | "Hello world" |
| swapcase(text) | "Hello World" | "hELLO wORLD" |
| is_uppercase(text) | "HELLO123" | True |
| is_lowercase(text) | "hello_world" | True |
| is_title_case(text) | "Hello World" | True |
✂️ Grapheme Extraction
| Function | Example | Result |
| take(text, n) | take("👨‍👩‍👧‍👦abc", 2) | "👨‍👩‍👧‍👦a" |
| drop(text, n) | drop("hello", 2) | "llo" |
| take_right(text, n) | take_right("hello", 3) | "llo" |
| drop_right(text, n) | drop_right("hello", 2) | "hel" |
| at(text, index) | at("hello", 1) | Ok("e") |
| chunk(text, size) | chunk("abcdef", 2) | ["ab", "cd", "ef"] |
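As an illustration of what grapheme-based extraction means, a function like take can be sketched with the standard library alone. The name take_graphemes is hypothetical and this is not necessarily how str implements it; the point is that the string is split into grapheme clusters before anything is counted.

```gleam
import gleam/list
import gleam/string

/// Hypothetical helper (not part of str): keeps the first `n`
/// grapheme clusters of `text`.
pub fn take_graphemes(text: String, n: Int) -> String {
  text
  |> string.to_graphemes
  // Counting happens on clusters, so a combining mark stays with its base.
  |> list.take(n)
  |> string.concat
}

pub fn main() {
  take_graphemes("hello", 2)
  // → "he"
}
```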
Search & Replace
| Function | Example | Result |
| index_of(text, needle) | "hello world", "world" | Ok(6) |
| last_index_of(text, needle) | "hello hello", "hello" | Ok(6) |
| contains_any(text, needles) | "hello", ["x", "e", "z"] | True |
| contains_all(text, needles) | "hello", ["h", "e"] | True |
| replace_first(text, old, new) | "aaa", "a", "b" | "baa" |
| replace_last(text, old, new) | "aaa", "a", "b" | "aab" |
🧩 Splitting & Partitioning
| Function | Example | Result |
| partition(text, sep) | "a-b-c", "-" | #("a", "-", "b-c") |
| rpartition(text, sep) | "a-b-c", "-" | #("a-b", "-", "c") |
| splitn(text, sep, n) | "a-b-c-d", "-", 2 | ["a", "b-c-d"] |
| words(text) | "hello world" | ["hello", "world"] |
| lines(text) | "a\nb\nc" | ["a", "b", "c"] |
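The three-element tuple returned by partition can be approximated with string.split_once from the standard library. The sketch below uses a hypothetical name, partition_sketch, and assumes that a missing separator yields the original text followed by two empty strings; the table above does not state what str does in that case.

```gleam
import gleam/string

/// Hypothetical helper (not part of str): splits at the first
/// occurrence of `sep` and returns #(before, sep, after).
pub fn partition_sketch(text: String, sep: String) -> #(String, String, String) {
  case string.split_once(text, on: sep) {
    Ok(#(before, after)) -> #(before, sep, after)
    // Assumption: with no separator present, return the text unchanged.
    Error(_) -> #(text, "", "")
  }
}

pub fn main() {
  partition_sketch("a-b-c", "-")
  // → #("a", "-", "b-c")
}
```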
Padding & Filling
| Function | Example | Result |
| pad_left(text, width, pad) | "42", 5, "0" | "00042" |
| pad_right(text, width, pad) | "hi", 5, "*" | "hi***" |
| center(text, width, pad) | "hi", 6, "-" | "--hi--" |
| fill(text, width, pad, pos) | "x", 5, "-", "both" | "--x--" |
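Padding to a width only lines up visually if the width is measured in grapheme clusters. A rough sketch of left padding follows, under the assumption that pad is a single grapheme; pad_left_sketch is a hypothetical name and may differ from the library's own implementation.

```gleam
import gleam/string

/// Hypothetical helper (not part of str): left-pads `text` to `width`
/// grapheme clusters. Assumes `pad` is a single grapheme.
pub fn pad_left_sketch(text: String, width: Int, pad: String) -> String {
  // string.length counts grapheme clusters, not bytes.
  let missing = width - string.length(text)
  case missing > 0 {
    True -> string.repeat(pad, missing) <> text
    False -> text
  }
}

pub fn main() {
  pad_left_sketch("42", 5, "0")
  // → "00042"
}
```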
✅ Validation
| Function | Description |
| is_numeric(text) | Digits only (0-9) |
| is_alpha(text) | Letters only (a-z, A-Z) |
| is_alphanumeric(text) | Letters and digits |
| is_ascii(text) | ASCII only (0x00-0x7F) |
| is_printable(text) | Printable ASCII (0x20-0x7E) |
| is_hex(text) | Hexadecimal (0-9, a-f, A-F) |
| is_blank(text) | Whitespace only |
| is_title_case(text) | Title Case format |
Prefix & Suffix
| Function | Example | Result |
| remove_prefix(text, prefix) | "hello world", "hello " | "world" |
| remove_suffix(text, suffix) | "file.txt", ".txt" | "file" |
| ensure_prefix(text, prefix) | "world", "hello " | "hello world" |
| ensure_suffix(text, suffix) | "file", ".txt" | "file.txt" |
| starts_with_any(text, list) | "hello", ["hi", "he"] | True |
| ends_with_any(text, list) | "file.txt", [".txt", ".md"] | True |
| common_prefix(strings) | ["abc", "abd"] | "ab" |
| common_suffix(strings) | ["abc", "xbc"] | "bc" |
🛡️ Escaping
| Function | Example | Result |
| escape_html(text) | "<div>" | "&lt;div&gt;" |
| unescape_html(text) | "&lt;div&gt;" | "<div>" |
| escape_regex(text) | "a.b*c" | "a\\.b\\*c" |
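For readers unfamiliar with HTML escaping, the core idea can be sketched as a chain of replacements. The name escape_html_sketch is hypothetical, and the exact entity set (in particular the entity chosen for the single quote) is an assumption rather than a statement about what str emits. The ampersand must be replaced first, otherwise the ampersands introduced by later replacements would be escaped a second time.

```gleam
import gleam/string

/// Hypothetical helper (not part of str): escapes the five characters
/// that are significant in HTML text and attribute values.
pub fn escape_html_sketch(text: String) -> String {
  text
  // "&" goes first so the entities added below are not double-escaped.
  |> string.replace("&", "&amp;")
  |> string.replace("<", "&lt;")
  |> string.replace(">", "&gt;")
  |> string.replace("\"", "&quot;")
  // Assumption: numeric entity for the single quote.
  |> string.replace("'", "&#39;")
}

pub fn main() {
  escape_html_sketch("<div>")
  // → "&lt;div&gt;"
}
```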
Similarity & Distance
| Function | Example | Result |
| distance(a, b) | "kitten", "sitting" | 3 |
| similarity(a, b) | "hello", "hallo" | 0.8 |
| hamming_distance(a, b) | "karolin", "kathrin" | Ok(3) |
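Hamming distance counts the positions at which two equal-length strings differ, which is why the table shows a Result: the distance is undefined when the lengths differ. Here is a minimal sketch over grapheme clusters, using only the standard library; hamming_sketch is a hypothetical name, and returning Error(Nil) on a length mismatch is an assumption about the error type.

```gleam
import gleam/list
import gleam/string

/// Hypothetical helper (not part of str): number of grapheme positions
/// at which `a` and `b` differ, or Error(Nil) if their lengths differ.
pub fn hamming_sketch(a: String, b: String) -> Result(Int, Nil) {
  let xs = string.to_graphemes(a)
  let ys = string.to_graphemes(b)
  // strict_zip fails when the two lists have different lengths.
  case list.strict_zip(xs, ys) {
    Ok(pairs) ->
      pairs
      |> list.filter(fn(pair) { pair.0 != pair.1 })
      |> list.length
      |> Ok
    Error(_) -> Error(Nil)
  }
}

pub fn main() {
  hamming_sketch("karolin", "kathrin")
  // → Ok(3)
}
```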
Text Manipulation
| Function | Description |
| truncate(text, len, suffix) | Truncate with emoji preservation |
| ellipsis(text, len) | Truncate with … |
| reverse(text) | Grapheme-aware reversal |
| reverse_words(text) | Reverse word order |
| initials(text) | Extract initials ("John Doe" → "JD") |
| normalize_whitespace(text) | Collapse whitespace |
| strip(text, chars) | Remove chars from ends |
| squeeze(text, char) | Collapse consecutive chars |
| chomp(text) | Remove trailing newline |
Line Operations
| Function | Description |
| lines(text) | Split into lines |
| dedent(text) | Remove common indentation |
| indent(text, spaces) | Add indentation |
| wrap_at(text, width) | Word wrap |
🔤 Extra Module (str/extra)
Case Conversions
import str/extra
extra.to_snake_case("Hello World") // → "hello_world"
extra.to_camel_case("hello world") // → "helloWorld"
extra.to_pascal_case("hello world") // → "HelloWorld"
extra.to_kebab_case("Hello World") // → "hello-world"
extra.to_title_case("hello world") // → "Hello World"
ASCII Folding (Deburr)
extra.ascii_fold("Crème Brûlée") // → "Creme Brulee"
extra.ascii_fold("straße") // → "strasse"
extra.ascii_fold("æon") // → "aeon"
Slug Generation
extra.slugify("Hello, World!") // → "hello-world"
extra.slugify_opts("one two three", 2, "-", False) // → "one-two"
extra.slugify_opts("Hello World", 0, "_", False) // → "hello_world"
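The token limit and separator arguments in the slug examples can be illustrated with a simplified sketch. It handles only lowercasing, splitting on spaces, limiting, and joining; it does no transliteration or punctuation stripping, and the name join_tokens is hypothetical. Treating a limit of 0 as "no limit" is inferred from the second slugify_opts example above.

```gleam
import gleam/list
import gleam/string

/// Hypothetical helper (not part of str): lowercases `text`, keeps at
/// most `max_tokens` space-separated tokens, and joins them with `sep`.
pub fn join_tokens(text: String, max_tokens: Int, sep: String) -> String {
  let tokens =
    text
    |> string.lowercase
    |> string.split(" ")
    // Drop empty tokens produced by repeated spaces.
    |> list.filter(fn(token) { token != "" })
  // Assumption: 0 (or any non-positive value) means "keep all tokens".
  let kept = case max_tokens > 0 {
    True -> list.take(tokens, max_tokens)
    False -> tokens
  }
  string.join(kept, sep)
}

pub fn main() {
  join_tokens("one two three", 2, "-")
  // → "one-two"
}
```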
🏗️ Module Structure
str/
├── core # Grapheme-aware core utilities (67 functions)
├── extra # ASCII folding, slugs, case conversions
├── tokenize # Pure-Gleam tokenizer (reference)
└── internal_* # Character tables (internal)
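⚡ Optional OTP Integration
The library core is OTP-free by design. For production-grade Unicode normalization (NFC/NFD), define a normalizer in your own application and pass it to the *_with_normalizer functions:
// In your application code (Erlang target only).
// One option: bind to unicode:characters_to_nfd_binary/1 from Erlang/OTP.
@external(erlang, "unicode", "characters_to_nfd_binary")
pub fn otp_nfd(s: String) -> String
// Use with str:
extra.ascii_fold_with_normalizer("Crème", otp_nfd)
extra.slugify_with_normalizer("Café", otp_nfd)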
🧪 Development
# Run the test suite (325 tests)
gleam test
# Regenerate character tables documentation
python3 scripts/generate_character_tables.py
Test Coverage
- 325 tests covering all public functions
- Unicode edge cases (emoji, ZWJ, combining marks)
- Grapheme cluster boundary handling
- Cross-module integration tests
🤝 Contributing
Contributions welcome! Areas for improvement:
- Expanding character transliteration tables
- Additional test cases for edge cases
- Documentation improvements
- Performance optimizations
gleam test # Ensure tests pass before submitting PRs
License
MIT License. See LICENSE for details.
Made with love for the Gleam community