str

Unicode-aware string utilities for Gleam

str is a production-ready Gleam library for Unicode-aware string operations, with a focus on grapheme-cluster correctness, pragmatic ASCII transliteration, and URL-friendly slug generation.


✨ Features

| Category | Highlights |
| --- | --- |
| 🎯 Grapheme-Aware | All operations correctly handle Unicode grapheme clusters (emoji, ZWJ sequences, combining marks) |
| 🔀 Case Conversions | snake_case, camelCase, kebab-case, PascalCase, Title Case, capitalize |
| 🔗 Slug Generation | Configurable slugify with token limits, custom separators, and Unicode preservation |
| 🔍 Search & Replace | index_of, last_index_of, replace_first, replace_last, contains_any/all |
| ✅ Validation | is_uppercase, is_lowercase, is_title_case, is_ascii, is_hex, is_numeric, is_alpha |
| 🛡️ Escaping | escape_html, unescape_html, escape_regex |
| 📏 Similarity | Levenshtein distance, percentage similarity, hamming_distance |
| 🧩 Splitting | splitn, partition, rpartition, chunk, lines, words |
| 📐 Padding | pad_left, pad_right, center, fill |
| 🚀 Zero Dependencies | Pure Gleam implementation with no OTP requirement |

📦 Installation

gleam add str

🚀 Quick Start

import str/core
import str/extra

pub fn main() {
  // 🎯 Grapheme-safe truncation preserves emoji
  let text = "Hello 👩‍👩‍👧‍👦 World"
  core.truncate(text, 10, "...")
  // → "Hello 👩‍👩‍👧‍👦..."

  // 🔗 ASCII transliteration and slugification
  extra.slugify("Crème Brûlée — Recipe 2025!")
  // → "creme-brulee-recipe-2025"

  // 🔀 Case conversions
  extra.to_camel_case("hello world")   // → "helloWorld"
  extra.to_snake_case("Hello World")   // → "hello_world"
  core.capitalize("hELLO wORLD")       // → "Hello world"

  // 🔍 Grapheme-aware search
  core.index_of("👨‍👩‍👧‍👦 family test", "family")
  // → Ok(2) - counts grapheme clusters, not bytes!

  // 📏 String similarity
  core.similarity("hello", "hallo")
  // → 0.8 (80% similar)

  // 🛡️ HTML escaping
  core.escape_html("<script>alert('xss')</script>")
  // → "&lt;script&gt;alert(&#39;xss&#39;)&lt;/script&gt;"
}

📚 API Reference

🔀 Case & Capitalization

| Function | Example | Result |
| --- | --- | --- |
| capitalize(text) | "hELLO wORLD" | "Hello world" |
| swapcase(text) | "Hello World" | "hELLO wORLD" |
| is_uppercase(text) | "HELLO123" | True |
| is_lowercase(text) | "hello_world" | True |
| is_title_case(text) | "Hello World" | True |

✂️ Grapheme Extraction

| Function | Example | Result |
| --- | --- | --- |
| take(text, n) | take("👨‍👩‍👧‍👦abc", 2) | "👨‍👩‍👧‍👦a" |
| drop(text, n) | drop("hello", 2) | "llo" |
| take_right(text, n) | take_right("hello", 3) | "llo" |
| drop_right(text, n) | drop_right("hello", 2) | "hel" |
| at(text, index) | at("hello", 1) | Ok("e") |
| chunk(text, size) | chunk("abcdef", 2) | ["ab", "cd", "ef"] |
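
To see why these functions count grapheme clusters instead of code points or bytes, the sketch below measures the family emoji from the take example using only gleam_stdlib. The helper name sizes is hypothetical and is not part of str.

```gleam
import gleam/bit_array
import gleam/list
import gleam/string

/// Returns #(code points, UTF-8 bytes) for a string.
/// Hypothetical helper for illustration; not part of str.
pub fn sizes(text: String) -> #(Int, Int) {
  let codepoints = list.length(string.to_utf_codepoints(text))
  let bytes = bit_array.byte_size(bit_array.from_string(text))
  #(codepoints, bytes)
}

pub fn main() {
  // Family emoji: four people joined by three zero-width joiners (U+200D).
  let family = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}\u{200D}\u{1F466}"
  sizes(family)
  // → #(7, 25): 4 emoji + 3 joiners, 4 * 4 + 3 * 3 bytes
}
```

A reader sees one character here, which is why take counts the whole sequence as a single unit in the table above.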

🔍 Search & Replace

| Function | Example | Result |
| --- | --- | --- |
| index_of(text, needle) | "hello world", "world" | Ok(6) |
| last_index_of(text, needle) | "hello hello", "hello" | Ok(6) |
| contains_any(text, needles) | "hello", ["x", "e", "z"] | True |
| contains_all(text, needles) | "hello", ["h", "e"] | True |
| replace_first(text, old, new) | "aaa", "a", "b" | "baa" |
| replace_last(text, old, new) | "aaa", "a", "b" | "aab" |
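
Because index_of returns a Result, callers usually pattern match on it. A minimal sketch, assuming the success value is an Int grapheme index as the Ok(6) row suggests; the error type is not documented here, so it is matched with a wildcard. describe_match is a hypothetical name.

```gleam
import gleam/int
import str/core

/// Describes where `needle` first occurs in `text`.
/// Assumes `core.index_of` returns Ok(Int) on a match.
pub fn describe_match(text: String, needle: String) -> String {
  case core.index_of(text, needle) {
    Ok(index) -> "found at grapheme " <> int.to_string(index)
    // The error payload is unspecified in this README, so ignore it.
    Error(_) -> "not found"
  }
}
```

With the table's inputs, describe_match("hello world", "world") would yield "found at grapheme 6".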

🧩 Splitting & Partitioning

FunctionExampleResult
partition(text, sep)"a-b-c", "-"#("a", "-", "b-c")
rpartition(text, sep)"a-b-c", "-"#("a-b", "-", "c")
splitn(text, sep, n)"a-b-c-d", "-", 2["a", "b-c-d"]
words(text)"hello world"["hello", "world"]
lines(text)"a\nb\nc"["a", "b", "c"]
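
The three-element tuple from partition destructures directly, which suits simple key/value parsing. A sketch under the assumption that the tuple shape matches the table row above; parse_pair is a hypothetical helper, and what partition returns when the separator is absent is not covered here.

```gleam
import str/core

/// Splits "key=value" at the first "=".
/// Later "=" characters stay in the value, as in the "a-b-c" example.
pub fn parse_pair(line: String) -> #(String, String) {
  let #(key, _separator, value) = core.partition(line, "=")
  #(key, value)
}
```

By the same logic as the "a-b-c" row, parse_pair("path=/a=b") would give #("path", "/a=b").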

📐 Padding & Filling

| Function | Example | Result |
| --- | --- | --- |
| pad_left(text, width, pad) | "42", 5, "0" | "00042" |
| pad_right(text, width, pad) | "hi", 5, "*" | "hi***" |
| center(text, width, pad) | "hi", 6, "-" | "--hi--" |
| fill(text, width, pad, pos) | "x", 5, "-", "both" | "--x--" |

✅ Validation

| Function | Description |
| --- | --- |
| is_numeric(text) | Digits only (0-9) |
| is_alpha(text) | Letters only (a-z, A-Z) |
| is_alphanumeric(text) | Letters and digits |
| is_ascii(text) | ASCII only (0x00-0x7F) |
| is_printable(text) | Printable ASCII (0x20-0x7E) |
| is_hex(text) | Hexadecimal (0-9, a-f, A-F) |
| is_blank(text) | Whitespace only |
| is_title_case(text) | Title Case format |

🔗 Prefix & Suffix

| Function | Example | Result |
| --- | --- | --- |
| remove_prefix(text, prefix) | "hello world", "hello " | "world" |
| remove_suffix(text, suffix) | "file.txt", ".txt" | "file" |
| ensure_prefix(text, prefix) | "world", "hello " | "hello world" |
| ensure_suffix(text, suffix) | "file", ".txt" | "file.txt" |
| starts_with_any(text, list) | "hello", ["hi", "he"] | True |
| ends_with_any(text, list) | "file.txt", [".txt", ".md"] | True |
| common_prefix(strings) | ["abc", "abd"] | "ab" |
| common_suffix(strings) | ["abc", "xbc"] | "bc" |

🛡️ Escaping

| Function | Example | Result |
| --- | --- | --- |
| escape_html(text) | "<div>" | "&lt;div&gt;" |
| unescape_html(text) | "&lt;div&gt;" | "<div>" |
| escape_regex(text) | "a.b*c" | "a\\.b\\*c" |
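
A typical use of escape_html is to neutralize untrusted input before embedding it in markup. The sketch below assumes only the behavior shown in the table; render_comment is a hypothetical function name.

```gleam
import str/core

/// Wraps untrusted text in a paragraph element,
/// escaping it first so it cannot inject markup.
pub fn render_comment(user_input: String) -> String {
  "<p>" <> core.escape_html(user_input) <> "</p>"
}
```

Given the table's first row, render_comment("<div>") would produce "<p>&lt;div&gt;</p>".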

📏 Similarity & Distance

| Function | Example | Result |
| --- | --- | --- |
| distance(a, b) | "kitten", "sitting" | 3 |
| similarity(a, b) | "hello", "hallo" | 0.8 |
| hamming_distance(a, b) | "karolin", "kathrin" | Ok(3) |
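
The README does not state how similarity is derived from distance. One common normalization, which agrees with the "hello"/"hallo" row (1 - 1/5 = 0.8), divides the edit distance by the longer string's grapheme count. The sketch below illustrates that assumption and is not necessarily the library's implementation; similarity_sketch is a hypothetical name, and core.distance is assumed to return an Int.

```gleam
import gleam/int
import gleam/string
import str/core

/// Normalized similarity: 1 - distance / length of the longer string.
/// Assumption: this mirrors, but may not equal, core.similarity.
pub fn similarity_sketch(a: String, b: String) -> Float {
  // string.length counts grapheme clusters, not bytes.
  let longest = int.max(string.length(a), string.length(b))
  case longest {
    // Two empty strings are treated as identical.
    0 -> 1.0
    _ ->
      1.0 -. int.to_float(core.distance(a, b)) /. int.to_float(longest)
  }
}
```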

📝 Text Manipulation

| Function | Description |
| --- | --- |
| truncate(text, len, suffix) | Truncate with emoji preservation |
| ellipsis(text, len) | Truncate with … |
| reverse(text) | Grapheme-aware reversal |
| reverse_words(text) | Reverse word order |
| initials(text) | Extract initials ("John Doe" → "JD") |
| normalize_whitespace(text) | Collapse whitespace |
| strip(text, chars) | Remove chars from ends |
| squeeze(text, char) | Collapse consecutive chars |
| chomp(text) | Remove trailing newline |

📄 Line Operations

| Function | Description |
| --- | --- |
| lines(text) | Split into lines |
| dedent(text) | Remove common indentation |
| indent(text, spaces) | Add indentation |
| wrap_at(text, width) | Word wrap |

🔀 Extra Module (str/extra)

Case Conversions

import str/extra

extra.to_snake_case("Hello World")    // → "hello_world"
extra.to_camel_case("hello world")    // → "helloWorld"
extra.to_pascal_case("hello world")   // → "HelloWorld"
extra.to_kebab_case("Hello World")    // → "hello-world"
extra.to_title_case("hello world")    // → "Hello World"

ASCII Folding (Deburr)

extra.ascii_fold("Crème Brûlée")  // → "Creme Brulee"
extra.ascii_fold("straße")        // → "strasse"
extra.ascii_fold("æon")           // → "aeon"

Slug Generation

extra.slugify("Hello, World!")                    // → "hello-world"
extra.slugify_opts("one two three", 2, "-", False) // → "one-two"
extra.slugify_opts("Hello World", 0, "_", False)   // → "hello_world"

🏗️ Module Structure

str/
├── core        # Grapheme-aware core utilities (67 functions)
├── extra       # ASCII folding, slugs, case conversions
├── tokenize    # Pure-Gleam tokenizer (reference)
└── internal_*  # Character tables (internal)

📖 Documentation

| Document | Description |
| --- | --- |
| Core API | Grapheme-aware string operations |
| Extra API | ASCII folding and slug generation |
| Tokenizer | Pure-Gleam tokenizer reference |
| Examples | Integration examples and OTP patterns |
| Character Tables | Machine-readable transliteration data |

⚑ Optional OTP Integration

The library core is OTP-free by design. For production Unicode normalization (NFC/NFD):

// In your application code:
pub fn otp_nfd(s: String) -> String {
  // Placeholder: call Erlang's unicode module here to normalize to NFD.
  // As written, this stub returns the input unchanged.
  s
}

// Use with str:
extra.ascii_fold_with_normalizer("Crème", otp_nfd)
extra.slugify_with_normalizer("Café", otp_nfd)
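
The otp_nfd stub above leaves the actual normalization to the application. One way to fill it in on the Erlang target is an external function bound to OTP's unicode:characters_to_nfd_binary/1. This is a sketch and not part of str; fold_with_otp is a hypothetical wrapper, and it assumes ascii_fold_with_normalizer returns a String.

```gleam
import str/extra

// Binds Erlang's unicode:characters_to_nfd_binary/1 (Erlang target only).
// Assumption: the input is a valid Gleam String (UTF-8), so the call
// returns a binary and never an error tuple.
@external(erlang, "unicode", "characters_to_nfd_binary")
fn otp_nfd(s: String) -> String

/// Folds `text` to ASCII after decomposing it with OTP's NFD normalizer.
pub fn fold_with_otp(text: String) -> String {
  extra.ascii_fold_with_normalizer(text, otp_nfd)
}
```

Keeping the binding in application code preserves the library's OTP-free core while still allowing full normalization where OTP is available.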

🧪 Development

# Run the test suite (325 tests)
gleam test

# Regenerate character tables documentation
python3 scripts/generate_character_tables.py

🀝 Contributing

Contributions welcome! Areas for improvement:

gleam test  # Ensure tests pass before submitting PRs

📄 License

MIT License. See LICENSE for details.

Made with 💜 for the Gleam community