ToonEx.Encode.Strings (toon_ex v0.8.1)


String encoding utilities for TOON format.

Handles quote detection, escaping, and key validation.

Performance

Uses Jason-style chunk-based escaping with binary_part/3. Instead of copying every byte into a new binary, this approach:

  1. Scans the input for bytes that need escaping
  2. Uses binary_part/3 to reference safe chunks without copying
  3. Builds an iodata list with chunk references and escape sequences

This significantly reduces allocations for strings with few escape characters. The binary_part/3 call is O(1) — it creates a sub-binary reference rather than copying the underlying data. Only when IO.iodata_to_binary/1 is called at the top level does the final contiguous binary get allocated.
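As a rough illustration of the idea, the sketch below escapes one newline at a known byte offset by referencing the chunks on either side of it. The module and function names are hypothetical and are not part of ToonEx; the real encoder finds the offsets by scanning.

```elixir
defmodule ChunkReferenceSketch do
  # Hypothetical helper, not part of ToonEx: assumes the byte at `pos`
  # is a newline and replaces it with the two-character sequence \n.
  def escape_newline_at(original, pos) when is_binary(original) do
    after_len = byte_size(original) - pos - 1

    [
      # Safe chunk before the newline, taken as a slice of `original`.
      binary_part(original, 0, pos),
      # The escape sequence is the only newly written data.
      "\\n",
      # Safe chunk after the newline, again a slice of `original`.
      binary_part(original, pos + 1, after_len)
    ]
  end
end
```

The returned value is still iodata; a caller would pass it to IO.iodata_to_binary/1 once, at the top level, to obtain the contiguous binary.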

Summary

Functions

encode_key(key)

Encodes a key, adding quotes if necessary.

encode_string(string, delimiter \\ ",")

Encodes a string value, adding quotes if necessary.

escape_string(data)

Escapes special characters in a string using a chunk-based approach.

safe_key?(arg1)

Checks if a string can be used as an unquoted key.

safe_unquoted?(string, delimiter)

Checks if a string can be used unquoted as a value.

Functions

encode_key(key)

@spec encode_key(String.t()) :: iodata()

Encodes a key, adding quotes if necessary.

Keys have stricter requirements than values:

  • Unquoted keys must match /^[A-Z_][\w.]*$/i: a letter or underscore first, then any mix of letters, digits, underscores, and dots
  • Numbers-only keys must be quoted
  • Keys with special characters must be quoted

Returns iodata that can be converted to a string with IO.iodata_to_binary/1.
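The quoting decision can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: the module name and the simplified escape/1 helper are hypothetical, and the pattern is anchored with \A and \z (instead of ^ and $) so that a key ending in a newline is not accepted.

```elixir
defmodule KeyQuotingSketch do
  # Hypothetical module, not part of ToonEx.

  # Same character rules as the documented pattern: a letter or
  # underscore first, then letters, digits, underscores, or dots.
  def safe_key?(key) when is_binary(key) do
    Regex.match?(~r/\A[A-Z_][\w.]*\z/i, key)
  end

  # Returns the key unchanged when it is safe, otherwise wraps the
  # escaped key in double quotes. The result is iodata in both cases.
  def encode_key(key) when is_binary(key) do
    if safe_key?(key), do: key, else: [?", escape(key), ?"]
  end

  # Simplified escaping (assumption): only backslashes and double quotes.
  defp escape(key) do
    key
    |> String.replace("\\", "\\\\")
    |> String.replace("\"", "\\\"")
  end
end
```

Digits-only keys such as "123" fail the first-character rule, which is why they end up quoted.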

Examples

iex> ToonEx.Encode.Strings.encode_key("name") |> IO.iodata_to_binary()
"name"

iex> ToonEx.Encode.Strings.encode_key("user_name") |> IO.iodata_to_binary()
"user_name"

iex> ToonEx.Encode.Strings.encode_key("user.name") |> IO.iodata_to_binary()
"user.name"

iex> ToonEx.Encode.Strings.encode_key("user name") |> IO.iodata_to_binary()
~s("user name")

iex> ToonEx.Encode.Strings.encode_key("123") |> IO.iodata_to_binary()
~s("123")

encode_string(string, delimiter \\ ",")

@spec encode_string(String.t(), String.t()) :: iodata()

Encodes a string value, adding quotes if necessary.

Returns iodata that can be converted to a string with IO.iodata_to_binary/1.

Examples

iex> ToonEx.Encode.Strings.encode_string("hello") |> IO.iodata_to_binary()
"hello"

iex> ToonEx.Encode.Strings.encode_string("") |> IO.iodata_to_binary()
~s("")

iex> ToonEx.Encode.Strings.encode_string("hello world") |> IO.iodata_to_binary()
"hello world"

iex> ToonEx.Encode.Strings.encode_string("line1\nline2") |> IO.iodata_to_binary()
~s("line1\\nline2")

escape_string(data)

@spec escape_string(String.t()) :: iodata()

Escapes special characters in a string using a chunk-based approach.

Instead of copying every byte into a new binary, this uses binary_part/3 to reference safe chunks of the original string without copying. Only the escape sequences are newly allocated.

How it works

The algorithm uses two mutually recursive functions:

  1. escape_string/4 — main loop that scans for bytes needing escaping
  2. escape_string_chunk/5 — accumulates consecutive safe bytes into a chunk

When a safe byte is encountered, we enter chunk mode and keep extending the chunk length. When we hit a byte that needs escaping, we flush the accumulated chunk via binary_part(original, skip, len) (O(1) reference), append the escape sequence, and continue scanning.
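A minimal sketch of this two-function loop is shown here. It is written from the description above rather than from the ToonEx source: the module and function names are hypothetical, and the escape set (backslash, double quote, newline, carriage return, tab) is an assumption.

```elixir
defmodule ChunkEscapeSketch do
  # Hypothetical module, not part of ToonEx.
  def escape(data) when is_binary(data), do: scan(data, data, 0, [])

  # Scan mode: `skip` is the byte offset of the current byte in `original`.
  defp scan(<<byte, rest::binary>>, original, skip, acc) do
    case escape_for(byte) do
      # Safe byte: start a chunk of length 1 at `skip`.
      nil -> chunk(rest, original, skip, 1, acc)
      # Byte needs escaping: append the sequence and move past it.
      esc -> scan(rest, original, skip + 1, [acc | esc])
    end
  end

  defp scan(<<>>, _original, _skip, acc), do: acc

  # Chunk mode: `len` consecutive safe bytes start at offset `skip`.
  defp chunk(<<byte, rest::binary>>, original, skip, len, acc) do
    case escape_for(byte) do
      nil ->
        chunk(rest, original, skip, len + 1, acc)

      esc ->
        # Flush the safe chunk as a slice, then append the escape sequence.
        part = binary_part(original, skip, len)
        scan(rest, original, skip + len + 1, [acc, part | esc])
    end
  end

  # End of input while in chunk mode: flush the final chunk.
  defp chunk(<<>>, original, skip, len, acc) do
    [acc | binary_part(original, skip, len)]
  end

  # Assumed escape set; returns nil for bytes that are safe as-is.
  defp escape_for(?\\), do: "\\\\"
  defp escape_for(?"), do: "\\\""
  defp escape_for(?\n), do: "\\n"
  defp escape_for(?\r), do: "\\r"
  defp escape_for(?\t), do: "\\t"
  defp escape_for(_), do: nil
end
```

The accumulator is built as nested, possibly improper lists such as `[acc | esc]`, which is valid iodata and avoids list concatenation on every flush.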

Examples

iex> ToonEx.Encode.Strings.escape_string("hello") |> IO.iodata_to_binary()
"hello"

iex> ToonEx.Encode.Strings.escape_string("line1\nline2") |> IO.iodata_to_binary()
"line1\\nline2"

iex> result = ToonEx.Encode.Strings.escape_string(~s(say "hello"))
iex> IO.iodata_to_binary(result) |> String.contains?(~s(\"))
true

iex> ToonEx.Encode.Strings.escape_string("") |> IO.iodata_to_binary()
""

safe_key?(arg1)

@spec safe_key?(String.t()) :: boolean()

Checks if a string can be used as an unquoted key.

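A key is safe if it matches /^[A-Za-z_][A-Za-z0-9_.]*$/: it starts with a letter or underscore, followed by any mix of letters, digits, underscores, and dots.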

Examples

iex> ToonEx.Encode.Strings.safe_key?("name")
true

iex> ToonEx.Encode.Strings.safe_key?("user_name")
true

iex> ToonEx.Encode.Strings.safe_key?("User123")
true

iex> ToonEx.Encode.Strings.safe_key?("user.name")
true

iex> ToonEx.Encode.Strings.safe_key?("user-name")
false

iex> ToonEx.Encode.Strings.safe_key?("123")
false

safe_unquoted?(string, delimiter)

@spec safe_unquoted?(String.t(), String.t()) :: boolean()

Checks if a string can be used unquoted as a value.

A string is safe unquoted if:

  • It's not empty
  • It doesn't have leading or trailing spaces
  • It's not a literal (true, false, null)
  • It doesn't look like a number
  • It doesn't contain structure characters or delimiters
  • It doesn't contain control characters
  • It doesn't start with a hyphen
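The checklist above could be expressed along these lines. This is only a sketch: the module name is hypothetical, and the set of structure characters, the number pattern, and the control-character range are assumptions, since the documentation does not spell them out.

```elixir
defmodule UnquotedValueSketch do
  # Hypothetical module, not part of ToonEx.

  # Assumed structure characters: colon, double quote, backslash,
  # square brackets, and curly braces.
  @structure_chars [":", "\"", "\\", "[", "]", "{", "}"]

  def safe_unquoted?(string, delimiter)
      when is_binary(string) and is_binary(delimiter) do
    string != "" and
      not String.starts_with?(string, " ") and
      not String.ends_with?(string, " ") and
      string not in ["true", "false", "null"] and
      not number_like?(string) and
      not String.contains?(string, [delimiter | @structure_chars]) and
      not control_char?(string) and
      not String.starts_with?(string, "-")
  end

  # Assumed number shape: optional sign, digits, optional fraction,
  # optional exponent.
  defp number_like?(string) do
    Regex.match?(~r/\A-?\d+(?:\.\d+)?(?:e[+-]?\d+)?\z/i, string)
  end

  # Assumed control range: bytes 0x00 through 0x1F.
  defp control_char?(string), do: Regex.match?(~r/[\x00-\x1F]/, string)
end
```

Every check must pass for a value to stay unquoted, so a single failing rule (for example, the value containing the active delimiter) is enough to force quoting.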

Examples

iex> ToonEx.Encode.Strings.safe_unquoted?("hello", ",")
true

iex> ToonEx.Encode.Strings.safe_unquoted?("", ",")
false

iex> ToonEx.Encode.Strings.safe_unquoted?(" hello", ",")
false

iex> ToonEx.Encode.Strings.safe_unquoted?("true", ",")
false

iex> ToonEx.Encode.Strings.safe_unquoted?("42", ",")
false