Fuzler

A tiny, Rust‑powered string‑similarity helper for Elixir.

Fuzler gives you one public function:

Fuzler.similarity_score(query :: String.t(), target :: String.t()) :: float

It returns a normalised score in the range [0.0, 1.0] that tells you how closely two pieces of text match. The score is robust to typos, word‑order swaps, case differences and basic punctuation.

Behind the scenes it calls a compiled Rust NIF that mixes:

  • Hamming distance – for very short, nearly equal‑length strings.
  • SIMD Levenshtein – fast edit distance from the triple_accel crate.
  • Token‑bag Jaccard – ignores word order.
  • Partial‑ratio window – finds the best‑matching snippet when the target is much longer than the query.
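To make the word‑order point concrete, here is a minimal pure‑Elixir sketch of a token Jaccard score. It is an illustration only, not the code inside the Rust NIF: the module name `JaccardSketch`, the tokenisation rule, and the choice to collapse duplicate tokens into a set are all assumptions.

```elixir
defmodule JaccardSketch do
  @moduledoc "Illustrative token-set Jaccard; NOT Fuzler's Rust implementation."

  # Lowercase, turn every character that is not a letter, digit or
  # whitespace into a space, split on whitespace, and collapse duplicates.
  def tokens(text) do
    text
    |> String.downcase()
    |> String.replace(~r/[^\p{L}\p{N}\s]/u, " ")
    |> String.split()
    |> MapSet.new()
  end

  # |A ∩ B| / |A ∪ B|. Two inputs with no tokens at all count as identical.
  def score(a, b) do
    ta = tokens(a)
    tb = tokens(b)
    union = MapSet.size(MapSet.union(ta, tb))

    if union == 0 do
      1.0
    else
      MapSet.size(MapSet.intersection(ta, tb)) / union
    end
  end
end
```

On its own this component scores "bella ciao" against "Ciao, bella!" as a perfect match, because order, case and punctuation are discarded before comparison. Fuzler's final score blends several components, so its result for swapped words is lower than this component alone would give.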

The result is approximately symmetric (score(a, b) ≈ score(b, a)), length‑normalised, and remains meaningful from single words to multi‑sentence paragraphs.


Installation

Add to your mix.exs:

def deps do
  [
    {:fuzler, "~> 0.1.2"}
  ]
end

You need Rust ≥ 1.70 installed; rustler will compile the NIF automatically.


Quick examples

iex> Fuzler.similarity_score("ciao", "ciao")
1.0

iex> Fuzler.similarity_score("bella ciao", "ciao bella")
0.70       # same words, different order

iex> long_text = "bella ciao come va oggi spero che tu stia bene ..."
iex> Fuzler.similarity_score("ciao", long_text)
0.75       # query appears once inside a 40‑token paragraph

iex> Fuzler.similarity_score("bonjour", long_text)
0.12       # word not present

When should I use it?

| Use case | Why it works well |
|---|---|
| typo‑tolerant autocomplete / “did‑you‑mean” | Hamming + Levenshtein catch small edits fast |
| matching short queries inside long blobs | windowed partial ratio focuses on the best slice |
| order‑agnostic key comparison | token‑bag Jaccard treats “ciao bella” = “bella ciao” |
| quick relevance scoring in Elixir | pure NIF call, no external service needed |

Fuzler is not a full‑text search engine or a semantic synonym matcher; use Tantivy or embeddings for those.
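As one possible shape for the “did‑you‑mean” use case, the sketch below ranks candidates by score and drops weak matches. The module name `DidYouMean`, the `scorer` parameter and the 0.6 cut‑off are hypothetical; with Fuzler installed you would pass `&Fuzler.similarity_score/2` as the scorer. The demo call uses `String.jaro_distance/2` from the standard library as a stand‑in so the snippet runs without the NIF.

```elixir
defmodule DidYouMean do
  # `scorer` is any 2-arity function returning a float in 0.0..1.0
  # (hypothetical parameter; pass &Fuzler.similarity_score/2 in a real app).
  # `min_score` is an arbitrary illustrative threshold, not a Fuzler default.
  def suggest(query, candidates, scorer, min_score \\ 0.6) do
    candidates
    |> Enum.map(fn candidate -> {candidate, scorer.(query, candidate)} end)
    |> Enum.filter(fn {_candidate, score} -> score >= min_score end)
    |> Enum.sort_by(fn {_candidate, score} -> score end, :desc)
  end
end

# Stand-in scorer so the example is runnable without Fuzler compiled.
suggestions =
  DidYouMean.suggest("elixr", ["elixir", "erlang", "rust"], &String.jaro_distance/2)

IO.inspect(suggestions, label: "suggestions")
```

Tune the threshold against your own data; the scores in the tables of this README are approximate and depend on text length.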


API

@doc "Returns a similarity score ∈ [0.0, 1.0]"
@spec similarity_score(String.t(), String.t()) :: float

If the NIF failed to load, calling the function raises the error produced by:

:erlang.nif_error(:nif_not_loaded)

so your code can decide to fall back or skip tests.
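A minimal sketch of such a fallback, assuming the stub raises via `:erlang.nif_error/1` (which Elixir surfaces as an `ErlangError` whose `original` field is `:nif_not_loaded`). The module `SafeScore` and its injected `scorer` argument are hypothetical; the injection only keeps the snippet runnable without Fuzler.

```elixir
defmodule SafeScore do
  # `scorer` stands in for &Fuzler.similarity_score/2 (hypothetical wrapper).
  # Returns {:ok, score} or {:error, :nif_not_loaded}; other errors propagate.
  def score(scorer, query, target) do
    {:ok, scorer.(query, target)}
  rescue
    error in ErlangError ->
      case error do
        %ErlangError{original: :nif_not_loaded} -> {:error, :nif_not_loaded}
        _other -> reraise error, __STACKTRACE__
      end
  end
end

# Simulate an unloaded NIF with the same call the stub would make.
unloaded = fn _query, _target -> :erlang.nif_error(:nif_not_loaded) end
IO.inspect(SafeScore.score(unloaded, "ciao", "ciao"))
```

Callers can then pattern‑match on the tuple and pick a slower pure‑Elixir comparison, or skip the feature entirely.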


How good is the score?

| Query / Target | Score ≈ |
|---|---|
| identical strings (any case / punctuation) | 1.00 |
| same words, swapped order | 0.68 – 0.72 |
| one‑word query present once in 45‑token paragraph | ~0.75 |
| one‑word query absent from paragraph | ≤ 0.15 |
| 80‑token paragraph vs same with 1 typo | ≥ 0.90 |
| “ciao bella” with +30 random filler tokens appended | ~0.58 |

Running the test suite

mix test runs a handful of ExUnit cases covering:

  • case & punctuation variations
  • word‑order permutations
  • query present / absent in long paragraph (> 40 tokens)
  • very long strings with tiny edits
  • monotonic drop as filler tokens grow

All similarity tests auto‑skip if the NIF isn’t loaded (e.g. on CI without Rust).
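If you want the same behaviour in your own project, one way to get it is to probe the NIF once and exclude a tag when the probe fails. This is a sketch under assumptions, not necessarily how Fuzler's own suite does it: the `NifProbe` module, the `:nif` tag and the file layout are hypothetical.

```elixir
defmodule NifProbe do
  # Returns true when the probe call succeeds and false when it raises
  # for any reason (NIF stub still in place, module missing, ...).
  def loaded?(probe) do
    probe.()
    true
  rescue
    _ -> false
  end
end

# In test/test_helper.exs (hypothetical layout):
#
#   nif_loaded? = NifProbe.loaded?(fn -> Fuzler.similarity_score("a", "a") end)
#   ExUnit.start(exclude: if(nif_loaded?, do: [], else: [:nif]))
#
# and mark the similarity tests with `@moduletag :nif` so ExUnit
# excludes them when the probe returns false.
```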


License

MIT License