Fuzler
A tiny, Rust‑powered string‑similarity helper for Elixir.
`Fuzler` gives you one public function:
Fuzler.similarity_score(query :: String.t(), target :: String.t()) :: float
It returns a normalised score in the range [0.0, 1.0] that tells you how closely two pieces of text match. The score is robust to typos, word-order swaps, case and basic punctuation.
Behind the scenes it calls a compiled Rust NIF that mixes:
- Hamming distance – for very short, nearly equal‑length strings.
- SIMD Levenshtein – fast edit distance from the `triple_accel` crate.
- Token-bag Jaccard – ignores word order.
- Partial‑ratio window – finds the best‑matching snippet when the target is much longer than the query.
The result is symmetric (`score(a, b) ≈ score(b, a)`), length-normalised and remains meaningful from single words to multi-sentence paragraphs.
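To make the order-agnostic ingredient concrete, here is a minimal pure-Elixir sketch of a token-set Jaccard score. It is not Fuzler's Rust code: the module name `TokenBagSketch`, the normalisation rules, and the use of a set rather than a true multiset are all assumptions made for illustration.

```elixir
defmodule TokenBagSketch do
  @moduledoc "Illustrative only: not the code that runs inside the Fuzler NIF."

  # Lowercase the text, turn every run of non-letter/non-digit characters
  # into a space, split on whitespace and keep the distinct tokens.
  def tokens(text) do
    text
    |> String.downcase()
    |> String.replace(~r/[^\p{L}\p{N}]+/u, " ")
    |> String.split()
    |> MapSet.new()
  end

  # Jaccard index: |A ∩ B| / |A ∪ B|. Two token-less inputs count as identical.
  def jaccard(a, b) do
    ta = tokens(a)
    tb = tokens(b)
    union_size = ta |> MapSet.union(tb) |> MapSet.size()

    if union_size == 0 do
      1.0
    else
      MapSet.size(MapSet.intersection(ta, tb)) / union_size
    end
  end
end
```

With this sketch, `TokenBagSketch.jaccard("bella ciao", "Ciao, bella!")` returns `1.0`, because both inputs reduce to the same two tokens. Swapping the arguments never changes the result, which is the symmetry property described above.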
Installation
Add to your `mix.exs`:
def deps do
  [
    {:fuzler, "~> 0.1.2"}
  ]
end
You need Rust ≥ 1.70 installed; `rustler` will compile the NIF automatically.
Quick examples
iex> Fuzler.similarity_score("ciao", "ciao")
1.0
iex> Fuzler.similarity_score("bella ciao", "ciao bella")
0.70 # same words, different order
iex> long_text = "bella ciao come va oggi spero che tu stia bene ..."
iex> Fuzler.similarity_score("ciao", long_text)
0.75 # query appears once inside a 40‑token paragraph
iex> Fuzler.similarity_score("bonjour", long_text)
0.12 # word not present
When should I use it?
Use case | Why it works well |
---|---|
typo‑tolerant autocomplete / “did‑you‑mean” | Hamming + Levenshtein catch small edits fast |
matching short queries inside long blobs | windowed partial ratio focuses on the best slice |
order‑agnostic key comparison | token‑bag Jaccard treats “ciao bella” = “bella ciao” |
quick relevance scoring in Elixir | pure NIF call, no external service needed |
Fuzler is not a full-text search engine or a semantic synonym matcher; use Tantivy or embeddings for those.
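As an example of the "did-you-mean" use case, the sketch below ranks candidate strings by score and keeps those above a cut-off. `SuggestSketch`, `did_you_mean/4` and the `0.6` threshold are hypothetical names and values chosen for illustration. The scorer is passed in as a function so the snippet also runs without the NIF.

```elixir
defmodule SuggestSketch do
  # `scorer` is any 2-arity function that returns a float in 0.0..1.0.
  # The default threshold of 0.6 is an arbitrary assumption; tune it on your data.
  def did_you_mean(query, candidates, scorer, threshold \\ 0.6) do
    candidates
    |> Enum.map(fn candidate -> {candidate, scorer.(query, candidate)} end)
    |> Enum.filter(fn {_candidate, score} -> score >= threshold end)
    |> Enum.sort_by(fn {_candidate, score} -> score end, :desc)
  end
end
```

In a project that depends on Fuzler you would call it as `SuggestSketch.did_you_mean("caio", ["ciao", "bella", "casa"], &Fuzler.similarity_score/2)`. The result is a list of `{candidate, score}` tuples, best match first.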
API
@doc "Returns a similarity score ∈ [0.0, 1.0]"
@spec similarity_score(String.t(), String.t()) :: float
If the NIF failed to load you’ll get:
:erlang.nif_error(:nif_not_loaded)
so your code can decide to fall back or skip tests.
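One possible way to fall back is to rescue the error and use another metric. The wrapper below is a sketch, not part of Fuzler: `SafeScoreSketch` is a hypothetical module, and it assumes the failure reaches your code as an `ErlangError` whose `original` field is `:nif_not_loaded`, which is how Elixir normally surfaces `:erlang.nif_error/1`. The default fallback, `String.jaro_distance/2` from the standard library, is a different metric (Jaro, case-sensitive), so its scores are not comparable with Fuzler's.

```elixir
defmodule SafeScoreSketch do
  # Calls `scorer.(query, target)`. If that raises the :nif_not_loaded error,
  # returns `fallback.(query, target)` instead; any other error is re-raised.
  def score(query, target, scorer, fallback \\ &String.jaro_distance/2) do
    scorer.(query, target)
  rescue
    error in ErlangError ->
      case error do
        %ErlangError{original: :nif_not_loaded} -> fallback.(query, target)
        _other -> reraise error, __STACKTRACE__
      end
  end
end
```

In an application you would pass `&Fuzler.similarity_score/2` as the `scorer` argument.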
How good is the score?
Query / Target | Score ≈ |
---|---|
identical strings (any case / punctuation) | 1.00 |
same words, swapped order | 0.68 – 0.72 |
one‑word query present once in 45‑token paragraph | ~0.75 |
one‑word query absent from paragraph | ≤ 0.15 |
80‑token paragraph vs same with 1 typo | ≥ 0.90 |
“ciao bella” with +30 random filler tokens appended | ~0.58 |
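The rows for short queries inside long paragraphs rely on the partial-ratio window. The general idea can be sketched in pure Elixir: slide a window the size of the query across the target and keep the best per-window score. This is an illustration under assumptions, not Fuzler's Rust implementation. `WindowSketch` is a hypothetical module, windows are measured in whitespace-separated tokens, and `String.jaro_distance/2` stands in for the edit-distance score.

```elixir
defmodule WindowSketch do
  # Best score of `query` against any run of consecutive target tokens that has
  # the same token count as the query. `scorer` is any 2-arity function that
  # returns a float.
  def best_window(query, target, scorer \\ &String.jaro_distance/2) do
    size = query |> String.split() |> length() |> max(1)

    scores =
      target
      |> String.split()
      |> Enum.chunk_every(size, 1, :discard)
      |> Enum.map(fn window -> scorer.(query, Enum.join(window, " ")) end)

    case scores do
      # Target shorter than the query: compare the two strings directly.
      [] -> scorer.(query, target)
      _ -> Enum.max(scores)
    end
  end
end
```

For instance, `WindowSketch.best_window("ciao", "bella ciao come va oggi")` returns `1.0`, because one window is exactly `"ciao"`. A real scorer would typically blend this windowed score with whole-string measures, so that filler tokens still lower the final result.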
Running the test suite
`mix test` runs a handful of ExUnit cases covering:
- case & punctuation variations
- word‑order permutations
- query present / absent in long paragraph (> 40 tokens)
- very long strings with tiny edits
- monotonic drop as filler tokens grow
All similarity tests auto‑skip if the NIF isn’t loaded (e.g. on CI without Rust).
License
MIT License