Crate rust_fuzzy_search

This crate implements fuzzy searching with trigrams.

Fuzzy searching compares strings by similarity rather than by equality:
similar strings get a high score (close to 1.0f32), while dissimilar strings get a lower score (closer to 0.0f32).

Fuzzy searching tolerates changes in word order:
e.g. "John Dep" and "Dep John" will get a high score.

The crate exposes 5 main functions: fuzzy_compare, fuzzy_search, fuzzy_search_best_n, fuzzy_search_sorted and fuzzy_search_threshold.

The algorithm used is taken from: https://dev.to/kaleman15/fuzzy-searching-with-postgresql-97o

Basic idea:

  1. From both strings, extract all groups of 3 adjacent letters.
    ("House" becomes ['  H', ' Ho', 'Hou', 'ous', 'use', 'se ']).
    Note the 2 spaces added to the head of the string and the one added to the tail; they make the algorithm work on zero-length words.

  2. Then count the trigrams of the first string that are also present in the second string, and divide by the number of trigrams of the first string.
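
The two steps above can be sketched in a few lines of standard Rust. This is only an illustration of the idea, not the crate's actual source: the names trigrams and trigram_score are hypothetical, and the crate's own fuzzy_compare may differ in details such as rounding or edge-case handling.

```rust
/// Hypothetical helper (not part of the crate's API): pads `s` with two
/// leading spaces and one trailing space, then returns every group of
/// three adjacent characters.
fn trigrams(s: &str) -> Vec<[char; 3]> {
    let padded: Vec<char> = "  "
        .chars()
        .chain(s.chars())
        .chain(" ".chars())
        .collect();
    padded.windows(3).map(|w| [w[0], w[1], w[2]]).collect()
}

/// Hypothetical scoring function: the fraction of trigrams of `a`
/// that also occur in `b`.
fn trigram_score(a: &str, b: &str) -> f32 {
    let ta = trigrams(a);
    let tb = trigrams(b);
    // Thanks to the padding, `ta` always holds at least one trigram,
    // so the division below never divides by zero.
    let shared = ta.iter().filter(|t| tb.contains(*t)).count();
    shared as f32 / ta.len() as f32
}
```

Under these assumptions, trigrams("House") yields the six trigrams listed in step 1, and an empty string still yields a single all-space trigram, which is why the padding keeps zero-length words from breaking the division.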

Example: Comparing 2 strings

fn test() {
    use rust_fuzzy_search::fuzzy_compare;
    // The two strings are identical, so the score should be close to 1.0.
    let score: f32 = fuzzy_compare("kolbasobulko", "kolbasobulko");
    println!("score = {:?}", score);
}

Example: Comparing a string with a list of strings and retrieving only the best matches

fn test() {
    use rust_fuzzy_search::fuzzy_search_best_n;
    let s = "bulko";
    let list : Vec<&str> = vec![
        "kolbasobulko",
        "sandviĉo",
        "ŝatas",
        "domo",
        "emuo",
        "fabo",
        "fazano"
    ];
    let n: usize = 3;
    let res: Vec<(&str, f32)> = fuzzy_search_best_n(s, &list, n);
    for (_word, score) in res {
        println!("{:?}", score);
    }
}

Functions

fuzzy_compare

Use this function to compare 2 strings.

fuzzy_search

Use this function to compare a string (&str) with all elements of a list.

fuzzy_search_best_n

This function is similar to fuzzy_search_sorted but keeps only the n best items, i.e. those with the highest scores.

fuzzy_search_sorted

This function is similar to fuzzy_search but sorts the result in descending order (the best matches are placed at the beginning).

fuzzy_search_threshold

This function is similar to fuzzy_search but filters out elements with a score lower than the specified threshold.
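
The sorted, best-n and threshold variants can all be seen as post-processing steps applied to a list of (item, score) pairs. The sketch below shows that relationship using only the standard library. The helper names sort_desc, best_n and above_threshold are hypothetical and are not the crate's functions; the crate's own implementations may be written differently.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: sorts (item, score) pairs in descending order of
/// score, so the best matches come first.
fn sort_desc<'a>(mut scored: Vec<(&'a str, f32)>) -> Vec<(&'a str, f32)> {
    // f32 is only partially ordered (because of NaN), so fall back to
    // treating incomparable scores as equal.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(Ordering::Equal));
    scored
}

/// Hypothetical helper: keeps only the `n` highest-scoring pairs.
fn best_n<'a>(scored: Vec<(&'a str, f32)>, n: usize) -> Vec<(&'a str, f32)> {
    sort_desc(scored).into_iter().take(n).collect()
}

/// Hypothetical helper: drops pairs whose score is lower than `threshold`,
/// preserving the original order of the remaining pairs.
fn above_threshold<'a>(scored: Vec<(&'a str, f32)>, threshold: f32) -> Vec<(&'a str, f32)> {
    scored
        .into_iter()
        .filter(|(_, score)| *score >= threshold)
        .collect()
}
```

For instance, given pairs with illustrative (made-up) scores such as ("domo", 0.2), ("kolbasobulko", 0.9) and ("fabo", 0.5), best_n with n = 2 would keep "kolbasobulko" and "fabo" in that order, while above_threshold with a threshold of 0.5 would keep the same two items in their original list order.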