lastlegume

The site of all of the projects made by lastlegume.

31 October 2023

A Simple, Inefficient Algorithm for Fuzzy Text Matching

by lastlegume


Note: As of January 2026, fuzzy string matching on this site is done by calculating edit distance (more specifically Levenshtein distance). The method presented here is much worse and was simply my first thought when approaching the problem several years ago, so it has many flaws.

Exactly as the title states: a simple, javascript based code segment for matching two strings while allowing for some inaccuracy. Because of the method of matching, this segment is O(N^3), so it should not be used if time is an issue. The code is shown below and an interactive example are below:

Code (Javascript)

function fuzzy(guess, answer) {
    // normalizes the format of the strings (optional)
    guess = guess.toLowerCase().trim();
    answer = answer.toLowerCase().trim();
    // checks if the strings are equals beforehand to skip the iteration if unnecessary
    if (guess === answer)
        return true;
    // variable that holds the number of correct characters
    let score = 0;
    // number of characters back or forwards a substring is allowed to be before being counted as nonexistent.
    var leniencyFactor = 0.5;
    var leniency = Math.ceil(answer.length ** leniencyFactor);
    // factor that controls the amount that can be wrong
    var fuzzyFactor = -0.45;
    var fuzziness = 1 - answer.length ** fuzzyFactor;
    // maximum possible score (subtracts the length of answer because one character substrings are not being checked)
    var neededFuzzyAmount = ((answer.length) * (answer.length + 1) * (answer.length + 2)) / 6 - answer.length;
    //checks substrings of lengths started at the length of the guess to a length of 2 
    for (let i = guess.length; i >= 2; i--) {
        //moves through the guess string to find all substrings of length i
        for (let s = 0; s <= (guess.length - i); s++) {
            //checks the substrings of guess with equal length substrings in answer from the same index - leniency to the same index + leniency
            for (let a = Math.max(0, s - leniency); a <= Math.min(s + leniency, answer.length - i); a++) {
                // if the substrings are equal, adds the number of correct characters (the length of the substring) to the score
                if (guess.substring(s, s + i) === answer.substring(a, a + i)) {
                    score += i;
                }
            }
            // if a high enough score is reached, it skips the rest of the program
            if (score > neededFuzzyAmount ** fuzziness)
                return true;
        }
    }
    return score > neededFuzzyAmount ** fuzziness;
}

Try It Yourself

Guess: Answer:


Use default exponential models (uncheck if you want to manually set leniency and fuzziness)

Fuzzy Factor:

Factor controlling the amount that can be wrong (almost anything matches when at 0, gets stricter as it decreases)

Value:

Leniency Factor:

Factor that affects the leniency (most lenient at 1, least lenient at 0)

Value:


tags: cs - javascript