What algorithm gives suggestions in a spell checker?

There is good essay by Peter Norvig how to implement a spelling corrector. It’s basicly a brute force approach trying candidate strings with a given edit distance. (Here are some tips how you can improve the spelling corrector performance using a Bloom Filter and faster candidate hashing.)

The requirements for a spell checker are weaker. You have only to find out that a word is not in the dictionary. You can use a Bloom Filter to build a spell checker which consumes less memory. An ancient versions is decribed in Programming Pearls by Jon Bentley using 64kb for an english dictionary.

A BK-Tree is an alternative approach. A nice article is here.

Levenshstein distance is not exactly the right edit distance for a spell checker. It knows only insertion, deletion and substitution. Transposition is missing and produces 2 for a transposition of 1 character (it’s 1 delete and 1 insertion). Damerau–Levenshtein distance is the right edit distance.

Leave a Comment