
Better Typos Generator Based on Common Misspellings

Last updated on December 3, 2012 in Development


Previously, I wrote about generating typos based on misplacing your finger on a keyboard, that is, proximity-based misspelling. Another way to think about typos is to consider the words that people commonly misspell. If you could query a list of misspelled words, it would be easy to search a text for correctly spelled words and replace them with typos. Usually the use case is the reverse, correcting someone’s typos, but in some cases I think you could get some extra SEO juice from generating a couple of typos for your main keywords. So the idea is that I pass in a phrase, perhaps the title of a blog entry, and a function returns an array of phrases with common misspellings of different words in the phrase. But first, let’s start with a list of common typos.

If you browse around the internet, there are some lists and even websites that offer common misspellings. The problem is that they are often short and they do not have the correct version of the word properly associated with the misspelled word.

I did find a PDF of the 12,000 most frequently used and misspelled words, but I don’t like the format, and it does not give the correct spelling of each word. Ideally, we want JSON or an array in a text file. So I created one and uploaded both versions to GitHub. Check out this text file that contains 11,795 common typos and the correct version of each word. Finally, here is the more advanced Online Typo Generator with source code on GitHub.
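To make the idea concrete, here is a minimal Python sketch of the function described at the start: it takes a phrase and returns a list of phrases, each with one word swapped for a common misspelling. This is not the code behind the Online Typo Generator. The name `typo_phrases`, the tiny `COMMON_TYPOS` sample, and the "correct word mapped to a list of typos" layout are all assumptions made for illustration, and the real data files may be structured differently.

```python
import re

# Hypothetical sample data (an assumption, not the real 11,795-entry list):
# each correctly spelled word maps to its common misspellings.
COMMON_TYPOS = {
    "separate": ["seperate"],
    "definitely": ["definately", "definitly"],
    "calendar": ["calender"],
}


def typo_phrases(phrase, typos=COMMON_TYPOS):
    """Return variants of `phrase`, each with exactly one word misspelled."""
    # Split into word and non-word tokens so that spacing and punctuation
    # survive when the phrase is joined back together.
    tokens = re.findall(r"[A-Za-z']+|[^A-Za-z']+", phrase)
    variants = []
    for i, token in enumerate(tokens):
        # Lookup is case-insensitive; non-word tokens simply find nothing.
        for typo in typos.get(token.lower(), []):
            # Keep a leading capital if the original word had one.
            replacement = typo.capitalize() if token[0].isupper() else typo
            variants.append("".join(tokens[:i] + [replacement] + tokens[i + 1:]))
    return variants


if __name__ == "__main__":
    for variant in typo_phrases("Definitely a separate calendar"):
        print(variant)
```

Replacing one word per variant keeps each generated phrase close to the original, which seems like the safer choice for keyword use; combining several typos in one phrase would be a straightforward extension.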

Related links:

typo.js – A client-side JavaScript spellchecker that uses Hunspell-style dictionaries.

