A collection of phonetic algorithms. These algorithms help find words or names that sound similar by generating an encoding that can be compared or indexed for fuzzy searching.
Dart Phonetics #
Algorithms Implemented #
- Soundex - A highly configurable implementation of the Soundex algorithm. More precise algorithms are available, but Soundex is the classic choice and is still required when analyzing American surnames in genealogy or census data.
- Refined Soundex - Refined Soundex is a variation that is better suited to applications such as spell checking. It uses a mapping that aims to be more precise and does not truncate to 4 characters by default.
- NYSIIS - An implementation of the New York State Identification and Intelligence System as documented by the USDA SRS system design report. The modified version of the algorithm is best suited for encoding names.
- Double Metaphone - The Metaphone series of algorithms applies "expert rules" based on inconsistencies in the English language in an attempt to achieve greater precision (fewer results that are closer in phonetic sound).
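To make the idea of a phonetic encoding concrete, here is a minimal sketch of the classic American Soundex rules in plain Dart. It is an illustration only, not this package's implementation or API: `soundexSketch` and `_codes` are hypothetical names, and the sketch assumes ASCII input and the common convention that `H` and `W` do not separate letters with the same code.

```dart
// Hypothetical stand-alone sketch of classic American Soundex.
// This is NOT the dart_phonetics API; names here are illustrative only.
const Map<String, String> _codes = {
  'B': '1', 'F': '1', 'P': '1', 'V': '1',
  'C': '2', 'G': '2', 'J': '2', 'K': '2',
  'Q': '2', 'S': '2', 'X': '2', 'Z': '2',
  'D': '3', 'T': '3',
  'L': '4',
  'M': '5', 'N': '5',
  'R': '6',
};

String soundexSketch(String input) {
  // Upper-case and drop anything that is not A-Z (assumes ASCII names).
  final letters = input.toUpperCase().replaceAll(RegExp(r'[^A-Z]'), '');
  if (letters.isEmpty) return '';

  // The first letter is kept as-is; its code still suppresses duplicates.
  final buffer = StringBuffer(letters[0]);
  var lastCode = _codes[letters[0]];
  var length = 1;

  for (var i = 1; i < letters.length && length < 4; i++) {
    final ch = letters[i];
    // H and W are skipped without breaking a run of identical codes.
    if (ch == 'H' || ch == 'W') continue;

    final code = _codes[ch];
    if (code == null) {
      // Vowels (and Y) are not encoded, but they do separate duplicates.
      lastCode = null;
      continue;
    }
    if (code != lastCode) {
      buffer.write(code);
      length++;
    }
    lastCode = code;
  }

  // Pad with zeros to the traditional 4-character length.
  return buffer.toString().padRight(4, '0');
}

void main() {
  // Group names by encoding, as a fuzzy-search index might.
  final index = <String, List<String>>{};
  for (final name in ['Robert', 'Rupert', 'Rubin', 'Ashcraft', 'Ashcroft']) {
    index.putIfAbsent(soundexSketch(name), () => []).add(name);
  }
  print(index);
}
```

Names that sound alike, such as "Robert" and "Rupert", end up under the same key, which is what makes the encoding usable as an index for fuzzy lookups. The other algorithms listed above follow the same encode-then-compare pattern with different rule sets.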
Work In Progress #
This project is a work in progress that is being developed because I need these algorithms for another project. I'll spend time implementing more phonetic algorithms depending on demand, need, or community interest.
Sponsor Me #
Please consider sponsoring me if you use this library, need help, want to discuss specific algorithms, or need a special encoding algorithm implemented.
Other Implementations #
The Wikipedia Phonetic Algorithm page provides good basic background. Several libraries written in other languages may also be useful references for anyone exploring phonetic encoding algorithms. Their test cases are especially valuable because they capture edge cases that are useful to analyze and compare.
- Apache Commons Codec is a Java library that includes many phonetic algorithms.
- Abydos is a Python library that includes many phonetic algorithms.
- stringmetric is a Scala library that includes many phonetic algorithms.