Soundex class
Encodes a string to a Soundex value. Soundex is a classic encoding scheme used to compare names that sound similar. It can also be used to find words that sound similar.
While the algorithm is fairly simple, there are several variants, so make sure you know which variant you need if working with existing data. The most notable exceptions are in census data and SQL implementations.
This implementation uses a mapping strategy with several configurable behaviors that can be enabled or disabled to support many variants. It is also possible to use a custom mapping for other languages or character sets.
For convenience, there are several static instances available for some of the more common implementations:
- americanEncoder - Implements the standard American Soundex algorithm as described by The National Archives and Records Administration (NARA) (https://www.archives.gov/research/census/soundex.html).
- specialEncoder - This is the same as americanEncoder, but H and W are not ignored as they should have been. The census data for 1880 through 1910 included standard codes as well as these special codes, randomly intermixed.
- genealogyEncoder - Implements the rules from the genealogy.com (https://www.genealogy.com/articles/research/00000060.html) website. This is the same as americanEncoder, but ignored characters are not tracked for consonant breaks and are completely ignored instead.
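The difference between these three variants comes down to what happens when two consonants with the same code are separated by an uncoded letter. The following is a minimal, self-contained sketch of that rule, not this class's implementation: `soundexSketch`, the `codes` table, and the fixed length of 4 are hypothetical names and simplifications, and prefix and hyphenated-part handling are left out. The way the two flags combine here (disabling `trackIgnored` means nothing acts as a separator) is also an assumption.

```dart
// Hypothetical sketch; names and structure are not taken from the package.
const Map<String, String> codes = {
  'B': '1', 'F': '1', 'P': '1', 'V': '1',
  'C': '2', 'G': '2', 'J': '2', 'K': '2',
  'Q': '2', 'S': '2', 'X': '2', 'Z': '2',
  'D': '3', 'T': '3',
  'L': '4',
  'M': '5', 'N': '5',
  'R': '6',
};

String soundexSketch(String input,
    {bool ignoreHW = true, bool trackIgnored = true}) {
  const length = 4; // fixed here for simplicity
  // Keep only A-Z after upper-casing.
  final letters = input.toUpperCase().replaceAll(RegExp(r'[^A-Z]'), '');
  if (letters.isEmpty) return '';

  final out = StringBuffer(letters[0]);
  // Code of the most recent letter that still blocks a repeat ('' = none).
  var last = codes[letters[0]] ?? '';

  for (var i = 1; i < letters.length && out.length < length; i++) {
    final ch = letters[i];
    final code = codes[ch];
    if (code == null) {
      // Uncoded letter: a vowel, Y, H, or W.
      final isHW = ch == 'H' || ch == 'W';
      // A separator clears `last`, so the same code may be emitted again.
      if (trackIgnored && !(isHW && ignoreHW)) last = '';
      continue;
    }
    if (code != last) out.write(code);
    last = code;
  }
  return out.toString().padRight(length, '0');
}

void main() {
  print(soundexSketch('Ashcraft')); // A261 (H does not separate S and C)
  print(soundexSketch('Ashcraft', ignoreHW: false)); // A226
  print(soundexSketch('Tymczak')); // T522 (A separates Z and K)
  print(soundexSketch('Tymczak', trackIgnored: false)); // T520
}
```

In the sketch, the standard behavior corresponds to both flags left at `true`, the special census behavior to `ignoreHW: false`, and the genealogy-style behavior to `trackIgnored: false`.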
If you want (or need) to understand more details, here are some good references that help explain the history and variants:
- Implemented types
Constructors
- Soundex()
-
Gets the americanEncoder instance of the Soundex encoder by default.
factory
- Soundex.fromMapping(Map<int, int> soundexMapping, {bool prefixesEnabled = true, bool hyphenatedPartsEnabled = true, bool ignoreHW = true, bool trackIgnored = true, int maxLength = defaultMaxLength, int paddingChar = $0, bool paddingEnabled = true})
-
Creates a custom Soundex instance. This constructor can be used to
provide custom mappings for non-Western character sets, etc.
factory
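As a rough usage sketch of the two constructors above: the import path is an assumption about how the package exposes this class, and the example depends on that package rather than being standalone. Only the constructor signature, `americanEncoder`, `americanMapping`, and `encode` as listed on this page are used, and the result is printed through `toString()` to avoid assuming any member names on `PhoneticEncoding`.

```dart
// Assumption: the Soundex class is exported from this library path.
import 'package:dart_phonetics/dart_phonetics.dart';

void main() {
  // Default factory: documented above as returning the americanEncoder instance.
  final standard = Soundex();

  // Custom instance: same mapping, but H and W are not ignored and
  // hyphenated parts are not encoded as alternates.
  final custom = Soundex.fromMapping(
    Soundex.americanMapping,
    ignoreHW: false,
    hyphenatedPartsEnabled: false,
  );

  for (final encoder in [standard, custom]) {
    // encode returns PhoneticEncoding?, so a null result must be handled.
    final result = encoder.encode('Ashcraft');
    print(result ?? 'no encoding');
  }
}
```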
Properties
- hashCode → int
-
The hash code for this object.
no setter, inherited
- hyphenatedPartsEnabled → bool
-
Indicates that hyphenated parts processing is enabled. When enabled,
any parts that are found are also encoded and returned as alternates.
final
- ignoreHW → bool
-
Indicates if $H and $W should be completely ignored and not mapped at all. This is a special case for some census data.
final
- maxLength → int
-
Maximum length of the encoding, where 0 indicates no maximum. When paddingEnabled is true, this is also the length the output is padded to.
final
- paddingChar → int
-
The character to use for padding (when paddingEnabled is true).
final
- paddingEnabled → bool
-
Indicates if the string will be padded. If true, the encoded output will be padded with paddingChar to the length of maxLength.
final
- prefixesEnabled → bool
-
Indicates that prefix processing is enabled (and will be returned as PhoneticEncoding._alternate when available). This also detects the second part of a double-barreled name.
final
- runtimeType → Type
-
A representation of the runtime type of the object.
no setter, inherited
- soundexMapping → Map<int, int>
-
The character mapping to use when encoding. A value of $nul means ignore the input character and do not encode it (e.g., vowels).
final
- trackIgnored → bool
-
Indicates if ignored characters are tracked or completely ignored.
When enabled, an ignored character still acts as a separator when it
occurs between consonants; otherwise it is completely ignored.
final
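The mapping and padding properties above can be illustrated with a small standalone sketch. It is not the package's code: `nul`, `cu`, `sketchMapping`, `rawCodes`, and `fit` are hypothetical names, the mapping entries (including the extra `Ñ`) are illustrative only, and truncating to `maxLength` is an assumption about what "maximum length" implies.

```dart
// Stand-in for the $nul constant mentioned above (hypothetical name).
const int nul = 0;

// First UTF-16 code unit of a one-character string.
int cu(String s) => s.codeUnitAt(0);

// Illustrative mapping only: one ignored vowel and four coded consonants.
final Map<int, int> sketchMapping = {
  cu('A'): nul,
  cu('M'): cu('5'),
  cu('N'): cu('5'),
  cu('Ñ'): cu('5'),
  cu('R'): cu('6'),
};

// Returns the raw code for each input character, dropping characters that
// are unmapped or mapped to nul. No de-duplication is done here.
String rawCodes(String input, Map<int, int> mapping) {
  final out = StringBuffer();
  for (final unit in input.codeUnits) {
    final code = mapping[unit];
    if (code != null && code != nul) out.writeCharCode(code);
  }
  return out.toString();
}

// Truncates to maxLength and optionally pads; 0 means "no maximum".
String fit(
    String encoded, int maxLength, int paddingChar, bool paddingEnabled) {
  if (maxLength == 0) return encoded;
  if (encoded.length > maxLength) return encoded.substring(0, maxLength);
  return paddingEnabled
      ? encoded.padRight(maxLength, String.fromCharCode(paddingChar))
      : encoded;
}

void main() {
  print(rawCodes('MAÑR', sketchMapping)); // 556
  print(fit('R1', 4, cu('0'), true)); // R100
  print(fit('R1', 4, cu('0'), false)); // R1
  print(fit('R16345', 0, cu('0'), true)); // R16345
}
```

Keying the map by code unit rather than by string is what lets a caller add entries for characters outside the 26 English letters without changing the encoder itself.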
Methods
- encode(String input) → PhoneticEncoding?
-
Encodes a string using the Soundex algorithm as configured.
override
- noSuchMethod(Invocation invocation) → dynamic
-
Invoked when a nonexistent method or property is accessed.
inherited
- toString() → String
-
A string representation of this object.
inherited
Operators
- operator ==(Object other) → bool
-
The equality operator.
inherited
Static Properties
- americanEncoder → Soundex
-
An instance of Soundex using the americanMapping mapping.
final
- genealogyEncoder → Soundex
-
An instance of Soundex using the americanMapping mapping, but
configured so that ignored characters are not tracked for consonant
breaks and are completely ignored instead.
final
- specialEncoder → Soundex
-
An instance of Soundex using the americanMapping mapping, but
configured for the special case, and does not ignore H and W.
final
Constants
- americanMapping → const Map<int, int>
- This is a default mapping of the 26 letters used in US English.
- defaultMaxLength → const int
- Default encoding length to use.