jaroWinklerSimilarityOf function
Find the Jaro-Winkler Similarity index between two strings.
Parameters
- source and target are two strings.
- threshold is the minimum Jaro similarity above which Winkler's increment is applied.
- maxPrefixSize is the maximum prefix length to consider. If absent, the whole matching prefix is considered.
- prefixScale is the prefix scale, a constant scaling factor for how much the score is adjusted upwards for having common prefixes. The length of the considered common prefix is at most 4. If absent, the default prefix scale is used.
- If ignoreCase is true, character case is ignored.
- If ignoreWhitespace is true, whitespace characters such as spaces, tabs, and newlines are ignored.
- If ignoreNumbers is true, numbers are ignored.
- If alphaNumericOnly is true, only letters and digits are matched.
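The effect of the four boolean flags can be illustrated with a small stand-in for the cleanup step. This is only a sketch: normalize is a hypothetical helper written for this example, not the library's cleanupString, and it assumes that "letters and digits" means ASCII letters and digits.

```dart
// Hypothetical stand-in for the cleanup step; NOT the library's cleanupString.
String normalize(
  String text, {
  bool ignoreCase = false,
  bool ignoreWhitespace = false,
  bool ignoreNumbers = false,
  bool alphaNumericOnly = false,
}) {
  // Fold case so that 'A' and 'a' compare equal.
  if (ignoreCase) text = text.toLowerCase();
  // Drop spaces, tabs, newlines and other whitespace.
  if (ignoreWhitespace) text = text.replaceAll(RegExp(r'\s+'), '');
  // Drop digits.
  if (ignoreNumbers) text = text.replaceAll(RegExp(r'\d+'), '');
  // Assumption: only ASCII letters and digits count as alphanumeric here.
  if (alphaNumericOnly) text = text.replaceAll(RegExp(r'[^a-zA-Z0-9]+'), '');
  return text;
}

void main() {
  // Case and whitespace are removed before comparison.
  print(normalize('Jaro Winkler 42', ignoreCase: true, ignoreWhitespace: true));
  // prints "jarowinkler42"

  // Digits go first, then every remaining non-alphanumeric character.
  print(normalize('Jaro-Winkler 42', ignoreNumbers: true, alphaNumericOnly: true));
  // prints "JaroWinkler"
}
```

Both strings are normalized with the same flags before they are compared, so any characters removed here have no influence on the similarity score.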
Details
The Jaro similarity index between two strings is a weighted sum of the fraction of matched characters in each string and the fraction of those matches that are not transposed. Winkler's modification increases this measure for strings that share initial characters.
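For reference, the commonly cited textbook form of the two measures is given below. It is the standard definition, not a statement of how this library computes the value. The symbols are chosen for this note: n and m are the lengths of source and target, c is the number of matched characters, t is the number of transpositions (half the number of matched characters that appear in a different order), ℓ is the length of the common prefix that is considered, and p is the prefix scale.

```latex
\mathrm{sim}_{J} =
\begin{cases}
  0 & \text{if } c = 0, \\[4pt]
  \dfrac{1}{3}\left( \dfrac{c}{n} + \dfrac{c}{m} + \dfrac{c - t}{c} \right) & \text{otherwise,}
\end{cases}
\qquad
\mathrm{sim}_{JW} = \mathrm{sim}_{J} + \ell \, p \, \bigl( 1 - \mathrm{sim}_{J} \bigr)
```

In terms of the parameters above, the second formula is applied only when the Jaro value exceeds threshold. Here ℓ is limited by maxPrefixSize and p corresponds to prefixScale.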
See Also: jaroWinklerSimilarity
If n is the length of source and m is the length of target,
Complexity: Time O(nm) | Space O(n+m)
Implementation
double jaroWinklerSimilarityOf(
  String source,
  String target, {
  int? maxPrefixSize,
  double? prefixScale,
  double threshold = 0.7,
  bool ignoreCase = false,
  bool ignoreWhitespace = false,
  bool ignoreNumbers = false,
  bool alphaNumericOnly = false,
}) {
  source = cleanupString(
    source,
    ignoreCase: ignoreCase,
    ignoreWhitespace: ignoreWhitespace,
    ignoreNumbers: ignoreNumbers,
    alphaNumericOnly: alphaNumericOnly,
  );
  target = cleanupString(
    target,
    ignoreCase: ignoreCase,
    ignoreWhitespace: ignoreWhitespace,
    ignoreNumbers: ignoreNumbers,
    alphaNumericOnly: alphaNumericOnly,
  );
  return jaroWinklerSimilarity(
    source.codeUnits,
    target.codeUnits,
    threshold: threshold,
    prefixScale: prefixScale,
    maxPrefixSize: maxPrefixSize,
  );
}
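The function above delegates the scoring to jaroWinklerSimilarity, whose body is not shown here. To make the computation concrete, the following is a self-contained sketch of the textbook algorithm on code units. It should not be read as the library's implementation: the names jaro and jaroWinkler are hypothetical, and the defaults of 0.1 for prefixScale and 4 for maxPrefixSize are the conventional values, assumed for this example only.

```dart
import 'dart:math' as math;

/// Textbook Jaro similarity of two code-unit lists, in the range [0, 1].
double jaro(List<int> s, List<int> t) {
  if (s.isEmpty && t.isEmpty) return 1.0;
  if (s.isEmpty || t.isEmpty) return 0.0;
  // Equal characters match only if their positions differ by at most `window`.
  final window = math.max(0, math.max(s.length, t.length) ~/ 2 - 1);
  final sMatched = List<bool>.filled(s.length, false);
  final tMatched = List<bool>.filled(t.length, false);
  var matches = 0;
  for (var i = 0; i < s.length; i++) {
    final lo = math.max(0, i - window);
    final hi = math.min(t.length - 1, i + window);
    for (var j = lo; j <= hi; j++) {
      if (!tMatched[j] && s[i] == t[j]) {
        sMatched[i] = tMatched[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches == 0) return 0.0;
  // Walk the matched characters of both strings in order; every pair that
  // disagrees is out of order, and two such pairs make one transposition.
  var outOfOrder = 0;
  var j = 0;
  for (var i = 0; i < s.length; i++) {
    if (!sMatched[i]) continue;
    while (!tMatched[j]) j++;
    if (s[i] != t[j]) outOfOrder++;
    j++;
  }
  final transpositions = outOfOrder ~/ 2;
  return (matches / s.length +
          matches / t.length +
          (matches - transpositions) / matches) /
      3;
}

/// Jaro similarity with Winkler's prefix bonus.
/// Assumed defaults: prefixScale 0.1 and maxPrefixSize 4 (conventional values).
double jaroWinkler(
  List<int> s,
  List<int> t, {
  double threshold = 0.7,
  double prefixScale = 0.1,
  int maxPrefixSize = 4,
}) {
  final score = jaro(s, t);
  // The bonus applies only when the Jaro score is above the threshold.
  if (score <= threshold) return score;
  final limit = math.min(maxPrefixSize, math.min(s.length, t.length));
  var prefix = 0;
  while (prefix < limit && s[prefix] == t[prefix]) prefix++;
  return score + prefix * prefixScale * (1 - score);
}

void main() {
  final score = jaroWinkler('MARTHA'.codeUnits, 'MARHTA'.codeUnits);
  print(score.toStringAsFixed(4)); // prints "0.9611"
}
```

The nested loops over the match window give the O(nm) time bound, and the two boolean lists account for the O(n+m) space. Note that with a large prefixScale or maxPrefixSize the product of prefix length and scale can exceed 1, in which case this sketch would return a value above 1; the conventional values keep the result within [0, 1].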