jaroWinklerSimilarityOf function

double jaroWinklerSimilarityOf(
  String source,
  String target, {
  int? maxPrefixSize,
  double? prefixScale,
  double threshold = 0.7,
  bool ignoreCase = false,
  bool ignoreWhitespace = false,
  bool ignoreNumbers = false,
  bool alphaNumericOnly = false,
})

Finds the Jaro-Winkler similarity index between two strings.

Parameters

  • source and target are the two strings to compare.
  • threshold is the minimum Jaro similarity above which Winkler's prefix increment is applied.
  • maxPrefixSize is the maximum prefix length to consider. If absent, the whole matching prefix is considered.
  • prefixScale is a constant scaling factor that controls how much the score is adjusted upwards for a common prefix. The length of the considered common prefix is at most 4. If absent, the default prefix scale is used.
  • If ignoreCase is true, character case is ignored.
  • If ignoreWhitespace is true, whitespace characters (spaces, tabs, newlines, etc.) are ignored.
  • If ignoreNumbers is true, numbers are ignored.
  • If alphaNumericOnly is true, only letters and digits are matched.
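
To illustrate how threshold, prefixScale and maxPrefixSize interact, here is a minimal sketch of Winkler's adjustment applied to an already-computed Jaro score. The helper winklerAdjust is hypothetical and not part of this API, and the defaults of 0.1 for the prefix scale and 4 for the prefix cap are assumptions taken from Winkler's standard formulation, not values confirmed by this library.

```dart
import 'dart:math' as math;

/// Hypothetical helper (not part of the documented API): applies Winkler's
/// prefix bonus to an already-computed Jaro similarity.
double winklerAdjust(
  double jaro,
  int commonPrefixLength, {
  double threshold = 0.7,
  double prefixScale = 0.1, // assumed default (Winkler's usual constant)
  int maxPrefixSize = 4, // assumed cap (Winkler's usual prefix limit)
}) {
  // At or below the threshold the Jaro score is returned unchanged.
  if (jaro <= threshold) return jaro;
  // Only the first maxPrefixSize characters of the common prefix count.
  final l = math.min(commonPrefixLength, maxPrefixSize);
  // Move the score towards 1.0 in proportion to the shared prefix length.
  return jaro + l * prefixScale * (1.0 - jaro);
}

void main() {
  // "MARTHA" vs "MARHTA": Jaro similarity 17/18, common prefix "MAR" (3).
  print(winklerAdjust(17 / 18, 3).toStringAsFixed(4)); // 0.9611
  // Below the threshold no bonus is applied.
  print(winklerAdjust(0.6, 3)); // 0.6
}
```

With these assumed defaults the bonus can add at most 0.4 of the remaining gap to 1.0, so the adjusted score never exceeds 1.0.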

Details

The Jaro similarity index between two strings is a weighted sum of three ratios: the proportion of matched characters in each of the two strings, and the proportion of matched characters that are not transposed. Winkler's modification raises this score when the strings share initial characters.
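
The Jaro part of this description can be sketched as follows. This is an illustrative textbook version with equal weights of 1/3, not the library's actual jaroWinklerSimilarity code. The function name jaroSimilarity is hypothetical, and returning 1.0 for two empty strings is an assumed convention.

```dart
import 'dart:math' as math;

/// Illustrative Jaro similarity over UTF-16 code units (hypothetical helper).
double jaroSimilarity(String source, String target) {
  final a = source.codeUnits;
  final b = target.codeUnits;
  if (a.isEmpty && b.isEmpty) return 1.0; // assumed convention
  if (a.isEmpty || b.isEmpty) return 0.0;

  // Equal characters match only if their positions differ by at most `window`.
  final window = math.max(0, math.max(a.length, b.length) ~/ 2 - 1);
  final aMatched = List<bool>.filled(a.length, false);
  final bMatched = List<bool>.filled(b.length, false);

  var matches = 0;
  for (var i = 0; i < a.length; i++) {
    final lo = math.max(0, i - window);
    final hi = math.min(b.length - 1, i + window);
    for (var j = lo; j <= hi; j++) {
      if (!bMatched[j] && a[i] == b[j]) {
        aMatched[i] = true;
        bMatched[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches == 0) return 0.0;

  // Walk the matched characters of both strings in order; every pair that
  // differs is out of order, and two such pairs make one transposition.
  var outOfOrder = 0;
  var k = 0;
  for (var i = 0; i < a.length; i++) {
    if (!aMatched[i]) continue;
    while (!bMatched[k]) {
      k++;
    }
    if (a[i] != b[k]) outOfOrder++;
    k++;
  }
  final t = outOfOrder / 2;

  // Average of the three ratios described above.
  return (matches / a.length + matches / b.length + (matches - t) / matches) / 3;
}

void main() {
  // Six matches, one transposition (T/H): (1 + 1 + 5/6) / 3 = 17/18.
  print(jaroSimilarity('MARTHA', 'MARHTA').toStringAsFixed(4)); // 0.9444
}
```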

See Also: jaroWinklerSimilarity


Complexity: Time O(nm) | Space O(n+m), where n is the length of source and m is the length of target.

Implementation

double jaroWinklerSimilarityOf(
  String source,
  String target, {
  int? maxPrefixSize,
  double? prefixScale,
  double threshold = 0.7,
  bool ignoreCase = false,
  bool ignoreWhitespace = false,
  bool ignoreNumbers = false,
  bool alphaNumericOnly = false,
}) {
  // Normalize both inputs with the same options before comparing them.
  source = cleanupString(
    source,
    ignoreCase: ignoreCase,
    ignoreWhitespace: ignoreWhitespace,
    ignoreNumbers: ignoreNumbers,
    alphaNumericOnly: alphaNumericOnly,
  );
  target = cleanupString(
    target,
    ignoreCase: ignoreCase,
    ignoreWhitespace: ignoreWhitespace,
    ignoreNumbers: ignoreNumbers,
    alphaNumericOnly: alphaNumericOnly,
  );
  // Compare the cleaned strings as sequences of UTF-16 code units.
  return jaroWinklerSimilarity(
    source.codeUnits,
    target.codeUnits,
    threshold: threshold,
    prefixScale: prefixScale,
    maxPrefixSize: maxPrefixSize,
  );
}