characterSimilarity method
Returns the similarity between the collection of letters of this String and other on a scale of 0.0 to 1.0.
Compares the characters in this and other by splitting each string into a set of its unique characters and finding the intersection between the two sets of characters.
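The set-building step described above can be sketched with Dart core APIs alone. The helper name uniqueChars is hypothetical and is not part of the library; the inputs are illustrative.

```dart
// Hypothetical helper (not a library API): trims, lower-cases, and
// splits a string into the set of its unique single-character strings.
Set<String> uniqueChars(String s) =>
    s.trim().toLowerCase().split('').toSet();

void main() {
  final thisChars = uniqueChars('Night'); // n, i, g, h, t
  final otherChars = uniqueChars('nacht'); // n, a, c, h, t

  // Characters that occur in both strings.
  final shared = thisChars.intersection(otherChars);

  print(shared.length); // 3
  print(shared.containsAll({'n', 'h', 't'})); // true
}
```

Note that split('') breaks a string into UTF-16 code units, so characters outside the Basic Multilingual Plane would be counted as two separate entries.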
- Returns 1.0 if the two Strings are the same (this.trim().toLowerCase() == other.trim().toLowerCase()).
- Returns 1.0 if the character set for this and the intersection have the same length AND this and other are the same length.
- Returns 0.0 if the intersection is empty (no shared characters).
- Otherwise, returns the intersection length divided by the average length of the two character sets, with that ratio then multiplied by the length similarity of the two Strings.
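To see the rules above applied to concrete inputs, here is a self-contained sketch. It is not the library's code: characterSimilaritySketch and ratioLengthSimilarity are hypothetical names, and ratioLengthSimilarity (shorter length divided by longer length) is only an assumed stand-in for the library's length similarity, whose definition is not given in this section. The "same length" check is taken directly from the second rule.

```dart
import 'dart:math' as math;

// ASSUMPTION: stand-in for the library's length similarity, which is
// not defined in this section. Here: shorter length / longer length.
double ratioLengthSimilarity(String a, String b) {
  if (a.isEmpty || b.isEmpty) return 0.0;
  return math.min(a.length, b.length) / math.max(a.length, b.length);
}

// Hypothetical sketch of the four documented return rules.
double characterSimilaritySketch(String a, String b) {
  final term = a.trim().toLowerCase();
  final other = b.trim().toLowerCase();
  // Rule 1: identical after normalization.
  if (term == other) return 1.0;

  final thisChars = term.split('').toSet();
  final otherChars = other.split('').toSet();
  final shared = thisChars.intersection(otherChars);

  // Rule 3: no shared characters.
  if (shared.isEmpty) return 0.0;
  // Rule 2: every character of this is shared and the lengths match.
  if (shared.length == thisChars.length && term.length == other.length) {
    return 1.0;
  }
  // Rule 4: shared count over the average set size, scaled by the
  // (assumed) length similarity.
  return shared.length * 2 / (thisChars.length + otherChars.length) *
      ratioLengthSimilarity(term, other);
}

void main() {
  print(characterSimilaritySketch(' Tea ', 'tea')); // 1.0 (rule 1)
  print(characterSimilaritySketch('abc', 'cab')); // 1.0 (rule 2)
  print(characterSimilaritySketch('abc', 'xyz')); // 0.0 (rule 3)
  print(characterSimilaritySketch('night', 'nacht')); // 0.6 (rule 4)
}
```

In the last call the strings share 3 of 5 unique characters each, so the ratio is 3 * 2 / (5 + 5) = 0.6, and equal lengths leave it unscaled. With the library's own length similarity the result for strings of different lengths may differ from this sketch.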
Both this String and other are trimmed and converted to lower-case before the comparison, so the result is not case-sensitive.
Implementation
double characterSimilarity(String other) {
  // Normalize both strings once, up front.
  final term = trim().toLowerCase();
  other = other.trim().toLowerCase();
  if (term == other) return 1.0;
  // Unique characters of each string (other is already normalized).
  final thisChars = term.split('').toSet();
  final otherChars = other.split('').toSet();
  final intersection = thisChars.intersection(otherChars);
  final lengthSimilarity = term.lengthSimilarity(other);
  // Every character of this is shared and the lengths match.
  if (intersection.length == thisChars.length &&
      term.lengthDistance(other) == 0) {
    return 1.0;
  }
  // An empty intersection makes this product 0.0.
  return (intersection.length * 2 / (thisChars.length + otherChars.length)) *
      lengthSimilarity;
}