diceIndexOf function
Finds the Sørensen–Dice coefficient of two strings.
Parameters
- source is the variant string.
- target is the prototype string.
- if ignoreCase is true, character case is ignored.
- if ignoreWhitespace is true, whitespace characters (space, tab, newline, etc.) are ignored.
- if ignoreNumbers is true, numbers are ignored.
- if alphaNumericOnly is true, only letters and digits are matched.
- ngram is the size of a single item group. If n = 1, each item is considered separately. If n = 2, two consecutive items are grouped together and treated as one.
TIP: Set both ignoreNumbers and alphaNumericOnly to true to ignore everything except letters.
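The grouping controlled by ngram can be pictured with a small sketch. The helper name `ngrams` is hypothetical and not part of this library, and the sketch assumes that groups overlap (a sliding window over UTF-16 code units); the library's actual grouping may differ.

```dart
// Hypothetical helper for illustration only; not the library's code.
// Assumption: groups overlap, i.e. a window of size n slides one item at a time.
List<String> ngrams(String text, int n) {
  if (n < 1) {
    throw ArgumentError.value(n, 'n', 'must be at least 1');
  }
  // A string shorter than n yields no groups, because the loop never runs.
  return [
    for (var i = 0; i + n <= text.length; i++) text.substring(i, i + n),
  ];
}

void main() {
  print(ngrams('night', 1)); // [n, i, g, h, t]
  print(ngrams('night', 2)); // [ni, ig, gh, ht]
}
```

With n = 1 every character stands alone; with n = 2 each pair of neighbours becomes one item, so the comparison becomes sensitive to character order.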
Details
The Sørensen–Dice coefficient is a metric used to measure the similarity between two samples. It is known by several other names:
- Sørensen index
- Dice's coefficient
- Dice similarity coefficient (DSC)
The Tversky index is a generalization of the Dice index: it reduces to the Dice index when alpha = 0.5 and beta = 0.5.
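The relationship between the two indices can be checked numerically. The sketch below is only an illustration under stated assumptions: `ngramSet`, `tverskySketch`, and `diceSketch` are hypothetical names, items are compared as sets of distinct overlapping n-grams, and two empty inputs are treated as identical. The library's tverskyIndexOf may handle repeated items or empty input differently.

```dart
// Hypothetical helpers for illustration; not the library's implementation.

/// Distinct overlapping n-grams of [text] (assumption: set semantics).
Set<String> ngramSet(String text, int n) => {
      for (var i = 0; i + n <= text.length; i++) text.substring(i, i + n),
    };

/// Tversky index: |X∩Y| / (|X∩Y| + alpha * |X\Y| + beta * |Y\X|).
double tverskySketch(Set<String> x, Set<String> y, double alpha, double beta) {
  final common = x.intersection(y).length;
  final onlyX = x.difference(y).length;
  final onlyY = y.difference(x).length;
  final denominator = common + alpha * onlyX + beta * onlyY;
  // Assumption: two empty sets count as identical.
  return denominator == 0 ? 1.0 : common / denominator;
}

/// Dice coefficient: 2 * |X∩Y| / (|X| + |Y|).
double diceSketch(Set<String> x, Set<String> y) {
  final total = x.length + y.length;
  // Assumption: two empty sets count as identical.
  return total == 0 ? 1.0 : 2 * x.intersection(y).length / total;
}

void main() {
  final a = ngramSet('night', 2); // {ni, ig, gh, ht}
  final b = ngramSet('nacht', 2); // {na, ac, ch, ht}
  print(diceSketch(a, b)); // 0.25
  print(tverskySketch(a, b, 0.5, 0.5)); // 0.25
}
```

Only the bigram "ht" is shared, so the Dice value is 2 * 1 / (4 + 4) = 0.25, and the Tversky value with alpha = beta = 0.5 is 1 / (1 + 1.5 + 1.5) = 0.25.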
See Also: tverskyIndex, diceIndex
Complexity: Time O(n log n) | Space O(n)
Implementation
double diceIndexOf(
  String source,
  String target, {
  int ngram = 1,
  bool ignoreCase = false,
  bool ignoreWhitespace = false,
  bool ignoreNumbers = false,
  bool alphaNumericOnly = false,
}) {
  return tverskyIndexOf(
    source,
    target,
    alpha: 0.5,
    beta: 0.5,
    ngram: ngram,
    ignoreCase: ignoreCase,
    ignoreWhitespace: ignoreWhitespace,
    ignoreNumbers: ignoreNumbers,
    alphaNumericOnly: alphaNumericOnly,
  );
}