jaccardDistance<E> function
Returns the Jaccard distance between two lists of items.
Parameters
source is the variant list
target is the prototype list
Details
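Jaccard distance, as computed here, is the total number of items that are present
in one list but not in the other. It is calculated by subtracting the size of the
intersection of the source and target sets from the size of their union.
Both inputs are treated as sets, so duplicate items are counted once, and the
result is an integer count rather than a ratio.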
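As a small worked example of the calculation described above, the sketch below uses Dart's built-in Set operations. The helper name symmetricDifferenceCount and the sample values are hypothetical and chosen only for illustration; this is not the library's implementation.

```dart
// Hypothetical helper (not part of the library): counts the items that
// appear in exactly one of the two sets, using built-in Set operations.
int symmetricDifferenceCount<E>(Set<E> a, Set<E> b) {
  final intersection = a.intersection(b);
  final union = a.union(b);
  // Size of the union minus size of the intersection.
  return union.length - intersection.length;
}

void main() {
  // Sample inputs, assumed for illustration.
  final source = {'a', 'b', 'c', 'd'};
  final target = {'c', 'd', 'e'};

  // intersection = {c, d}, union = {a, b, c, d, e}
  print(symmetricDifferenceCount(source, target)); // 3

  // The same count, obtained by listing the items directly: {a, b, e}
  final onlyInOne = source.union(target).difference(source.intersection(target));
  print(onlyInOne.length); // 3
}
```

Here the union has 5 items and the intersection has 2, so the distance is 5 - 2 = 3.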
The Tversky index is a generalization of the Jaccard index; it reduces to the Jaccard index when alpha = 1 and beta = 1.
See Also: tverskyIndex, jaccardIndex
Complexity: Time O(n log n) | Space O(n)
Implementation
int jaccardDistance<E>(Iterable<E> source, Iterable<E> target) {
  // Reuse the inputs if they are already sets; otherwise copy them into sets.
  Set<E> s = source is Set<E> ? source : source.toSet();
  Set<E> t = target is Set<E> ? target : target.toSet();

  // Count the items common to both sets (size of the intersection).
  int intersection = 0;
  for (E e in t) {
    if (s.contains(e)) intersection++;
  }

  // Size of the union, by inclusion-exclusion.
  int union = s.length + t.length - intersection;

  // Jaccard distance: items in the union that are not in the intersection.
  return union - intersection;
}
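A short usage sketch may help show how the function behaves with duplicates, identical inputs, and disjoint inputs. The function body is reproduced from the implementation above only so that the snippet runs on its own; the sample values are assumptions made for illustration.

```dart
// Reproduced from the implementation above so this snippet is self-contained.
int jaccardDistance<E>(Iterable<E> source, Iterable<E> target) {
  Set<E> s = source is Set<E> ? source : source.toSet();
  Set<E> t = target is Set<E> ? target : target.toSet();
  int intersection = 0;
  for (E e in t) {
    if (s.contains(e)) intersection++;
  }
  int union = s.length + t.length - intersection;
  return union - intersection;
}

void main() {
  // Duplicates collapse: {1, 2, 3} vs {2, 3, 4} differ by the items 1 and 4.
  print(jaccardDistance([1, 2, 2, 3], [2, 3, 4])); // 2

  // Identical sets have distance 0.
  print(jaccardDistance({'x', 'y'}, {'x', 'y'})); // 0

  // Disjoint inputs: every item counts towards the distance.
  print(jaccardDistance([1, 2], [3])); // 3
}
```

Because the inputs are converted to sets, the order and multiplicity of items in the original lists do not affect the result.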