jaccardDistance<E> function

int jaccardDistance<E>(
  Iterable<E> source,
  Iterable<E> target
)

Returns the Jaccard distance between two lists of items.

Parameters

  • source is the variant list
  • target is the prototype list

Details

Jaccard distance measures the total number of distinct items that are present in one list but not the other. Both lists are treated as sets, so duplicate items count once. It is calculated by subtracting the size of the intersection of the source and target sets from the size of their union.
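The calculation described above can be sketched with Dart's built-in Set operations. This is only an illustration: distanceViaSets is a hypothetical helper introduced for this example, not part of the documented API, and it assumes the elements have consistent == and hashCode.

```dart
// Hypothetical helper for illustration only (not part of the documented API).
// It computes |union| - |intersection| using Dart's built-in Set methods.
int distanceViaSets<E>(Iterable<E> source, Iterable<E> target) {
  final s = source.toSet(); // duplicates collapse here
  final t = target.toSet();
  return s.union(t).length - s.intersection(t).length;
}

void main() {
  final source = ['a', 'b', 'c', 'c'];
  final target = ['b', 'c', 'd', 'e'];

  // Distinct items: {a, b, c} and {b, c, d, e}.
  // Union = {a, b, c, d, e} (5 items), intersection = {b, c} (2 items).
  // The items a, d and e each appear in only one of the lists.
  print(distanceViaSets(source, target)); // 3
}
```

Identical lists give 0, and two lists with no items in common give the sum of their distinct item counts.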

The Tversky index is a generalization of the Jaccard index. It reduces to the Jaccard index when alpha = 1 and beta = 1.
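For reference, the relationship can be written out using the standard textbook definitions. The symbols S and T (the source and target sets) are notation assumed here for illustration and do not appear in the API itself.

```latex
% Tversky index with weights \alpha and \beta
T_{\alpha,\beta}(S, T) = \frac{|S \cap T|}{|S \cap T| + \alpha\,|S \setminus T| + \beta\,|T \setminus S|}

% With \alpha = \beta = 1 the denominator is |S \cup T|, giving the Jaccard index
T_{1,1}(S, T) = \frac{|S \cap T|}{|S \cup T|} = J(S, T)

% The integer distance returned by this function
d(S, T) = |S \cup T| - |S \cap T| = |S \setminus T| + |T \setminus S|
```

Note that the distance returned here is an item count, not a value normalized to the range 0 to 1.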

See Also: tverskyIndex, jaccardIndex


Complexity: Time O(n log n) | Space O(n)

Implementation

int jaccardDistance<E>(Iterable<E> source, Iterable<E> target) {
  Set<E> s = source is Set<E> ? source : source.toSet();
  Set<E> t = target is Set<E> ? target : target.toSet();

  // calculate intersection between source and target
  int intersection = 0;
  for (E e in t) {
    if (s.contains(e)) intersection++;
  }

  // calculate union between source and target sets
  int union = s.length + t.length - intersection;

  // Jaccard distance: items in the union that are not in the intersection
  return union - intersection;
}