collections/bk_tree_utils library

BK-tree for approximate string matching — roadmap #493.

A BK-tree indexes strings under a metric distance (default: Damerau–Levenshtein) so that "all words within edit distance k of a query" can be answered without comparing the query to every word. It relies on the triangle inequality: if a stored word w is distance d from the query, any acceptable match must be a child of w reached by an edge whose label lies in [d - k, d + k], so whole subtrees outside that band are pruned.

The distance function MUST be a true metric (non-negative, symmetric, satisfies the triangle inequality) for the pruning to stay correct; Damerau–Levenshtein and Levenshtein both qualify.

Classes

BkTree
An approximate-match index over a set of strings.

Typedefs

StringDistance = int Function(String a, String b)
Distance between two strings; must be a metric for correct pruning.