collections/bk_tree_utils library
BK-tree for approximate string matching — roadmap #493.
A BK-tree indexes strings under a metric distance (default:
Damerau–Levenshtein) so that "all words within edit distance k of a
query" can be answered without comparing the query to every word. It relies
on the triangle inequality: if a stored word w is distance d from the
query, any acceptable match must be a child of w reached by an edge whose
label lies in [d - k, d + k], so whole subtrees outside that band are
pruned.
The distance function MUST be a true metric (non-negative, symmetric, satisfies the triangle inequality) for the pruning to stay correct; Damerau–Levenshtein and Levenshtein both qualify.
Classes
- BkTree
- An approximate-match index over a set of strings.
Typedefs
- StringDistance = int Function(String a, String b)
- Distance between two strings; must be a metric for correct pruning.