string/text_fingerprint_utils library
Text fingerprinting (order-sensitive word-hash) — roadmap #417.
NOTE: this is NOT a simhash. It XORs each word's full hash (mixed with its
position) into a single value, which is good for an equality/identity
fingerprint but is NOT locality-preserving — a single word change flips many
bits, so fingerprintDistance (a Hamming bit-count) is NOT a similarity
measure. Use it for "are these two texts identical?", not "how similar?".
The per-word hash is Dart's String.hashCode, which is deterministic within
a run but not guaranteed stable across Dart versions/platforms; do not
persist fingerprints for long-term cross-build comparison.
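The scheme described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's actual source: the name `sketchFingerprint`, the ASCII-only letter pattern, and the mixing step (adding the position and multiplying by an odd constant) are all hypothetical choices. It also assumes the native Dart VM, where `int` is 64-bit; on the web the multiplication could lose precision.

```dart
// Hypothetical sketch of an order-sensitive word-hash fingerprint.
// Not the library's implementation; the mixing step is an assumption.
int sketchFingerprint(String text) {
  // Split on runs of non-letters (ASCII letters only, by assumption).
  final words =
      text.split(RegExp(r'[^A-Za-z]+')).where((w) => w.isNotEmpty);

  var fingerprint = 0;
  var position = 0;
  for (final word in words) {
    // Mix the word's hash with its position so that word order matters,
    // then keep the low 32 bits. The constant is an arbitrary odd number.
    final mixed = ((word.hashCode + position) * 0x9E3779B1) & 0xFFFFFFFF;
    fingerprint ^= mixed;
    position++;
  }
  return fingerprint & 0xFFFFFFFF;
}

void main() {
  final a = sketchFingerprint('the quick brown fox');
  final b = sketchFingerprint('the quick brown fox');
  // Identical text yields an identical fingerprint within a single run.
  print(a == b);
}
```

Because every word contributes its full mixed hash, replacing one word changes roughly half of the contributed bits, which is why the result suits identity checks and not similarity estimates.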
Functions
- fingerprintDistance(int a, int b) → int - Hamming distance between two 32-bit fingerprints (number of differing bits). Audited: 2026-06-12 11:26 EDT
- textFingerprint(String text) → int - Simple 32-bit fingerprint: hash of word shingles. The text is split on non-letters. Audited: 2026-06-12 11:26 EDT
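As a rough illustration of what a Hamming bit-count over two 32-bit values involves, here is a minimal sketch. The name `sketchDistance` is hypothetical and the loop is one common way to count set bits; the library's own implementation may differ.

```dart
// Hypothetical stand-in for a 32-bit Hamming distance.
// Not the library's implementation.
int sketchDistance(int a, int b) {
  // XOR leaves a 1 bit wherever the inputs differ; keep the low 32 bits.
  var diff = (a ^ b) & 0xFFFFFFFF;
  var count = 0;
  while (diff != 0) {
    diff &= diff - 1; // clear the lowest set bit
    count++;
  }
  return count;
}

void main() {
  print(sketchDistance(0xF0, 0x0F)); // 8
  print(sketchDistance(42, 42)); // 0
}
```

A result of 0 means the two fingerprints are equal. A nonzero result says only that they differ; as noted in the library description, its size does not indicate how similar the texts are.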