tokenizeKeyphrases function
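Splits text into lowercase ASCII alphanumeric tokens, dropping stopwords and one-character tokens.
Punctuation and casing are normalized away, so 'Cats, cats!' yields the same term twice ('cats', 'cats'). One-character tokens are dropped as noise. Because the split pattern keeps only a-z and 0-9, every other character, including accented and non-Latin letters, acts as a separator.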
Example:
tokenizeKeyphrases('The quick Brown fox'); // ['quick', 'brown', 'fox']
Audited: 2026-06-12 11:26 EDT
Implementation
List<String> tokenizeKeyphrases(String text) => text
    .toLowerCase()
    .split(RegExp(r'[^a-z0-9]+'))
    .where((String t) => t.length > 1 && !_kStopwords.contains(t))
    .toList();
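The edge cases described above can be checked with a small, self-contained sketch. The contents of `_kStopwords` are not shown on this page, so the set below is a hypothetical stand-in chosen only for illustration; the real list may differ, and the outputs in the comments hold only for this stand-in.

```dart
// Hypothetical stand-in for the private _kStopwords set, which is not shown
// in the source. The real list may contain different words.
const Set<String> _kStopwords = <String>{'the', 'a', 'an', 'and', 'of', 'to'};

// Same pipeline as the implementation above.
List<String> tokenizeKeyphrases(String text) => text
    .toLowerCase()
    .split(RegExp(r'[^a-z0-9]+'))
    .where((String t) => t.length > 1 && !_kStopwords.contains(t))
    .toList();

void main() {
  // Punctuation and casing are normalized; the trailing empty string
  // produced by split() is removed by the length filter.
  print(tokenizeKeyphrases('Cats, cats!')); // [cats, cats]

  // One-character tokens ('a', 'b', 'c') and stopwords are dropped.
  print(tokenizeKeyphrases('A B-52 and the C compiler')); // [52, compiler]

  // Non-ASCII letters act as separators, so the accented e is lost.
  print(tokenizeKeyphrases('Caf\u00e9 au lait')); // [caf, au, lait]

  // Empty input yields an empty list rather than [''].
  print(tokenizeKeyphrases('')); // []
}
```

The third call shows a consequence worth keeping in mind: text containing accented or non-Latin letters is split in the middle of words, so callers that need such terms intact would have to transliterate the input before tokenizing.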