String Search Algorithms
Simple Dart library for string similarity and substring search. Pure Dart, null-safe, with configurable engines and caching for repeated comparisons.
Highlights
- Similarity metrics: Levenshtein, Jaro-Winkler, Cosine, Jaccard, and more.
- Search algorithms: KMP, Boyer-Moore, Rabin-Karp, and a standard wrapper.
- Instance-based engines with configurable normalization and caching.
- Compiled patterns for efficient repeated substring searches.
- Extension methods for ergonomic usage on String.
Installation
Add this to your pubspec.yaml:
dependencies:
string_search_algorithms: ^1.0.1
Quick start
import 'package:string_search_algorithms/string_search_algorithms.dart';
void main() {
final score = StringSimilarity.compare(
'Dwayne',
'Duane',
algorithm: SimilarityAlgorithm.jaroWinkler,
);
print('Similarity: $score');
final index = StringSearch.indexOf(
'The quick brown fox jumps over the lazy dog',
'brown',
algorithm: SearchAlgorithm.boyerMoore,
);
print('Index: $index');
}
Similarity
Static helpers
final score = StringSimilarity.compare(
'kitten',
'sitting',
algorithm: SimilarityAlgorithm.levenshtein,
);
final details = StringSimilarity.compareWithDetails(
'Dwayne',
'Duane',
algorithm: SimilarityAlgorithm.jaroWinkler,
);
Engine with options
Use StringSimilarityEngine for per-instance configuration and caching.
final engine = StringSimilarityEngine(
options: const SimilarityOptions(
normalization: NormalizationOptions(
toLowerCase: true,
removeAccents: true,
removeSpecialChars: true,
trimWhitespace: true,
),
cache: CacheOptions(
enabled: true,
normalizedCapacity: 1000,
bigramCapacity: 1000,
ngramCapacity: 1000,
),
algorithms: AlgorithmOptions(
ngramSize: 3,
tverskyAlpha: 0.5,
tverskyBeta: 0.5,
jaroWinklerPrefixScale: 0.1,
jaroWinklerBoostThreshold: 0.7,
),
),
);
final score = engine.compare(
'Cafe!',
'cafe',
algorithm: SimilarityAlgorithm.levenshtein,
);
Fuzzy matching
final candidates = ['apple', 'banana', 'orange', 'grape'];
final matches = StringSimilarity.findMatches(
'appel',
candidates,
minScore: 0.5,
);
for (final match in matches) {
print('${match.value}: ${match.score}');
}
Substring search
Basic search
final text = 'The quick brown fox jumps over the lazy dog';
final index = StringSearch.indexOf(
text,
'brown',
algorithm: SearchAlgorithm.boyerMoore,
);
Compiled patterns
final pattern = StringSearch.compile(
'fox',
algorithm: SearchAlgorithm.kmp,
);
if (pattern.containsIn(text)) {
print('Found!');
}
for (final match in pattern.findAllIn(text)) {
print('Found at ${match.index}');
}
Configuration
NormalizationOptionscontrols trimming, case folding, accent removal, and custom preprocessors/postprocessors.CacheOptionssizes the normalized string, bigram, and n-gram caches.AlgorithmOptionstunes Jaro-Winkler, N-gram size, and Tversky parameters.
See the API docs for full option details.
Benchmarks
Benchmark scripts live in benchmark/:
dart run benchmark/similarity_benchmark.dart
dart run benchmark/search_benchmark.dart
API reference
API docs will be available on pub.dev: https://pub.dev/documentation/string_search_algorithms/latest/
Contributing
Contributions are welcome. Please read CONTRIBUTING.md and open a PR.
License
Licensed under the MIT License. See LICENSE.
Libraries
- search
- Substring search APIs and algorithms.
- similarity
- String similarity and fuzzy matching APIs.
- string_search_algorithms
- High-performance string similarity and substring search algorithms for Dart.