string_search_algorithms 1.0.1
Fast Dart library for string similarity and substring search with configurable engines, normalization, and caching. Pure Dart and null-safe.
String Search Algorithms #
Simple Dart library for string similarity and substring search. Pure Dart, null-safe, with configurable engines and caching for repeated comparisons.
Highlights #
- Similarity metrics: Levenshtein, Jaro-Winkler, Cosine, Jaccard, and more.
- Search algorithms: KMP, Boyer-Moore, Rabin-Karp, and a standard wrapper.
- Instance-based engines with configurable normalization and caching.
- Compiled patterns for efficient repeated substring searches.
- Extension methods for ergonomic usage on String.
Installation #
Add this to your pubspec.yaml:
dependencies:
  string_search_algorithms: ^1.0.1
Quick start #
import 'package:string_search_algorithms/string_search_algorithms.dart';

void main() {
  final score = StringSimilarity.compare(
    'Dwayne',
    'Duane',
    algorithm: SimilarityAlgorithm.jaroWinkler,
  );
  print('Similarity: $score');

  final index = StringSearch.indexOf(
    'The quick brown fox jumps over the lazy dog',
    'brown',
    algorithm: SearchAlgorithm.boyerMoore,
  );
  print('Index: $index');
}
Similarity #
Static helpers #
final score = StringSimilarity.compare(
  'kitten',
  'sitting',
  algorithm: SimilarityAlgorithm.levenshtein,
);

final details = StringSimilarity.compareWithDetails(
  'Dwayne',
  'Duane',
  algorithm: SimilarityAlgorithm.jaroWinkler,
);
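To give a feel for what a Levenshtein-based score can look like, here is a minimal standalone sketch in plain Dart. It does not use the package. The function names and the normalization (1 minus the distance divided by the longer length) are assumptions made for illustration, so the library's actual scores may differ.

```dart
import 'dart:math' as math;

/// Classic Levenshtein edit distance using two rolling rows.
int levenshteinDistance(String a, String b) {
  if (a.isEmpty) return b.length;
  if (b.isEmpty) return a.length;

  // previous[j] is the distance between the processed prefix of a and b[0..j).
  var previous = List<int>.generate(b.length + 1, (j) => j);
  var current = List<int>.filled(b.length + 1, 0);

  for (var i = 1; i <= a.length; i++) {
    current[0] = i;
    for (var j = 1; j <= b.length; j++) {
      final cost = a.codeUnitAt(i - 1) == b.codeUnitAt(j - 1) ? 0 : 1;
      current[j] = math.min(
        math.min(current[j - 1] + 1, previous[j] + 1), // insert, delete
        previous[j - 1] + cost, // substitute or keep
      );
    }
    // Reuse the two rows instead of allocating a full matrix.
    final swap = previous;
    previous = current;
    current = swap;
  }
  return previous[b.length];
}

/// Assumed normalization: 1 - distance / length of the longer string.
double levenshteinSimilarity(String a, String b) {
  final longest = math.max(a.length, b.length);
  if (longest == 0) return 1.0;
  return 1.0 - levenshteinDistance(a, b) / longest;
}

void main() {
  // kitten -> sitting needs 3 edits; the longer string has 7 characters.
  print(levenshteinDistance('kitten', 'sitting')); // 3
  print(levenshteinSimilarity('kitten', 'sitting'));
}
```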
Engine with options #
Use StringSimilarityEngine for per-instance configuration and caching.
final engine = StringSimilarityEngine(
  options: const SimilarityOptions(
    normalization: NormalizationOptions(
      toLowerCase: true,
      removeAccents: true,
      removeSpecialChars: true,
      trimWhitespace: true,
    ),
    cache: CacheOptions(
      enabled: true,
      normalizedCapacity: 1000,
      bigramCapacity: 1000,
      ngramCapacity: 1000,
    ),
    algorithms: AlgorithmOptions(
      ngramSize: 3,
      tverskyAlpha: 0.5,
      tverskyBeta: 0.5,
      jaroWinklerPrefixScale: 0.1,
      jaroWinklerBoostThreshold: 0.7,
    ),
  ),
);

final score = engine.compare(
  'Cafe!',
  'cafe',
  algorithm: SimilarityAlgorithm.levenshtein,
);
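The sketch below illustrates the general idea behind normalizing inputs and caching the result, so that repeated comparisons skip the string work. It is plain Dart and does not use the package. `SimpleNormalizer`, its tiny accent table, and its FIFO eviction are all hypothetical and are not the library's implementation.

```dart
/// Hypothetical normalizer with a bounded cache; for illustration only.
class SimpleNormalizer {
  SimpleNormalizer({this.capacity = 1000});

  final int capacity;

  // A Map literal is a LinkedHashMap, so keys iterate in insertion order.
  final Map<String, String> _cache = {};

  // Deliberately tiny accent table; a real one would cover far more characters.
  static const Map<String, String> _accents = {
    'á': 'a', 'à': 'a', 'â': 'a', 'ä': 'a',
    'é': 'e', 'è': 'e', 'ê': 'e', 'ë': 'e',
    'í': 'i', 'ó': 'o', 'ö': 'o', 'ú': 'u', 'ü': 'u',
    'ñ': 'n', 'ç': 'c',
  };

  // Anything that is not a lowercase letter, digit, or whitespace.
  static final RegExp _special = RegExp(r'[^a-z0-9\s]');

  String normalize(String input) {
    final cached = _cache[input];
    if (cached != null) return cached;

    // Trim and lowercase first, then fold accents, then strip the rest.
    var result = input.trim().toLowerCase();
    result = result.split('').map((ch) => _accents[ch] ?? ch).join();
    result = result.replaceAll(_special, '');

    // Evict the oldest entry once the cache is full (FIFO, not LRU).
    if (_cache.isNotEmpty && _cache.length >= capacity) {
      _cache.remove(_cache.keys.first);
    }
    _cache[input] = result;
    return result;
  }
}

void main() {
  final normalizer = SimpleNormalizer(capacity: 2);
  print(normalizer.normalize('  Café! ')); // cafe
  print(normalizer.normalize('  Café! ')); // served from the cache
}
```

Accent folding runs before special-character removal in this sketch, because an accented letter would otherwise be stripped as a special character.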
Fuzzy matching #
final candidates = ['apple', 'banana', 'orange', 'grape'];
final matches = StringSimilarity.findMatches(
  'appel',
  candidates,
  minScore: 0.5,
);

for (final match in matches) {
  print('${match.value}: ${match.score}');
}
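Conceptually, fuzzy matching of this kind scores every candidate, drops those below the threshold, and sorts the rest. The following standalone sketch shows that flow with a Sørensen-Dice score over character bigrams. `ScoredMatch`, `rankMatches`, and the choice of scorer are assumptions for illustration; the source does not say which algorithm `findMatches` uses by default.

```dart
/// Character bigrams of [s], as a set (duplicates collapse).
Set<String> bigrams(String s) {
  final result = <String>{};
  for (var i = 0; i + 2 <= s.length; i++) {
    result.add(s.substring(i, i + 2));
  }
  return result;
}

/// Sørensen-Dice coefficient on bigram sets, in the range 0..1.
double diceScore(String a, String b) {
  final x = bigrams(a);
  final y = bigrams(b);
  if (x.isEmpty && y.isEmpty) return a == b ? 1.0 : 0.0;
  final shared = x.intersection(y).length;
  return 2 * shared / (x.length + y.length);
}

/// Hypothetical result type with the value/score fields used above.
class ScoredMatch {
  const ScoredMatch(this.value, this.score);
  final String value;
  final double score;
}

/// Scores every candidate, keeps those at or above [minScore],
/// and returns them with the best score first.
List<ScoredMatch> rankMatches(
  String query,
  List<String> candidates, {
  double minScore = 0.5,
}) {
  final matches = [
    for (final c in candidates) ScoredMatch(c, diceScore(query, c)),
  ].where((m) => m.score >= minScore).toList();
  matches.sort((a, b) => b.score.compareTo(a.score));
  return matches;
}

void main() {
  final candidates = ['apple', 'banana', 'orange', 'grape'];
  for (final match in rankMatches('appel', candidates)) {
    print('${match.value}: ${match.score}');
  }
}
```

With this particular scorer, both 'apple' and 'grape' share two bigrams with 'appel' and score 0.5, which shows how much the ranking depends on the chosen metric.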
Substring search #
Basic search #
final text = 'The quick brown fox jumps over the lazy dog';
final index = StringSearch.indexOf(
  text,
  'brown',
  algorithm: SearchAlgorithm.boyerMoore,
);
Compiled patterns #
final pattern = StringSearch.compile(
  'fox',
  algorithm: SearchAlgorithm.kmp,
);

if (pattern.containsIn(text)) {
  print('Found!');
}

for (final match in pattern.findAllIn(text)) {
  print('Found at ${match.index}');
}
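Compiling pays off because algorithms such as KMP preprocess the pattern once and can then reuse that work across many texts. Here is a minimal standalone sketch of the idea in plain Dart. `KmpPattern` is a hypothetical class, not the library's compiled pattern type, and it yields plain start indices instead of match objects.

```dart
/// Hypothetical compiled pattern: builds the KMP failure table once.
class KmpPattern {
  KmpPattern(this.pattern) : _failure = _buildFailure(pattern);

  final String pattern;
  final List<int> _failure;

  // table[i] is the length of the longest proper prefix of p[0..i]
  // that is also a suffix of it.
  static List<int> _buildFailure(String p) {
    final table = List<int>.filled(p.length, 0);
    var k = 0;
    for (var i = 1; i < p.length; i++) {
      while (k > 0 && p.codeUnitAt(i) != p.codeUnitAt(k)) {
        k = table[k - 1];
      }
      if (p.codeUnitAt(i) == p.codeUnitAt(k)) k++;
      table[i] = k;
    }
    return table;
  }

  /// Yields the start index of every occurrence, including overlapping ones.
  Iterable<int> findAllIn(String text) sync* {
    if (pattern.isEmpty) return;
    var k = 0; // number of pattern characters matched so far
    for (var i = 0; i < text.length; i++) {
      while (k > 0 && text.codeUnitAt(i) != pattern.codeUnitAt(k)) {
        k = _failure[k - 1]; // fall back without re-reading the text
      }
      if (text.codeUnitAt(i) == pattern.codeUnitAt(k)) k++;
      if (k == pattern.length) {
        yield i - k + 1;
        k = _failure[k - 1];
      }
    }
  }
}

void main() {
  final fox = KmpPattern('fox'); // preprocessing happens here, once
  print(fox.findAllIn('The quick brown fox jumps over the lazy dog').toList()); // [16]
  print(KmpPattern('aa').findAllIn('aaaa').toList()); // [0, 1, 2]
}
```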
Configuration #
- NormalizationOptions controls trimming, case folding, accent removal, and custom preprocessors/postprocessors.
- CacheOptions sizes the normalized string, bigram, and n-gram caches.
- AlgorithmOptions tunes Jaro-Winkler, N-gram size, and Tversky parameters.
See the API docs for full option details.
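As background for the Tversky and N-gram settings, the sketch below computes the standard Tversky index over character n-gram sets in plain Dart. The function names and parameter names are hypothetical and only echo the option names above; how the library applies these options internally is not described here.

```dart
/// Character n-grams of [s] with window size [n], as a set.
Set<String> ngrams(String s, int n) {
  final result = <String>{};
  for (var i = 0; i + n <= s.length; i++) {
    result.add(s.substring(i, i + n));
  }
  return result;
}

/// Tversky index: |X ∩ Y| / (|X ∩ Y| + alpha * |X - Y| + beta * |Y - X|).
double tversky(
  String a,
  String b, {
  int ngramSize = 3,
  double alpha = 0.5,
  double beta = 0.5,
}) {
  final x = ngrams(a, ngramSize);
  final y = ngrams(b, ngramSize);
  final shared = x.intersection(y).length;
  final onlyX = x.difference(y).length;
  final onlyY = y.difference(x).length;
  final denominator = shared + alpha * onlyX + beta * onlyY;
  // Both strings are shorter than the n-gram size: fall back to equality.
  if (denominator == 0) return a == b ? 1.0 : 0.0;
  return shared / denominator;
}

void main() {
  // 'night' and 'nacht' share one bigram ('ht') and have three unique bigrams each.
  print(tversky('night', 'nacht', ngramSize: 2)); // 0.25
  print(tversky('night', 'nacht', ngramSize: 2, alpha: 1, beta: 1));
}
```

With alpha and beta both at 0.5 the index equals the Sørensen-Dice coefficient, and with both at 1 it equals the Jaccard index, so the two weights control how strongly unmatched n-grams on each side are penalized.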
Benchmarks #
Benchmark scripts live in benchmark/:
dart run benchmark/similarity_benchmark.dart
dart run benchmark/search_benchmark.dart
API reference #
API docs will be available on pub.dev: https://pub.dev/documentation/string_search_algorithms/latest/
Contributing #
Contributions are welcome. Please read CONTRIBUTING.md and open a PR.
License #
Licensed under the MIT License. See LICENSE.