text_comparison_score_codespark 1.0.0

String similarity for Dart & Flutter. Calculate match percentages using Levenshtein, Damerau-Levenshtein, and Jaro-Winkler. Supports fuzzy matching and case-insensitive comparison.


text_comparison_score_codespark #

Calculate string similarity scores, match percentages, and string distance metrics in Dart and Flutter using Levenshtein, Damerau-Levenshtein, and Jaro-Winkler, for tasks such as fuzzy matching and text comparison.

Built by Katayath Sai Kiran · @Katayath-Sai-Kiran


Screenshots #

[Screenshots: score overview, algorithm comparison, edge cases]

Features #

  • Levenshtein Distance: Calculates the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one string into the other.
  • Damerau-Levenshtein Distance: Extends Levenshtein distance by treating adjacent character transpositions (e.g. "teh" → "the") as a single edit, producing more accurate scores for real-world typos.
  • Jaro-Winkler Distance: Measures the similarity between two strings, taking into account the number of matching characters and transpositions, with a boost for common prefixes.
  • Match Percentage: Returns a percentage score for two strings; identical strings score 100.0.
  • Case Sensitivity Option: Comparisons can ignore letter case by passing caseSensitive: false.
  • Multiple Algorithms: Choose between different algorithms — Levenshtein, Damerau-Levenshtein, and Jaro-Winkler — for your comparison needs.
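
To make the two edit-distance features above concrete, here is a minimal, self-contained sketch in plain Dart. It is not the package's implementation: editDistance and matchPercent are hypothetical names, the transposition rule is the "optimal string alignment" variant of Damerau-Levenshtein, and the normalization by the longer string's length is an assumption (it happens to agree with the Levenshtein figures in the Example Output section).

```dart
import 'dart:math' as math;

/// Edit distance between [a] and [b], compared by UTF-16 code unit.
/// With [transpositions] enabled, swapping two adjacent characters
/// costs one edit (optimal string alignment variant).
int editDistance(String a, String b, {bool transpositions = false}) {
  // d[i][j] = distance between the first i units of a and the first j of b.
  final d = List.generate(
    a.length + 1,
    (_) => List<int>.filled(b.length + 1, 0),
  );
  for (var i = 0; i <= a.length; i++) {
    d[i][0] = i; // delete every unit of the prefix of a
  }
  for (var j = 0; j <= b.length; j++) {
    d[0][j] = j; // insert every unit of the prefix of b
  }

  for (var i = 1; i <= a.length; i++) {
    for (var j = 1; j <= b.length; j++) {
      final cost = a.codeUnitAt(i - 1) == b.codeUnitAt(j - 1) ? 0 : 1;
      var best = math.min(
        math.min(d[i - 1][j] + 1, d[i][j - 1] + 1), // deletion, insertion
        d[i - 1][j - 1] + cost, // match or substitution
      );
      if (transpositions &&
          i > 1 &&
          j > 1 &&
          a.codeUnitAt(i - 1) == b.codeUnitAt(j - 2) &&
          a.codeUnitAt(i - 2) == b.codeUnitAt(j - 1)) {
        best = math.min(best, d[i - 2][j - 2] + 1); // adjacent swap
      }
      d[i][j] = best;
    }
  }
  return d[a.length][b.length];
}

/// Assumed normalization: 100 * (1 - distance / longer length).
double matchPercent(String a, String b, {bool transpositions = false}) {
  final longest = math.max(a.length, b.length);
  if (longest == 0) return 100.0; // two empty strings are identical
  final distance = editDistance(a, b, transpositions: transpositions);
  return (1 - distance / longest) * 100;
}

void main() {
  print(editDistance('kitten', 'sitting')); // 3
  print(editDistance('teh', 'the')); // 2
  print(editDistance('teh', 'the', transpositions: true)); // 1
  print(matchPercent('kitten', 'sitting'));
}
```

The full matrix keeps the sketch easy to read; a two-row version would use less memory for long inputs.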

Use Cases #

  • String similarity detection
  • Text similarity analysis
  • Fuzzy string matching
  • Fuzzy search
  • String comparison
  • Text comparison
  • Match percentage calculation
  • Confidence score generation
  • Typo detection
  • Search suggestions
  • Duplicate record matching
  • Name matching
  • Data validation
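
Several of these use cases (fuzzy search, search suggestions, typo detection) come down to scoring a query against a list of candidates and keeping the ones above a cut-off. The sketch below shows that pattern under stated assumptions: suggest, Scorer, and positionalScore are hypothetical names, not part of the package, and positionalScore is only a crude stand-in so the example runs without any dependency. In a real project the scorer would wrap TextComparisonScore.calculateScore as shown in the Usage section.

```dart
/// Any function that returns a 0-100 match percentage for two strings.
typedef Scorer = double Function(String a, String b);

/// Returns the candidates scoring at least [threshold], best match first.
List<String> suggest(
  String query,
  List<String> candidates,
  Scorer score, {
  double threshold = 60,
}) {
  final scored = [
    for (final c in candidates) MapEntry(c, score(query, c)),
  ].where((e) => e.value >= threshold).toList()
    ..sort((x, y) => y.value.compareTo(x.value)); // highest score first
  return [for (final e in scored) e.key];
}

/// Stand-in scorer (assumption, for illustration only): the share of
/// positions holding the same character, ignoring case.
double positionalScore(String a, String b) {
  final x = a.toLowerCase();
  final y = b.toLowerCase();
  final longest = x.length > y.length ? x.length : y.length;
  if (longest == 0) return 100.0;
  final shortest = x.length < y.length ? x.length : y.length;
  var same = 0;
  for (var i = 0; i < shortest; i++) {
    if (x[i] == y[i]) same++;
  }
  return same / longest * 100;
}

void main() {
  // With the package installed, pass a closure such as
  //   (a, b) => TextComparisonScore.calculateScore(a, b, caseSensitive: false)
  // in place of positionalScore.
  final hits = suggest(
    'fluter',
    ['flutter', 'dart', 'flatter'],
    positionalScore,
    threshold: 40,
  );
  print(hits);
}
```

Taking the scorer as a parameter keeps the ranking logic independent of the algorithm, so the threshold can be tuned per algorithm.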

Installation #

Add the following to your pubspec.yaml:

dependencies:
  text_comparison_score_codespark: ^1.0.0

Then run:

flutter pub get

Usage #

Here's how to use the TextComparisonScore class to calculate the match percentage between two strings using different algorithms:

import 'package:text_comparison_score_codespark/text_comparison_score_codespark.dart';

void main() {
  // Example 1: Simple Levenshtein comparison
  String string1 = "kitten";
  String string2 = "sitting";

  double matchPercent = TextComparisonScore.calculateScore(string1, string2, algorithm: ComparisonAlgorithm.levenshtein);
  print("Levenshtein Match Percentage between '$string1' and '$string2': $matchPercent%");

  // Example 2: Jaro-Winkler comparison
  double jaroMatchPercent = TextComparisonScore.calculateScore(string1, string2, algorithm: ComparisonAlgorithm.jaroWinkler);
  print("Jaro-Winkler Match Percentage between '$string1' and '$string2': $jaroMatchPercent%");

  // Example 3: Identical strings
  String identical1 = "flutter";
  String identical2 = "flutter";

  double identicalMatchPercent = TextComparisonScore.calculateScore(identical1, identical2);
  print("Match Percentage between identical strings '$identical1' and '$identical2': $identicalMatchPercent%");

  // Example 4: Case insensitive comparison
  String caseSensitive1 = "Hello";
  String caseSensitive2 = "hello";

  double caseSensitiveMatchPercent = TextComparisonScore.calculateScore(caseSensitive1, caseSensitive2, caseSensitive: false);
  print("Match Percentage between '$caseSensitive1' and '$caseSensitive2' (case insensitive): $caseSensitiveMatchPercent%");

  // Example 5: Damerau-Levenshtein — better typo handling via transpositions
  // "teh" vs "the" is a single transposition; standard Levenshtein counts it as 2 edits
  String typo = "teh";
  String correct = "the";

  double dlMatchPercent = TextComparisonScore.calculateScore(typo, correct, algorithm: ComparisonAlgorithm.damerauLevenshtein);
  double levMatchPercent = TextComparisonScore.calculateScore(typo, correct, algorithm: ComparisonAlgorithm.levenshtein);
  print("Damerau-Levenshtein Match Percentage between '$typo' and '$correct': $dlMatchPercent%");
  print("Levenshtein Match Percentage between '$typo' and '$correct': $levMatchPercent%");
}

Example Output #

Levenshtein Match Percentage between 'kitten' and 'sitting': 57.14285714285714%
Jaro-Winkler Match Percentage between 'kitten' and 'sitting': 74.74%
Match Percentage between identical strings 'flutter' and 'flutter': 100.0%
Match Percentage between 'Hello' and 'hello' (case insensitive): 100.0%
Damerau-Levenshtein Match Percentage between 'teh' and 'the': 66.67%
Levenshtein Match Percentage between 'teh' and 'the': 33.33%
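
The Levenshtein and Damerau-Levenshtein figures above are consistent with normalizing the edit distance d by the length of the longer string. The package does not document its formula here, so treat the following as an assumed reading that happens to reproduce those numbers, not as a specification. It does not cover the Jaro-Winkler score, which is computed differently.

```latex
\begin{align*}
\text{score}(a, b) &= \left(1 - \frac{d(a, b)}{\max(|a|, |b|)}\right) \times 100 \\
\text{Levenshtein, kitten / sitting:}\quad & d = 3,\ \max = 7 \;\Rightarrow\; \left(1 - \tfrac{3}{7}\right) \times 100 \approx 57.14 \\
\text{Levenshtein, teh / the:}\quad & d = 2,\ \max = 3 \;\Rightarrow\; \left(1 - \tfrac{2}{3}\right) \times 100 \approx 33.33 \\
\text{Damerau-Levenshtein, teh / the:}\quad & d = 1,\ \max = 3 \;\Rightarrow\; \left(1 - \tfrac{1}{3}\right) \times 100 \approx 66.67
\end{align*}
```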

Future Updates #

In future versions, this package will include:

  1. Cosine Similarity: Measures the cosine of the angle between two vectors, which can be used for similarity between text strings.
  2. Soundex: A phonetic algorithm for indexing names by sound, as pronounced in English.
  3. Hamming Distance: Counts the positions at which two equal-length strings differ.
  4. Normalized Distance Measures: Provides normalized versions of distance metrics to return values between 0 and 1.
  5. String Tokenization & N-grams: Support for splitting strings into tokens and analyzing n-grams.
  6. Customizable Weighting: Allows users to assign custom weights to different types of edits.
  7. Multi-Language Support: Ensures that algorithms work with various character sets and languages.
  8. Threshold-based Matching: Returns whether the match percentage is above a user-defined threshold.
  9. Performance Optimization for Large Texts: Implements efficient data structures and parallel processing to handle large texts.
  10. Batch Comparison: Allows users to compare a single string against a batch of other strings, returning the most similar ones.
  11. Detailed Comparison Report: Provides a detailed report with multiple similarity metrics between two strings.
  12. API for Custom Comparison Functions: Enables users to define and plug in their custom comparison functions.

Maintainer #

Developed with 💙 by Katayath Sai Kiran. Feel free to contribute or suggest improvements!

Publisher

Verified publisher: ksaikiran.dev

Topics

#string-similarity #fuzzy-matching #string-comparison #text-similarity #levenshtein-distance

License

MIT (license)

Dependencies

flutter
