document_analysis library

Classes

TokenizationOutput
Helper structure for the tokenization process.

Functions

cosineDistance(List<double> vector1, List<double> vector2) double
Cosine similarity between two vectors.
documentTokenizer(List<String> documentList, {dynamic minLen = 1, String stemmer(String)?, List<String>? stopwords}) TokenizationOutput
Simple document tokenization.
hybridTfIdfMatrix(List<String> documentList, {dynamic measureFunction = cosineDistance, String stemmer(String)?, List<String>? stopwords}) List<List<double?>>
Create a word-vector matrix using the Hybrid TF-IDF metric.
hybridTfIdfProbability(TokenizationOutput tokenOut) Map<String, double>
Word probability calculation using Hybrid Term Frequency - Inverse Document Frequency (Hybrid TF-IDF).
hybridTfIdfSimilarity(String document1, String document2, List<String> background, {dynamic distanceFunction = cosineDistance, String stemmer(String)?, List<String>? stopwords}) double
Check the similarity between two documents using the Hybrid TF-IDF metric.
jaccardDistance(List<double> vector1, List<double> vector2, {bool distinctCalculation = false}) double
Jaccard similarity between two vectors.
tfIdfMatrix(List<String> documentList, {dynamic measureFunction = cosineDistance, String stemmer(String)?, List<String>? stopwords}) List<List<double?>>
Create a word-vector matrix using the TF-IDF metric.
tfIdfProbability(TokenizationOutput tokenOut) List<Map<String, double>>
Word probability calculation using Term Frequency - Inverse Document Frequency (TF-IDF).
tfIdfSimilarity(String document1, String document2, List<String> background, {dynamic distanceFunction = cosineDistance, String stemmer(String)?, List<String>? stopwords}) double
Check the similarity between two documents using the TF-IDF metric.
wordFrequencyMatrix(List<String> documentList, {String stemmer(String)?, List<String>? stopwords}) List<List<double>>
Create a word-vector matrix using the word-frequency metric.
wordFrequencyProbability(TokenizationOutput tokenOut) Map<String, double>
Word probability calculation using word frequency.
wordFrequencySimilarity(String document1, String document2, {dynamic distanceFunction = cosineDistance, String stemmer(String)?, List<String>? stopwords}) double
Check the similarity between two documents using the word-frequency metric.
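
Example

The listing above gives signatures only, so here is a minimal, self-contained sketch of the idea behind comparing two documents by word frequency with a cosine measure. It does not call this library and is not its implementation. The names cosineSimilaritySketch, tokenizeSketch, and wordFrequencySimilaritySketch are hypothetical, and the tokenization rule (lowercase, split on non-alphanumeric characters, no stemming or stopword removal) is an assumption made for illustration.

```dart
import 'dart:math';

/// Hypothetical helper: cosine similarity, dot(a, b) / (|a| * |b|).
/// Returns 0.0 when either vector has zero magnitude.
double cosineSimilaritySketch(List<double> a, List<double> b) {
  if (a.length != b.length) {
    throw ArgumentError('Vectors must have the same length.');
  }
  var dot = 0.0;
  var normA = 0.0;
  var normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA == 0.0 || normB == 0.0) return 0.0;
  return dot / sqrt(normA * normB);
}

/// Hypothetical tokenizer (an assumption, not the library's rule):
/// lowercase the text, split on non-alphanumeric runs, drop empty tokens.
List<String> tokenizeSketch(String document) => document
    .toLowerCase()
    .split(RegExp(r'[^a-z0-9]+'))
    .where((token) => token.isNotEmpty)
    .toList();

/// Builds raw word-count vectors over the shared vocabulary of both
/// documents and compares them with cosine similarity.
double wordFrequencySimilaritySketch(String document1, String document2) {
  final tokens1 = tokenizeSketch(document1);
  final tokens2 = tokenizeSketch(document2);
  final vocabulary = <String>{...tokens1, ...tokens2}.toList();

  List<double> vectorize(List<String> tokens) => [
        for (final word in vocabulary)
          tokens.where((token) => token == word).length.toDouble(),
      ];

  return cosineSimilaritySketch(vectorize(tokens1), vectorize(tokens2));
}

void main() {
  // Documents with overlapping words score between 0 and 1.
  print(wordFrequencySimilaritySketch('the cat sat', 'the cat ran'));
  // Documents with no words in common score 0.
  print(wordFrequencySimilaritySketch('apple', 'banana'));
}
```

The library functions listed above also accept optional stemmer and stopwords arguments and a configurable distance function; the sketch leaves these out to keep the core calculation visible.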