TextDocument class abstract
The TextDocument object model exposes the following properties for analysing a text document:
- sourceText is all the analysed text in the document. The text from a JSON document's (analysed) fields is joined with line ending marks;
- paragraphs is a list of strings after splitting sourceText at line ending marks;
- sentences is a list of strings after splitting sourceText at sentence ending punctuation and line ending marks;
- nGrams is a collection of word sequences generated from the terms;
- zones is a collection of the names of the zones in the document that are tokenized;
- terms is all the words in the sourceText;
- keywords is the unique keywords in the document, mapped to their RAKE keyword scores in a TermCoOccurrenceGraph;
- syllableCount is the total number of syllables in the document; and
- tokens is all the tokens extracted from sourceText.
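The splitting rules described above can be illustrated with a minimal sketch. It is a simplified approximation and not the package's tokenizer: the function names splitParagraphs, splitSentences and wordNGrams are hypothetical, the regular expressions are assumptions, and joining the words of an n-gram with a single space is a choice made here for illustration only.

```dart
/// Splits [sourceText] into paragraphs at line ending marks.
/// Hypothetical helper: blank lines are dropped and each paragraph is trimmed.
List<String> splitParagraphs(String sourceText) => sourceText
    .split(RegExp(r'[\r\n]+'))
    .map((paragraph) => paragraph.trim())
    .where((paragraph) => paragraph.isNotEmpty)
    .toList();

/// Splits [sourceText] into sentences at sentence ending punctuation
/// (. ! ?) and at line ending marks.
/// Simplification: abbreviations and decimals such as "3.14" are also split.
List<String> splitSentences(String sourceText) =>
    RegExp(r'[^.!?\r\n]+[.!?]*')
        .allMatches(sourceText)
        .map((match) => match.group(0)!.trim())
        .where((sentence) => sentence.isNotEmpty)
        .toList();

/// Builds word sequences of length [n] from [terms] with a sliding window.
/// Assumption: the words of each n-gram are joined with a single space.
List<String> wordNGrams(List<String> terms, int n) {
  if (n < 1) {
    throw ArgumentError.value(n, 'n', 'must be at least 1');
  }
  return [
    for (var i = 0; i + n <= terms.length; i++)
      terms.sublist(i, i + n).join(' '),
  ];
}
```

Under these assumptions, splitSentences applied to a text of two sentences on one line followed by a third on the next line returns three strings, and wordNGrams(['a', 'b', 'c'], 2) returns the two bigrams 'a b' and 'b c'.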
The following functions return statistics calculated from the sourceText:
- averageSentenceLength is the average number of words in sentences;
- averageSyllableCount is the average number of syllables per word in terms;
- wordCount is the total number of words in the sourceText;
- fleschReadingEaseScore is a readability measure calculated from sentence length and word length on a 100-point scale. The higher the score, the easier the document is to understand;
- fleschKincaidGradeLevel is a readability measure relative to U.S. school grade level. It is also calculated from sentence length and word length.
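As an illustration of how the two readability measures relate to sentence length and word length, the sketch below applies the standard published Flesch formulas to raw counts. The function names and parameters are hypothetical and the code is not taken from the package. In particular, it returns the grade level as a double, whereas fleschKincaidGradeLevel is documented as returning an int; how the package rounds is not stated here.

```dart
/// Flesch reading ease, using the standard published constants.
/// Hypothetical helper: takes raw counts rather than a TextDocument.
double fleschReadingEase({
  required int wordCount,
  required int sentenceCount,
  required int syllableCount,
}) {
  if (wordCount <= 0 || sentenceCount <= 0) {
    throw ArgumentError('wordCount and sentenceCount must be positive');
  }
  final wordsPerSentence = wordCount / sentenceCount;
  final syllablesPerWord = syllableCount / wordCount;
  // Longer sentences and longer words both lower the score.
  return 206.835 - 1.015 * wordsPerSentence - 84.6 * syllablesPerWord;
}

/// Flesch-Kincaid grade level, using the standard published constants.
/// Returned unrounded; rounding to a whole grade is left to the caller.
double fleschKincaidGrade({
  required int wordCount,
  required int sentenceCount,
  required int syllableCount,
}) {
  if (wordCount <= 0 || sentenceCount <= 0) {
    throw ArgumentError('wordCount and sentenceCount must be positive');
  }
  final wordsPerSentence = wordCount / sentenceCount;
  final syllablesPerWord = syllableCount / wordCount;
  // Longer sentences and longer words both raise the grade level.
  return 0.39 * wordsPerSentence + 11.8 * syllablesPerWord - 15.59;
}
```

For example, a text of 100 words in 5 sentences with 150 syllables gives a reading ease score of about 59.6 and a grade level of about 9.9 with these formulas.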
Constructors
- TextDocument({required String sourceText, required List<Token> tokens, required List<String> paragraphs, required List<String> sentences, required List<String> terms, required List<String> nGrams, required TermCoOccurrenceGraph keywords, required int syllableCount, List<String>? zones})
-
Hydrates a const TextDocument from the document properties.
factory
Properties
- hashCode → int
-
The hash code for this object.
no setter, inherited
- keywords → TermCoOccurrenceGraph
-
The unique keywords in the document mapped to their RAKE keyword score
in a TermCoOccurrenceGraph.
no setter
- nGrams → List<String>
-
A collection of n-grams from the terms in the document.
no setter
- paragraphs → List<String>
-
All the paragraphs in the sourceText.
no setter
- runtimeType → Type
-
A representation of the runtime type of the object.
no setter, inherited
- sentences → List<String>
-
All the sentences in the sourceText.
no setter
- sourceText → String
-
Returns the source text associated with the document.
no setter
- syllableCount → int
-
The total number of syllables in the document.
no setter
- terms → List<String>
-
All the words in the sourceText.
no setter
- tokens → List<Token>
-
The tokens extracted from sourceText.
no setter
- zones → Iterable<String>?
-
A collection of the names of the zones in the document that are to be tokenized.
no setter
Methods
- averageSentenceLength() → int
-
The average number of words in sentences.
- averageSyllableCount() → double
-
The average number of syllables per word in terms.
- fleschKincaidGradeLevel() → int
-
Returns the readability score of sourceText as a U.S. school grade level (Flesch-Kincaid Grade Level test).
- fleschReadingEaseScore() → double
-
Returns the Flesch reading ease score of the sourceText on a 100-point scale.
- noSuchMethod(Invocation invocation) → dynamic
-
Invoked when a nonexistent method or property is accessed.
inherited
- toString() → String
-
A string representation of this object.
inherited
- wordCount() → int
-
The number of words in the sourceText.
Operators
- operator ==(Object other) → bool
-
The equality operator.
inherited
Static Methods
- analyze({required String sourceText, required TextAnalyzer analyzer, TokenFilter? tokenFilter, NGramRange? nGramRange, String? zone}) → Future<TextDocument>
-
Hydrates a TextDocument from the sourceText, zone and analyzer parameters:
- analyzeJson({required Map<String, dynamic> document, required TextAnalyzer analyzer, NGramRange? nGramRange, TokenFilter? tokenFilter, Iterable<String>? zones}) → Future<TextDocument>
-
Hydrates a TextDocument from the document, zones and analyzer parameters. The static factory: