analyze static method
Future<TextDocument> analyze({
  required String sourceText,
  required TextAnalyzer analyzer,
  TokenFilter? tokenFilter,
  NGramRange? nGramRange,
  String? zone,
})
Hydrates a TextDocument from the sourceText, zone and analyzer parameters:
- sourceText is all the analysed text in the document;
- zone is the name to be used for all tokens extracted from the sourceText;
- tokenFilter is an optional filter that is passed to the analyzer's tokenizer;
- nGramRange is the range of N-gram lengths to generate (if omitted, the nGrams are generated with NGramRange(1, 2)); and
- analyzer is a TextAnalyzer used to split the sourceText into paragraphs, sentences, terms and nGrams in the nGramRange, and to extract the keywords in the sourceText to a TermCoOccurrenceGraph.
The static factory uses the analyzer to tokenize the sourceText and populate the tokens property.
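To make the nGramRange parameter concrete, here is a minimal, self-contained sketch of what generating n-grams over a range of lengths can look like. The helper `nGramsInRange` is hypothetical and is not part of the library. The ordering and space-joined format of the library's own nGrams may differ. The sketch only illustrates a range of 1 to 2, assuming that this is what NGramRange(1, 2) in the implementation denotes.

```dart
// Hypothetical helper (not the library's API): builds word n-grams for every
// length from minLength to maxLength, inclusive.
List<String> nGramsInRange(List<String> terms, int minLength, int maxLength) {
  final result = <String>[];
  for (var n = minLength; n <= maxLength; n++) {
    // Slide a window of n consecutive terms across the list.
    for (var i = 0; i + n <= terms.length; i++) {
      result.add(terms.sublist(i, i + n).join(' '));
    }
  }
  return result;
}

void main() {
  final terms = ['quick', 'brown', 'fox'];
  // Lengths 1 and 2 only.
  print(nGramsInRange(terms, 1, 2));
  // prints [quick, brown, fox, quick brown, brown fox]
}
```

A wider range grows the output quickly, so a narrow range such as 1 to 2 is a sensible starting point for short documents.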
Implementation
static Future<TextDocument> analyze(
    {required String sourceText,
    required TextAnalyzer analyzer,
    TokenFilter? tokenFilter,
    NGramRange? nGramRange,
    String? zone}) async {
  // Tokenize the source text; the tokens also supply the keyword phrases.
  final tokens = await analyzer.tokenizer(sourceText,
      zone: zone, tokenFilter: tokenFilter, nGramRange: nGramRange);
  // Split the text into terms, n-grams, sentences and paragraphs.
  final terms = analyzer.termSplitter(sourceText);
  // Falls back to NGramRange(1, 2) when no range is given.
  final nGrams = terms.nGrams(nGramRange ?? NGramRange(1, 2));
  final sentences = analyzer.sentenceSplitter(sourceText);
  final paragraphs = analyzer.paragraphSplitter(sourceText);
  // Build the co-occurrence graph from the keyword phrases.
  final keywords = tokens.toPhrases();
  final graph = TermCoOccurrenceGraph(keywords);
  // Total syllable count across all terms.
  final syllableCount = terms.map((e) => analyzer.syllableCounter(e)).sum;
  return _TextDocumentImpl(sourceText, null, tokens, paragraphs, sentences,
      terms, nGrams, syllableCount, graph);
}
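A call to this method might look like the sketch below. It rests on several assumptions: that the method is declared on TextDocument, that the import path shown exposes TextDocument, TextAnalyzer and NGramRange, and that the caller already has a TextAnalyzer instance. The helper name `analyzeBody` and the zone name 'body' are purely illustrative.

```dart
// Assumed import path; adjust it to how the package is exposed in your project.
import 'package:text_analysis/text_analysis.dart';

/// Hypothetical helper: analyzes [text] with a caller-supplied [analyzer].
Future<TextDocument> analyzeBody(String text, TextAnalyzer analyzer) {
  return TextDocument.analyze(
    sourceText: text,
    analyzer: analyzer,
    nGramRange: NGramRange(1, 2), // same range the method falls back to
    zone: 'body', // illustrative zone name applied to the extracted tokens
  );
}
```

Because the method is asynchronous, callers need to await the returned Future before reading the document's tokens property.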