analyze static method

Future<TextDocument> analyze({
  required String sourceText,
  required TextAnalyzer analyzer,
  TokenFilter? tokenFilter,
  NGramRange? nGramRange,
  String? zone,
})

Hydrates a TextDocument from the sourceText, zone, tokenFilter, nGramRange and analyzer parameters:

  • sourceText is the full text of the document to be analysed;
  • zone is the name assigned to all tokens extracted from the sourceText;
  • tokenFilter is an optional filter passed to the analyzer's tokenizer;
  • nGramRange is the range of N-gram lengths to generate; if it is null, the nGrams are generated with NGramRange(1, 2); and
  • analyzer is the TextAnalyzer used to split the sourceText into paragraphs, sentences, terms and nGrams in the nGramRange, and to extract the keywords in the sourceText to a TermCoOccurrenceGraph.

The static factory uses the analyzer to tokenize the sourceText and populate the tokens property.
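
To make the nGramRange parameter concrete, the following is a minimal, self-contained sketch of what generating N-grams of lengths 1 to 2 from a list of terms can look like. The nGramsOf helper is hypothetical and is not the package's nGrams extension; the ordering and formatting of the package's actual output may differ.

```dart
// Hypothetical helper for illustration only; not part of the package.
// Returns every run of [min]..[max] consecutive terms, joined by a space.
List<String> nGramsOf(List<String> terms, int min, int max) {
  final result = <String>[];
  for (var start = 0; start < terms.length; start++) {
    // Grow the window from min to max terms, stopping at the end of the list.
    for (var n = min; n <= max && start + n <= terms.length; n++) {
      result.add(terms.sublist(start, start + n).join(' '));
    }
  }
  return result;
}

void main() {
  final terms = ['quick', 'brown', 'fox'];
  // Prints: [quick, quick brown, brown, brown fox, fox]
  print(nGramsOf(terms, 1, 2));
}
```

With a range of (1, 2), each term appears on its own and as the start of a two-term phrase, which is why a wider range grows the nGrams collection quickly on long documents.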

Implementation

static Future<TextDocument> analyze(
    {required String sourceText,
    required TextAnalyzer analyzer,
    TokenFilter? tokenFilter,
    NGramRange? nGramRange,
    String? zone}) async {
  final tokens = await analyzer.tokenizer(sourceText,
      zone: zone, tokenFilter: tokenFilter, nGramRange: nGramRange);
  final terms = analyzer.termSplitter(sourceText);
  final nGrams = terms.nGrams(nGramRange ?? NGramRange(1, 2));
  final sentences = analyzer.sentenceSplitter(sourceText);
  final paragraphs = analyzer.paragraphSplitter(sourceText);
  final keywords = tokens.toPhrases();
  final graph = TermCoOccurrenceGraph(keywords);
  final syllableCount = terms.map((e) => analyzer.syllableCounter(e)).sum;
  return _TextDocumentImpl(sourceText, null, tokens, paragraphs, sentences,
      terms, nGrams, syllableCount, graph);
}
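
A call to the method might look like the sketch below. The import path, the countTokens helper and the 'body' zone name are assumptions made for illustration; only the analyze signature and the tokens property come from the documentation above. The analyzer is taken as a parameter so that the sketch does not depend on any particular TextAnalyzer implementation.

```dart
// Assumed import path for the library that defines TextDocument.
import 'package:text_analysis/text_analysis.dart';

/// Hypothetical helper: analyzes [text] with a caller-supplied [analyzer]
/// and returns the number of tokens in the resulting document.
Future<int> countTokens(String text, TextAnalyzer analyzer) async {
  final document = await TextDocument.analyze(
    sourceText: text,
    analyzer: analyzer,
    // Same range the method falls back to when nGramRange is null.
    nGramRange: NGramRange(1, 2),
    // Illustrative zone name applied to every extracted token.
    zone: 'body',
  );
  return document.tokens.length;
}
```

Because analyze returns a Future, it has to be awaited (or chained with then) before the tokens, sentences or other results are read.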