TokenCollectionExtension extension

Extension methods on a collection of Token.

on

Properties

allTerms List<String>
Returns a list of all the terms from the collection of Tokens, in the same order as they occur in the text.
no setter
terms Set<String>
Returns the set of unique terms from the collection of Tokens.
no setter
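
The difference between the two properties can be sketched as follows. This is an illustration under assumptions, not the library's source: the `Token` class here is a hypothetical stand-in that models only the `term` and `termPosition` fields named on this page, and `TokenCollectionSketch` is an invented name.

```dart
// Hypothetical stand-in for the library's Token; only the two fields
// referenced on this page are modeled.
class Token {
  final String term;
  final int termPosition;
  const Token(this.term, this.termPosition);
}

// Assumed bodies for the two getters; only the names and return types
// come from the page.
extension TokenCollectionSketch on Iterable<Token> {
  /// Every term, duplicates included, in iteration order.
  List<String> get allTerms => map((t) => t.term).toList();

  /// Unique terms only.
  Set<String> get terms => map((t) => t.term).toSet();
}

void main() {
  const tokens = [
    Token('the', 0),
    Token('quick', 1),
    Token('the', 2),
  ];
  print(tokens.allTerms); // [the, quick, the]
  print(tokens.terms); // {the, quick}
}
```

The sketch relies on the collection already being in text order, so `allTerms` preserves that order simply by iterating.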

Methods

byTerm(String term) Iterable<Token>
Filters the collection for tokens with Token.term == term.
firstPosition(String term) int
Returns the lowest Token.termPosition where Token.term == term.
kGrams([int k = 2]) Map<String, Set<String>>
Returns a map from each k-gram to the set of terms that contain it, built from the collection of tokens.
lastPosition(String term) int
Returns the highest Token.termPosition where Token.term == term.
termCount(String term) int
Returns the number of tokens in the collection where Token.term == term.
toKeywordScores() Map<String, double>
Returns a mapping of the terms in the collection to their RAKE scores.
toPhrases() Set<List<String>>
Returns the set of unique phrases from the terms in the collection.
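
The term-lookup methods and `kGrams` can be approximated with the minimal sketch below. It is written under assumptions rather than taken from the library: `Token` is a hypothetical stand-in, `TokenQuerySketch` is an invented name, the behavior for a term that is absent is not stated on this page (here `firstPosition` and `lastPosition` throw), and the k-gram loop uses a plain sliding window with no boundary padding, which may differ from the actual implementation.

```dart
import 'dart:math' as math;

// Hypothetical stand-in for the library's Token.
class Token {
  final String term;
  final int termPosition;
  const Token(this.term, this.termPosition);
}

extension TokenQuerySketch on Iterable<Token> {
  /// Tokens whose term equals [term].
  Iterable<Token> byTerm(String term) => where((t) => t.term == term);

  /// Number of tokens whose term equals [term].
  int termCount(String term) => byTerm(term).length;

  /// Lowest position of [term]. Assumption: at least one match exists;
  /// reduce() throws a StateError on an empty iterable.
  int firstPosition(String term) =>
      byTerm(term).map((t) => t.termPosition).reduce(math.min);

  /// Highest position of [term], with the same assumption as above.
  int lastPosition(String term) =>
      byTerm(term).map((t) => t.termPosition).reduce(math.max);

  /// Maps each k-gram to the terms containing it. Assumption: no padding,
  /// so terms shorter than [k] contribute nothing.
  Map<String, Set<String>> kGrams([int k = 2]) {
    final index = <String, Set<String>>{};
    for (final term in map((t) => t.term).toSet()) {
      for (var i = 0; i + k <= term.length; i++) {
        final gram = term.substring(i, i + k);
        index.putIfAbsent(gram, () => <String>{}).add(term);
      }
    }
    return index;
  }
}

void main() {
  const tokens = [
    Token('cat', 0),
    Token('sat', 1),
    Token('cat', 2),
  ];
  print(tokens.termCount('cat')); // 2
  print(tokens.firstPosition('cat')); // 0
  print(tokens.lastPosition('cat')); // 2
  print(tokens.kGrams()); // {ca: {cat}, at: {cat, sat}, sa: {sat}}
}
```

A k-gram index of this shape is typically used to find candidate terms for a misspelled or partial query: split the query into k-grams, look each one up, and rank the terms that share the most k-grams.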