TokenCollectionExtension extension

Extension methods on a collection of Token objects.

on Iterable<Token>

Properties

allTerms List<String>

Available on Iterable<Token>, provided by the TokenCollectionExtension extension

Returns a list of all the terms from the collection of Tokens, in the same order as they occur in the text.
no setter

terms Set<String>

Available on Iterable<Token>, provided by the TokenCollectionExtension extension

Returns the set of unique terms from the collection of Tokens.
no setter
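The difference between the two properties is easiest to see on a collection with a repeated term. The following is a minimal sketch, not the package's implementation: the `Token` class is a hypothetical stand-in that models only `term` and `termPosition`, the `allTermsSketch` and `termsSketch` names are made up for illustration, and sorting by `termPosition` is an assumption about how text order is determined.

```dart
// Hypothetical stand-in for the package's Token; only the two fields
// referenced on this page are modelled.
class Token {
  final String term;
  final int termPosition;
  const Token(this.term, this.termPosition);
}

// Sketch of the two properties, not the package's own implementation.
extension TokenTermsSketch on Iterable<Token> {
  /// Every term, duplicates included, ordered by position in the text.
  /// Sorting by termPosition is an assumption of this sketch.
  List<String> get allTermsSketch {
    final sorted = toList()
      ..sort((a, b) => a.termPosition.compareTo(b.termPosition));
    return sorted.map((t) => t.term).toList();
  }

  /// Each distinct term exactly once.
  Set<String> get termsSketch => map((t) => t.term).toSet();
}

void main() {
  const tokens = [
    Token('the', 0),
    Token('cat', 1),
    Token('saw', 2),
    Token('the', 3),
    Token('dog', 4),
  ];
  print(tokens.allTermsSketch); // [the, cat, saw, the, dog]
  print(tokens.termsSketch); // {the, cat, saw, dog}
}
```

The list keeps both occurrences of "the", while the set collapses them into one.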

Methods

byTerm(String term) Iterable<Token>

Available on Iterable<Token>, provided by the TokenCollectionExtension extension

Filters the collection for tokens with Token.term == term.

firstPosition(String term) int

Available on Iterable<Token>, provided by the TokenCollectionExtension extension

Returns the lowest Token.termPosition where Token.term == term.

kGrams([int k = 2]) Map<String, Set<String>>

Available on Iterable<Token>, provided by the TokenCollectionExtension extension

Returns a map from each k-gram (of length k, default 2) to the set of terms in the collection that contain it.
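To illustrate the shape of the returned map, here is a minimal sketch of a k-gram index built from plain strings. It is not the package's implementation: `kGramsSketch` is a hypothetical helper, and it assumes k-grams are plain substrings of length k with no padding or start/end markers, which may differ from how the package builds them.

```dart
// Sketch only: maps each k-gram to the set of terms that contain it.
// Assumes plain substrings of length k with no padding or start/end
// markers; the package's own k-gram construction may differ.
Map<String, Set<String>> kGramsSketch(Iterable<String> terms, [int k = 2]) {
  final index = <String, Set<String>>{};
  for (final term in terms) {
    // Slide a window of width k across the term.
    for (var i = 0; i + k <= term.length; i++) {
      final gram = term.substring(i, i + k);
      index.putIfAbsent(gram, () => <String>{}).add(term);
    }
  }
  return index;
}

void main() {
  final index = kGramsSketch(['cat', 'cart', 'art']);
  print(index['ca']); // {cat, cart}
  print(index['ar']); // {cart, art}
  print(index['at']); // {cat}
}
```

An index of this shape is commonly used for fuzzy lookup: terms that share many k-grams with a misspelled query are likely candidates for a correction.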

lastPosition(String term) int

Available on Iterable<Token>, provided by the TokenCollectionExtension extension

Returns the highest Token.termPosition where Token.term == term.

termCount(String term) int

Available on Iterable<Token>, provided by the TokenCollectionExtension extension

Returns the number of tokens in the collection where Token.term == term.
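The four term-lookup methods (byTerm, firstPosition, lastPosition and termCount) all reduce to a filter on Token.term. The following is a minimal sketch under stated assumptions rather than the package's code: `Token` is a hypothetical stand-in class, the `...Sketch` method names are invented for illustration, term positions are assumed to be non-negative, and returning -1 for a term that does not occur is an assumption, since this page does not say what the real methods return in that case.

```dart
// Hypothetical stand-in for the package's Token.
class Token {
  final String term;
  final int termPosition;
  const Token(this.term, this.termPosition);
}

// Sketch of the lookup methods, not the package's own implementation.
extension TokenLookupSketch on Iterable<Token> {
  /// Tokens whose term equals [term].
  Iterable<Token> byTermSketch(String term) => where((t) => t.term == term);

  /// Number of tokens whose term equals [term].
  int termCountSketch(String term) => byTermSketch(term).length;

  /// Lowest position of [term]; -1 when absent (assumption of this sketch).
  int firstPositionSketch(String term) => byTermSketch(term).fold<int>(
      -1,
      (lowest, t) =>
          lowest < 0 || t.termPosition < lowest ? t.termPosition : lowest);

  /// Highest position of [term]; -1 when absent (assumption of this sketch).
  int lastPositionSketch(String term) => byTermSketch(term).fold<int>(
      -1, (highest, t) => t.termPosition > highest ? t.termPosition : highest);
}

void main() {
  const tokens = [
    Token('the', 0),
    Token('cat', 1),
    Token('saw', 2),
    Token('the', 3),
    Token('dog', 4),
  ];
  print(tokens.termCountSketch('the')); // 2
  print(tokens.firstPositionSketch('the')); // 0
  print(tokens.lastPositionSketch('the')); // 3
  print(tokens.firstPositionSketch('bird')); // -1
}
```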

toKeywordScores() Map<String, double>

Available on Iterable<Token>, provided by the TokenCollectionExtension extension

Returns a mapping of the terms in the collection to their RAKE (Rapid Automatic Keyword Extraction) scores.
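For readers unfamiliar with RAKE, the sketch below shows the standard word-scoring step of the algorithm: each word scores deg(w) / freq(w), where freq(w) counts its occurrences and deg(w) sums the lengths of the phrases it occurs in, and a phrase scores the sum of its words. This illustrates the general technique only. The function names `rakeWordScores` and `rakePhraseScore` are hypothetical, the input is assumed to be already split into candidate phrases, and the package's exact computation may differ.

```dart
// Sketch of standard RAKE word scoring: score(w) = deg(w) / freq(w).
// Not the package's implementation; input phrases are assumed to be
// candidate phrases already split into words.
Map<String, double> rakeWordScores(Iterable<List<String>> phrases) {
  final freq = <String, int>{};
  final degree = <String, int>{};
  for (final phrase in phrases) {
    for (final word in phrase) {
      freq[word] = (freq[word] ?? 0) + 1;
      // Each occurrence co-occurs with every word in its phrase,
      // itself included, so the degree grows by the phrase length.
      degree[word] = (degree[word] ?? 0) + phrase.length;
    }
  }
  return {for (final w in freq.keys) w: degree[w]! / freq[w]!};
}

// A phrase's score is the sum of the scores of its words.
double rakePhraseScore(List<String> phrase, Map<String, double> wordScores) =>
    phrase.fold<double>(0.0, (sum, w) => sum + (wordScores[w] ?? 0.0));

void main() {
  final phrases = [
    ['linear', 'constraints'],
    ['linear', 'diophantine', 'equations'],
    ['constraints'],
  ];
  final scores = rakeWordScores(phrases);
  print(scores['linear']); // 2.5
  print(scores['constraints']); // 1.5
  print(rakePhraseScore(phrases[1], scores)); // 8.5
}
```

Words that appear mostly inside long phrases score higher than words that often stand alone, which is why "diophantine" (3.0) outranks "constraints" (1.5) here.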

toPhrases() Set<List<String>>

Available on Iterable<Token>, provided by the TokenCollectionExtension extension

Returns the set of unique phrases from the terms in the collection, with each phrase represented as a List<String>.