TermSplitter typedef

TermSplitter = List<String> Function(SourceText source)

A splitter function that returns a list of terms from source.

Typically, source is split at punctuation marks and whitespace.

The term splitter should avoid splitting numbers, which may contain periods, and other punctuation-delimited phrases such as domain names, identifiers, and hyphenated words.
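As a rough illustration of the behavior described above, a splitter might match runs of letters and digits that are joined internally by a single period, underscore, or hyphen. This is only a sketch: `simpleTermSplitter` and its regular expression are hypothetical, it handles ASCII only, and `SourceText` is assumed here to be an alias for `String`.

```dart
// Assumption for this sketch: SourceText is an alias for String.
typedef SourceText = String;
typedef TermSplitter = List<String> Function(SourceText source);

// Hypothetical pattern: one or more alphanumeric runs, optionally joined by
// a single '.', '_' or '-' that sits between two alphanumeric characters.
// Leading and trailing punctuation is therefore never part of a term.
final RegExp _termPattern = RegExp(r'[A-Za-z0-9]+(?:[._\-][A-Za-z0-9]+)*');

// Returns the terms found in [source], keeping numbers, domain names,
// identifiers, and hyphenated words intact.
List<String> simpleTermSplitter(SourceText source) =>
    _termPattern.allMatches(source).map((m) => m.group(0)!).toList();

void main() {
  final TermSplitter splitter = simpleTermSplitter;
  // prints [Visit, example.com, pi, is, 3.14, a, well-known, value]
  print(splitter('Visit example.com: pi is 3.14, a well-known value.'));
}
```

The trailing period after "value" and the comma after "3.14" are dropped because neither is followed by an alphanumeric character.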

If the TermSplitter preserves punctuation-delimited phrases, the tokenizer that uses the TermSplitter can include both the preserved term and its components as separate tokens.
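One way a tokenizer could use a preserving splitter is sketched below. The names `preservingSplitter` and `tokenize` are hypothetical and are not part of the documented API, and `SourceText` is again assumed to be an alias for `String`. Each preserved term is emitted first, followed by its components when it contains an internal delimiter.

```dart
// Assumption for this sketch: SourceText is an alias for String.
typedef SourceText = String;
typedef TermSplitter = List<String> Function(SourceText source);

// Hypothetical patterns: a term is a run of alphanumerics, optionally joined
// by single '.', '_' or '-' characters; those same characters delimit its
// components.
final RegExp _term = RegExp(r'[A-Za-z0-9]+(?:[._\-][A-Za-z0-9]+)*');
final RegExp _delimiter = RegExp(r'[._\-]');

// A splitter that keeps punctuation-delimited phrases whole.
List<String> preservingSplitter(SourceText source) =>
    _term.allMatches(source).map((m) => m.group(0)!).toList();

// Emits every preserved term, then its components if it has more than one.
// Note: this simple version also breaks decimals such as 3.14 into 3 and 14;
// a real tokenizer may want to skip numeric terms.
List<String> tokenize(SourceText source, TermSplitter splitter) {
  final tokens = <String>[];
  for (final term in splitter(source)) {
    tokens.add(term);
    final parts = term.split(_delimiter);
    if (parts.length > 1) tokens.addAll(parts);
  }
  return tokens;
}

void main() {
  // prints [well-known, well, known, example.com, example, com]
  print(tokenize('well-known example.com', preservingSplitter));
}
```

Because the splitter is passed in as a `TermSplitter`, the tokenizer does not need to know how terms were found, only which delimiters to break them on.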

Implementation

typedef TermSplitter = List<String> Function(SourceText source);