udpipe/src/conllu_parser library

Classes

SepVerb
A German separable verb detected in a sentence (e.g. aussteigen, split into the particle aus and the finite verb form steigt).
UDToken
A single token from a CoNLL-U sentence produced by UDPipe.

Functions

findSepVerbs(List<UDToken> tokens) List<SepVerb>
Finds all separable (trennbar) verbs that appear split in a token list.
parseConllu(String conllu) List<UDToken>
Parses a single-sentence CoNLL-U string into a flat list of UDTokens.
parseConlluSentences(String conllu) List<({String text, List<UDToken> tokens})>
Parses a batch CoNLL-U string into per-sentence records.
tokensByFormAll(List<UDToken> tokens) Map<String, List<UDToken>>
Builds a map from surface form → all tokens with that form, in order.
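
The functions above can be pictured with a small self-contained sketch. It does not use this library: `Tok`, `parseSketch`, `byFormSketch` and `sepVerbLemmasSketch` are hypothetical stand-ins, since the fields of UDToken and SepVerb are not shown on this page. The sketch assumes the standard 10-column CoNLL-U layout and assumes that a separated particle is attached to its verb with the Universal Dependencies relation `compound:prt`; the library's actual detection logic may differ.

```dart
// Hypothetical stand-in for UDToken; the real class's fields are not documented here.
class Tok {
  final int id;
  final String form;
  final String lemma;
  final String upos;
  final int head;
  final String deprel;
  const Tok(this.id, this.form, this.lemma, this.upos, this.head, this.deprel);
}

/// Parses the token lines of one CoNLL-U sentence.
/// Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
List<Tok> parseSketch(String conllu) {
  final tokens = <Tok>[];
  for (final line in conllu.split('\n')) {
    // Skip blank lines and comment lines such as "# text = ...".
    if (line.trim().isEmpty || line.startsWith('#')) continue;
    final cols = line.split('\t');
    if (cols.length != 10) continue;
    // Non-integer IDs (multiword ranges like 1-2, empty nodes like 1.1) are skipped.
    final id = int.tryParse(cols[0]);
    if (id == null) continue;
    tokens.add(Tok(id, cols[1], cols[2], cols[3],
        int.tryParse(cols[6]) ?? 0, cols[7]));
  }
  return tokens;
}

/// Groups tokens by surface form, preserving sentence order within each list.
Map<String, List<Tok>> byFormSketch(List<Tok> tokens) {
  final map = <String, List<Tok>>{};
  for (final t in tokens) {
    map.putIfAbsent(t.form, () => []).add(t);
  }
  return map;
}

/// Rebuilds infinitive lemmas of split verbs as particle lemma + head verb lemma.
/// Assumption: the particle carries the deprel `compound:prt`.
List<String> sepVerbLemmasSketch(List<Tok> tokens) {
  final byId = {for (final t in tokens) t.id: t};
  final result = <String>[];
  for (final t in tokens) {
    if (t.deprel != 'compound:prt') continue;
    final verb = byId[t.head];
    if (verb != null) result.add('${t.lemma}${verb.lemma}');
  }
  return result;
}

// Hand-written sample sentence, not actual UDPipe output.
const sample = '# text = Er steigt aus.\n'
    '1\tEr\ter\tPRON\t_\t_\t2\tnsubj\t_\t_\n'
    '2\tsteigt\tsteigen\tVERB\t_\t_\t0\troot\t_\t_\n'
    '3\taus\taus\tADP\t_\t_\t2\tcompound:prt\t_\t_\n'
    '4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n';

void main() {
  final tokens = parseSketch(sample);
  print(tokens.length); // 4
  print(byFormSketch(tokens).keys.toList()); // [Er, steigt, aus, .]
  print(sepVerbLemmasSketch(tokens)); // [aussteigen]
}
```

Keeping every token per surface form, rather than only the first, matters when a form repeats in a sentence, which is presumably why tokensByFormAll returns lists.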