core/chat library
Classes
- InferenceChat
- StopTokenFilter
  Filters stop tokens from the model response stream. For .litertlm on iOS, MediaPipe doesn't handle <end_of_turn>, so this filter detects the stop token and terminates the stream there, buffering any trailing text that could be a partial tag match.
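
The buffering behaviour described for StopTokenFilter can be illustrated with a minimal sketch. This is not the library's actual code: the function name `filter_stop_token` and its signature are hypothetical, and Python is used only for illustration. It assumes the response arrives as a sequence of text chunks and that the stop token may be split across chunk boundaries.

```python
from typing import Iterable, Iterator

STOP_TOKEN = "<end_of_turn>"


def filter_stop_token(chunks: Iterable[str], stop: str = STOP_TOKEN) -> Iterator[str]:
    """Yield text from `chunks` up to, but not including, the first `stop`.

    Hypothetical helper for illustration; not part of the documented API.
    """
    buffer = ""  # held-back text that might be the start of the stop token
    for chunk in chunks:
        buffer += chunk
        index = buffer.find(stop)
        if index != -1:
            # Full stop token seen: emit what precedes it and end the stream.
            if index > 0:
                yield buffer[:index]
            return
        # Hold back the longest suffix that is a proper prefix of the stop token.
        hold = 0
        for size in range(min(len(buffer), len(stop) - 1), 0, -1):
            if stop.startswith(buffer[-size:]):
                hold = size
                break
        emit = buffer[:len(buffer) - hold]
        if emit:
            yield emit
        buffer = buffer[len(buffer) - hold:]
    # Stream ended without a stop token: flush whatever was held back.
    if buffer:
        yield buffer
```

Holding back only the suffix that could still grow into the stop token keeps latency low: ordinary text is forwarded immediately, and a lone `<` is released as soon as the next chunk shows it was not the start of the tag.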