Cl100kEncoder class

Approximate cl100k_base tokenizer using heuristic BPE-like splitting.

This is not a faithful BPE implementation, but it produces token counts that are typically within 5-10% of the real cl100k_base encoder for English prose and source code.
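To give a feel for what heuristic splitting can look like, here is a minimal sketch. It is not this class's actual algorithm: the helper name `approxTokenCount`, the regular expression, and the 6-character chunk size are all assumptions chosen for illustration, and it only handles ASCII letters and digits sensibly.

```dart
// Hypothetical helper, not part of the documented API.
// Assumption: one token per 6 letters is an arbitrary illustrative constant.
const int _chunkSize = 6;

// Matches runs of ASCII letters, digit runs of up to three, or any single
// non-space symbol. Whitespace is skipped entirely.
final RegExp _pieces = RegExp(r'[A-Za-z]+|[0-9]{1,3}|[^\sA-Za-z0-9]');
final RegExp _startsWithLetter = RegExp(r'^[A-Za-z]');

int approxTokenCount(String text) {
  var total = 0;
  for (final match in _pieces.allMatches(text)) {
    final piece = match.group(0)!;
    if (_startsWithLetter.hasMatch(piece)) {
      // Long words are assumed to break into sub-word chunks (ceiling division).
      total += (piece.length + _chunkSize - 1) ~/ _chunkSize;
    } else {
      // Each digit group or symbol is counted as one token.
      total += 1;
    }
  }
  return total;
}

void main() {
  print(approxTokenCount('Hello, world!')); // prints 4
}
```

A real byte-pair encoder merges byte pairs according to a learned vocabulary, so a rule-based estimate like this one can only approximate its counts.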

Constructors

Cl100kEncoder()
Factory constructor returning the shared instance.
factory

Properties

hashCode → int
The hash code for this object.
no setter, inherited
runtimeType → Type
A representation of the runtime type of the object.
no setter, inherited

Methods

count(String text) → int
Returns the number of tokens in text.
override
decode(List<int> tokens) → String
Decodes a list of tokens back into text.
override
encode(String text) → List<int>
Encodes text into a list of token IDs.
override
noSuchMethod(Invocation invocation) → dynamic
Invoked when a nonexistent method or property is accessed.
inherited
toString() → String
A string representation of this object.
inherited

Operators

operator ==(Object other) → bool
The equality operator.
inherited

Static Properties

instance → Cl100kEncoder
Singleton instance.
final
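
The factory constructor and the static `instance` property together describe a singleton. The sketch below shows one common way to write that pattern in Dart. `DemoEncoder` is a hypothetical stand-in class, and the private `_()` constructor is an assumption about how such a class might be built, not a statement about how Cl100kEncoder is implemented.

```dart
// Hypothetical stand-in class; only the singleton shape mirrors the docs above.
class DemoEncoder {
  // Private named constructor, so code outside this library cannot create
  // additional instances.
  DemoEncoder._();

  /// The single shared instance, created lazily on first access.
  static final DemoEncoder instance = DemoEncoder._();

  /// Unnamed factory constructor that returns the shared instance.
  factory DemoEncoder() => instance;
}

void main() {
  final a = DemoEncoder();
  final b = DemoEncoder.instance;
  print(identical(a, b)); // prints "true"
}
```

With this shape, calling the constructor and reading the static property yield the same object, so callers can use whichever form reads better.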