Cl100kEncoder class
Approximate cl100k_base tokenizer using heuristic BPE-like splitting.
This is not a faithful BPE implementation, but it produces token counts that are typically within 5-10% of the real cl100k_base encoder for English prose and source code.
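To make "heuristic BPE-like splitting" concrete, here is a minimal sketch of one way such an estimate could be computed. It is not this class's actual implementation: the function `approximateTokenCount`, the pattern, and the `_lettersPerToken` constant are all hypothetical, and the only detail borrowed from cl100k_base is that digit runs are split into groups of at most three.

```dart
// Hypothetical sketch, NOT the implementation behind Cl100kEncoder.
// Splits text into word-like pieces (an optional leading space stays with
// the following word), then estimates a token count for each piece.
final RegExp _piecePattern =
    RegExp(r' ?[A-Za-z]+| ?[0-9]+| ?[^\sA-Za-z0-9]+|\s+');

// Assumed average number of ASCII letters per token; an illustrative guess.
const int _lettersPerToken = 6;

bool _isDigit(int c) => c >= 0x30 && c <= 0x39;

bool _isLetter(int c) =>
    (c >= 0x41 && c <= 0x5A) || (c >= 0x61 && c <= 0x7A);

int approximateTokenCount(String text) {
  var total = 0;
  for (final match in _piecePattern.allMatches(text)) {
    final body = match.group(0)!.trimLeft();
    if (body.isEmpty) {
      // A run of whitespace is counted as a single token.
      total += 1;
      continue;
    }
    final first = body.codeUnitAt(0);
    if (_isDigit(first)) {
      // Digit runs: one token per group of up to three digits.
      total += (body.length + 2) ~/ 3;
    } else if (_isLetter(first)) {
      // ASCII words: roughly one token per _lettersPerToken letters.
      total += (body.length + _lettersPerToken - 1) ~/ _lettersPerToken;
    } else {
      // Punctuation and non-ASCII runs: crudely counted as one token.
      total += 1;
    }
  }
  return total;
}
```

A sketch like this only illustrates the general idea of trading exactness for speed and a small footprint. Non-ASCII text in particular is handled very coarsely here, so it should not be read as evidence for the accuracy figure quoted above.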
Inheritance
- Object
- TokenEncoder
- Cl100kEncoder
Constructors
- Cl100kEncoder() (factory)
  Factory constructor returning the shared instance.
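A factory constructor that returns a shared instance is usually written in Dart along the following lines. This is a sketch of the general pattern only: the class name `SharedEncoder` and the private constructor `_internal` are hypothetical, and the real class may be structured differently.

```dart
// Hypothetical class illustrating the singleton-via-factory pattern.
class SharedEncoder {
  // Private named constructor: code outside this library cannot create
  // additional instances directly.
  SharedEncoder._internal();

  // The single shared instance, created lazily on first access.
  static final SharedEncoder instance = SharedEncoder._internal();

  // The unnamed factory constructor hands back the shared instance
  // instead of allocating a new object.
  factory SharedEncoder() => instance;
}
```

With this pattern, `SharedEncoder()` and `SharedEncoder.instance` refer to the same object, so callers can use whichever form reads better without paying for repeated construction.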
Properties
- hashCode → int (no setter, inherited)
  The hash code for this object.
- runtimeType → Type (no setter, inherited)
  A representation of the runtime type of the object.
Methods
- count(String text) → int (override)
  Returns the number of tokens in text.
- decode(List<int> tokens) → String (override)
  Decodes a list of tokens back into text.
- encode(String text) → List<int> (override)
  Encodes text into a list of token IDs.
- noSuchMethod(Invocation invocation) → dynamic (inherited)
  Invoked when a nonexistent method or property is accessed.
- toString() → String (inherited)
  A string representation of this object.
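The three overridden methods can be combined to keep text within a token budget. The sketch below is written against an interface that mirrors the signatures listed above. The `TokenEncoder` declaration shown here, the `RuneEncoder` stand-in (one token per Unicode code point), and the helper `truncateToBudget` are all assumptions made so the example is self-contained; in real code you would import the package's own `TokenEncoder` rather than declare one.

```dart
// Assumed shape of the TokenEncoder interface, inferred from the
// overridden methods documented above.
abstract class TokenEncoder {
  int count(String text);
  List<int> encode(String text);
  String decode(List<int> tokens);
}

// Hypothetical stand-in encoder: one token per Unicode code point.
class RuneEncoder implements TokenEncoder {
  @override
  int count(String text) => text.runes.length;

  @override
  List<int> encode(String text) => text.runes.toList();

  @override
  String decode(List<int> tokens) => String.fromCharCodes(tokens);
}

// Hypothetical helper: returns text unchanged if it fits in maxTokens,
// otherwise decodes only the first maxTokens token IDs.
String truncateToBudget(TokenEncoder encoder, String text, int maxTokens) {
  if (maxTokens <= 0) return '';
  if (encoder.count(text) <= maxTokens) return text;
  // take() is safe even if the encoder yields fewer tokens than maxTokens.
  final kept = encoder.encode(text).take(maxTokens).toList();
  return encoder.decode(kept);
}
```

Because the helper depends only on the interface, an encoder such as `Cl100kEncoder.instance` could be passed in place of the stand-in. Whether decoding a truncated token list from this approximate encoder reproduces an exact prefix of the input is not stated in this page and would need to be checked.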
Operators
- operator ==(Object other) → bool (inherited)
  The equality operator.
Static Properties
- instance → Cl100kEncoder (final)
  Singleton instance.