TiktokenEncoder class

Low-level Tiktoken encoder/decoder. It exposes more detailed APIs for processing text at the token level.

Constructors

TiktokenEncoder({required String name, required String patternStr, required Map<ByteArray, int> mergeableRanks, required Map<String, int> specialTokens, int? explicitNVocab})
Instead of using this constructor, consider using the static helper functions Tiktoken.getEncoder and Tiktoken.getEncoderForModel.
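The explicitNVocab parameter is documented as a consistency check on the sizes of the two token maps. A minimal sketch of such a check follows. It uses plain `Map<String, int>` in place of `Map<ByteArray, int>`, and the helper name `checkVocabSize` is hypothetical; this is an illustration of the documented rule, not the package's own code.

```dart
/// Hypothetical helper, not part of the package: mirrors the documented
/// explicitNVocab rule using plain maps.
void checkVocabSize({
  required Map<String, int> mergeableRanks, // stand-in for Map<ByteArray, int>
  required Map<String, int> specialTokens,
  int? explicitNVocab,
}) {
  // Nothing to verify when no explicit size was given.
  if (explicitNVocab == null) return;

  final actual = mergeableRanks.length + specialTokens.length;
  if (actual != explicitNVocab) {
    throw ArgumentError(
      'Expected $explicitNVocab tokens, found $actual '
      '(${mergeableRanks.length} mergeable + ${specialTokens.length} special).',
    );
  }
}

void main() {
  final ranks = {'a': 0, 'b': 1, 'ab': 2};
  final special = {'<|endoftext|>': 3};

  // Three mergeable tokens plus one special token: passes for 4.
  checkVocabSize(
    mergeableRanks: ranks,
    specialTokens: special,
    explicitNVocab: 4,
  );
  print('vocabulary size is consistent');
}
```

Passing a mismatched explicitNVocab to this sketch throws an ArgumentError; how the real constructor reports a mismatch is not stated here.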

Properties

eotToken int?
no setter
explicitNVocab int?
The expected number of tokens in the vocabulary. If provided, the combined count of mergeable tokens and special tokens is checked against this number.
final
hashCode int
The hash code for this object.
no setter, inherited
maxTokenValue int
late, final
mergeableRanks Map<ByteArray, int>
A dictionary mapping mergeable token bytes to their ranks. The ranks must correspond to merge priority.
final
name String
The name of the encoding.
final
patternStr String
A regex pattern string that is used to split the input text.
final
runtimeType Type
A representation of the runtime type of the object.
no setter, inherited
specialTokens Map<String, int>
A dictionary mapping special token strings to their token values.
final
specialTokensSet Set<String>
The set of special token keys, taken from specialTokens.
late, final
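To show how patternStr and mergeableRanks work together, here is a toy sketch of the split-then-merge idea. It relies on several assumptions: pieces are plain strings rather than byte sequences, the GPT-2 style pattern is only an example value, a lower rank is taken to mean higher merge priority (as in the reference tiktoken implementation), and the names `gpt2Pattern`, `splitPieces`, and `bpeMerge` are hypothetical. The package's real encoder is not shown here.

```dart
// Example split pattern (GPT-2 style); any patternStr could be used instead.
const gpt2Pattern =
    r"'s|'t|'re|'ve|'m|'ll|'d| ?\p{L}+| ?\p{N}+| ?[^\s\p{L}\p{N}]+|\s+(?!\S)|\s+";

/// Splits [text] into the pieces matched by [patternStr].
List<String> splitPieces(String text, String patternStr) {
  // unicode: true is required for the \p{...} property escapes.
  final pattern = RegExp(patternStr, unicode: true);
  return pattern.allMatches(text).map((m) => m.group(0)!).toList();
}

/// Toy byte-pair merge over characters: repeatedly joins the adjacent
/// pair whose concatenation has the lowest rank.
List<String> bpeMerge(String piece, Map<String, int> ranks) {
  final parts = piece.split('').toList(); // one part per character
  while (parts.length > 1) {
    int? bestIndex;
    int? bestRank;
    for (var i = 0; i < parts.length - 1; i++) {
      final rank = ranks[parts[i] + parts[i + 1]];
      if (rank != null && (bestRank == null || rank < bestRank)) {
        bestRank = rank;
        bestIndex = i;
      }
    }
    if (bestIndex == null) break; // no mergeable pair left
    parts[bestIndex] = parts[bestIndex] + parts[bestIndex + 1];
    parts.removeAt(bestIndex + 1);
  }
  return parts;
}

void main() {
  // Tiny made-up vocabulary; the rank doubles as the token value.
  const ranks = {'l': 0, 'o': 1, 'w': 2, ' ': 3, 'lo': 4, 'low': 5, ' low': 6};

  for (final piece in splitPieces('low low', gpt2Pattern)) {
    final tokens = bpeMerge(piece, ranks);
    final ids = tokens.map((t) => ranks[t]).toList();
    print('"$piece" -> $tokens -> $ids');
  }
}
```

Splitting first keeps merges from crossing piece boundaries, which is why the pattern and the ranks are supplied together.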

Methods

decode(List<int> tokens, {bool allowMalformed = true}) String
Decodes a list of tokens into a string.
decodeBytes(List<int> tokens) Uint8List
Decodes a list of tokens into bytes.
decodeSingleTokenBytes(int token) Uint8List
Decodes a token into bytes.
decodeTokenBytes(List<int> tokens) List<Uint8List>
Decodes a list of tokens into a list of byte sequences, one per token.
encode(String text, {SpecialTokensSet allowedSpecial = const SpecialTokensSet.empty(), SpecialTokensSet disallowedSpecial = const SpecialTokensSet.all()}) Uint32List
Encodes a string into tokens.
encodeOrdinary(String text) Uint32List
Encodes a string into tokens, ignoring special tokens.
encodeSingleToken(List<int> bytes) int
Encodes the bytes corresponding to a single token to its token value.
encodeWithUnstable(String text, {SpecialTokensSet allowedSpecial = const SpecialTokensSet.empty(), SpecialTokensSet disallowedSpecial = const SpecialTokensSet.all()}) (List<int>, Set<List<int>>)
Encodes a string into stable tokens and possible completion sequences.
noSuchMethod(Invocation invocation) dynamic
Invoked when a nonexistent method or property is accessed.
inherited
tokenByteValues() List<Uint8List>
Returns sortedTokenBytes from the underlying CoreBPE tokenizer.
toString() String
A string representation of this object.
inherited
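The allowMalformed flag on decode matters because token boundaries do not always fall on UTF-8 character boundaries. The sketch below shows the difference using `utf8.decode` from `dart:convert`. It assumes decode behaves like that function once the token bytes are concatenated, which this page does not confirm; the helper `bytesToText` is hypothetical.

```dart
import 'dart:convert';

/// Hypothetical stand-in for the final step of decoding: turning the
/// concatenated token bytes into a String.
String bytesToText(List<int> bytes, {bool allowMalformed = true}) =>
    utf8.decode(bytes, allowMalformed: allowMalformed);

void main() {
  // "é" is 0xC3 0xA9 in UTF-8. Dropping the second byte leaves an
  // incomplete sequence, as can happen when a token list is cut short.
  final truncated = [0x68, 0x69, 0xC3]; // "hi" + first byte of "é"

  // Lenient mode substitutes U+FFFD for the incomplete sequence.
  final lenient = bytesToText(truncated);
  print(lenient.runes.map((r) => r.toRadixString(16)).toList());

  // Strict mode rejects the same bytes.
  try {
    bytesToText(truncated, allowMalformed: false);
  } on FormatException catch (e) {
    print('strict decode failed: ${e.message}');
  }
}
```

When exact bytes are needed, for example to reassemble text across chunk boundaries, decodeBytes avoids the question by returning the raw bytes.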

Operators

operator ==(Object other) bool
The equality operator.
inherited