BPE
bpe provides fast BPE tokenizers for cl100k_base and o200k_base, along with chunking helpers for working with large strings and Stream<String> inputs.
It is useful when you want to:
- encode text into token IDs
- decode token IDs back into text
- estimate token usage for large documents without processing the whole document as one giant string
- split text into cleaner chunks that prefer natural boundaries like newlines, spaces, and periods
Available Tokenizers
- CL100kBaseBPETokenizer()
- O200kBaseBPETokenizer()
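Both constructors produce a BPETokenizer (as the usage examples below show), so the two encodings can be swapped behind the same variable. A minimal sketch comparing token counts across encodings:

```dart
import 'package:bpe/bpe.dart';

void main() {
  // Both tokenizers share the BPETokenizer interface, so calling
  // code can stay agnostic about which encoding is in use.
  List<BPETokenizer> tokenizers = [
    CL100kBaseBPETokenizer(),
    O200kBaseBPETokenizer(),
  ];

  String text = 'The same text can tokenize differently per encoding.';
  for (BPETokenizer tokenizer in tokenizers) {
    print(tokenizer.encode(text).length);
  }
}
```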
Basic Usage
import 'package:bpe/bpe.dart';

void main() {
  BPETokenizer tokenizer = O200kBaseBPETokenizer();
  String text = 'Hello from bpe.';

  List<int> tokens = tokenizer.encode(text);
  String decoded = tokenizer.decode(tokens);

  print(tokens);
  print(decoded);
}
Estimating Tokens For Large Text
estimateTokens() is the easiest way to get a stream-based estimate for a large string.
Instead of treating the whole input as a single block, it chunks the text first and then estimates from those chunks.
import 'package:bpe/bpe.dart';

Future<void> main() async {
  BPETokenizer tokenizer = CL100kBaseBPETokenizer();
  String document = 'Very large text here...' * 5000;

  int estimatedTokens = await tokenizer.estimateTokens(document);
  print('Estimated tokens: $estimatedTokens');
}
This is a good fit for:
- prompt budgeting
- long articles
- books
- logs
- scraped content
- ingestion pipelines
Estimating Tokens From A Stream
If your text already arrives in pieces, use estimateTokensStream().
import 'package:bpe/bpe.dart';

Future<void> main() async {
  BPETokenizer tokenizer = O200kBaseBPETokenizer();
  Stream<String> source = Stream<String>.fromIterable([
    'Page one content.\n',
    'Page two content.\n',
    'Page three content.\n',
  ]);

  int estimatedTokens = await tokenizer.estimateTokensStream(source);
  print('Estimated tokens: $estimatedTokens');
}
This helps when reading from:
- files
- sockets
- HTTP streams
- paginated fetches
- chunked parsing pipelines
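For example, a large file can be fed straight into the estimator without ever holding its full contents in memory. A sketch using the standard dart:io and dart:convert libraries ('input.txt' is a placeholder path):

```dart
import 'dart:convert';
import 'dart:io';

import 'package:bpe/bpe.dart';

Future<void> main() async {
  BPETokenizer tokenizer = O200kBaseBPETokenizer();

  // openRead() yields the file as a byte stream; utf8.decoder turns
  // each byte chunk into a String, giving us a Stream<String>.
  Stream<String> source =
      File('input.txt').openRead().transform(utf8.decoder);

  int estimatedTokens = await tokenizer.estimateTokensStream(source);
  print('Estimated tokens: $estimatedTokens');
}
```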
Chunking Helpers
The package also exposes chunking extensions on both String and Stream<String>.
These helpers try to avoid ugly hard splits by preferring boundaries in this order by default:
- newline
- space
- period
String.chunk()
Use this when you already have a full string and want a Stream<String> of cleaner chunks.
import 'package:bpe/bpe.dart';

Future<void> main() async {
  String text = '''
This is a long paragraph.
It has multiple lines and sentences.
Chunking tries to cut at natural boundaries when it can.
''';

  await for (String chunk in text.chunk(size: 40, grace: 20)) {
    print('---');
    print(chunk);
  }
}
Stream<String>.cleanChunks()
Use this when your incoming stream is already fragmented in awkward places and you want to normalize it into cleaner output chunks.
import 'package:bpe/bpe.dart';

Future<void> main() async {
  Stream<String> messyStream = Stream<String>.fromIterable([
    'Hello wo',
    'rld. This is a str',
    'eam that was split badly.\nNext li',
    'ne starts here.',
  ]);

  await for (String chunk in messyStream.cleanChunks(size: 30, grace: 10)) {
    print('---');
    print(chunk);
  }
}
This is especially useful when upstream data is split arbitrarily and you want chunk boundaries that are more readable and safer for token estimation.
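Because cleanChunks() returns a Stream<String>, it composes directly with estimateTokensStream(). A sketch of repairing a fragmented stream before estimating:

```dart
import 'package:bpe/bpe.dart';

Future<void> main() async {
  BPETokenizer tokenizer = CL100kBaseBPETokenizer();

  Stream<String> messyStream = Stream<String>.fromIterable([
    'Hello wo',
    'rld. This is a str',
    'eam that was split badly.',
  ]);

  // Normalize the fragments first, then estimate from the cleaner chunks.
  Stream<String> repaired = messyStream.cleanChunks(size: 30, grace: 10);
  int estimatedTokens = await tokenizer.estimateTokensStream(repaired);

  print('Estimated tokens: $estimatedTokens');
}
```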
Chunking Options
The chunking APIs share the same knobs:
- size: the preferred chunk size
- grace: extra room allowed while waiting for a better split point
- splitPriority: the preferred split markers, in order
Example:
import 'package:bpe/bpe.dart';

Future<void> main() async {
  String text = 'A very long string...' * 200;

  await for (String chunk in text.chunk(
    size: 1200,
    grace: 300,
    splitPriority: ['\n', ' ', '.', ','],
  )) {
    print(chunk.length);
  }
}
When To Use Which API
- Use encode() when you need the actual token IDs.
- Use decode() when you need the text back from token IDs.
- Use estimateTokens() when you already have one large string.
- Use estimateTokensStream() when text arrives over time as a stream.
- Use String.chunk() when you want clean chunk boundaries from one string.
- Use Stream<String>.cleanChunks() when you want to repair or normalize a fragmented text stream before further processing.
Libraries
- bpe
- tiktoken/src/common/byte_array
- tiktoken/src/common/special_tokens_set
- tiktoken/src/common/utils
- tiktoken/src/core_bpe
- tiktoken/src/core_bpe_constructor
- tiktoken/src/error/tiktoken_error
- tiktoken/src/ranks/cl100k_base.tiktoken
- tiktoken/src/ranks/cl100k_base/cl100k_base_1.g
- tiktoken/src/ranks/cl100k_base/cl100k_base_2.g
- tiktoken/src/ranks/cl100k_base/cl100k_base_3.g
- tiktoken/src/ranks/cl100k_base/cl100k_base_4.g
- tiktoken/src/ranks/cl100k_base/cl100k_base_5.g
- tiktoken/src/ranks/cl100k_base/cl100k_base_6.g
- tiktoken/src/ranks/cl100k_base/cl100k_base_7.g
- tiktoken/src/ranks/cl100k_base/cl100k_base_8.g
- tiktoken/src/ranks/cl100k_base/cl100k_base_9.g
- tiktoken/src/ranks/cl100k_base/cl100k_base_10.g
- tiktoken/src/ranks/cl100k_base/cl100k_base_11.g
- tiktoken/src/ranks/index
- tiktoken/src/ranks/o200k_base.tiktoken
- tiktoken/src/ranks/o200k_base/o200k_base_1
- tiktoken/src/ranks/o200k_base/o200k_base_2
- tiktoken/src/ranks/o200k_base/o200k_base_3
- tiktoken/src/ranks/o200k_base/o200k_base_4
- tiktoken/src/ranks/o200k_base/o200k_base_5
- tiktoken/src/ranks/o200k_base/o200k_base_6
- tiktoken/src/ranks/o200k_base/o200k_base_7
- tiktoken/src/ranks/o200k_base/o200k_base_8
- tiktoken/src/ranks/o200k_base/o200k_base_9
- tiktoken/src/ranks/o200k_base/o200k_base_10
- tiktoken/src/ranks/o200k_base/o200k_base_11
- tiktoken/src/ranks/o200k_base/o200k_base_12
- tiktoken/src/ranks/o200k_base/o200k_base_13
- tiktoken/src/ranks/o200k_base/o200k_base_14
- tiktoken/src/ranks/o200k_base/o200k_base_15
- tiktoken/src/ranks/o200k_base/o200k_base_16
- tiktoken/src/ranks/o200k_base/o200k_base_17
- tiktoken/src/ranks/o200k_base/o200k_base_18
- tiktoken/src/ranks/o200k_base/o200k_base_19
- tiktoken/src/ranks/o200k_base/o200k_base_20
- tiktoken/src/tiktoken_encoder
- tiktoken/src/word_counter
- tiktoken/tiktoken_tokenizer_gpt4o_o1