token_counter 0.3.0

Token counter for popular LLMs (OpenAI GPT, Anthropic Claude, Google Gemini, Meta Llama) with multi-language support. Pure Dart; works on all Flutter platforms and the Dart VM.

0.3.0

  • Added SentencePiece unigram-LM tokenizer in pure Dart.
  • New SpProtoReader — minimal protobuf decoder for .model files (no generated code, no native dependencies).
  • New SpUnigramEncoder — Viterbi forward-pass segmentation.
  • New SpVocabLoader abstract class + BytesSpVocabLoader implementation.
  • New TokenCounter.loadSpVocab(SpVocabLoader) — switches the counter to exact SentencePiece mode for gemini, llama, and claude families.
  • TokenCounter.isExact now returns true for both tiktoken BPE and SentencePiece modes.
  • 8 new unit tests (proto parsing, Viterbi segmentation, loadSpVocab API).
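
As an illustration of the Viterbi segmentation mentioned above, here is a minimal sketch of how a unigram-LM tokenizer can pick the highest-scoring split of a string. It is not the package's SpUnigramEncoder: the function name viterbiSegment, its parameters, and the toy vocabulary with its log-probabilities are all made up for this example.

```dart
// Toy illustration only: NOT the package's SpUnigramEncoder.
// The vocabulary pieces and log-probabilities below are invented.

/// Returns the highest-scoring segmentation of [text] into pieces of [vocab].
/// A character not covered by any piece falls back to a single-character
/// piece scored with [unknownLogProb].
List<String> viterbiSegment(
  String text,
  Map<String, double> vocab, {
  double unknownLogProb = -20.0,
  int maxPieceLength = 8,
}) {
  final chars = text.runes.map((r) => String.fromCharCode(r)).toList();
  final n = chars.length;
  // best[i] = best total log-probability of any segmentation of chars[0..i).
  final best = List<double>.filled(n + 1, double.negativeInfinity);
  // back[i] = start index of the last piece in that best segmentation.
  final back = List<int>.filled(n + 1, 0);
  best[0] = 0.0;

  // Forward pass: extend every reachable prefix by one candidate piece.
  for (var end = 1; end <= n; end++) {
    final minStart = end > maxPieceLength ? end - maxPieceLength : 0;
    for (var start = minStart; start < end; start++) {
      final piece = chars.sublist(start, end).join();
      final score =
          vocab[piece] ?? (end - start == 1 ? unknownLogProb : null);
      if (score == null) continue; // multi-character piece not in vocab
      final candidate = best[start] + score;
      if (candidate > best[end]) {
        best[end] = candidate;
        back[end] = start;
      }
    }
  }

  // Walk the back-pointers from the end to recover the pieces.
  final pieces = <String>[];
  for (var end = n; end > 0; end = back[end]) {
    pieces.add(chars.sublist(back[end], end).join());
  }
  return pieces.reversed.toList();
}

void main() {
  const vocab = <String, double>{
    '▁hello': -4.0,
    '▁he': -5.0,
    'llo': -6.0,
    '▁world': -4.5,
  };
  // SentencePiece-style convention: word boundaries are marked with U+2581.
  final input = '▁' + 'hello world'.replaceAll(' ', '▁');
  final pieces = viterbiSegment(input, vocab);
  print(pieces); // [▁hello, ▁world]
  print(pieces.length); // 2
}
```

The token count is simply the number of pieces returned. A real SentencePiece model also applies text normalization and may use byte fallback for unknown characters, both of which this sketch leaves out.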

0.2.0

  • Added exact tiktoken BPE encoder for cl100k_base (GPT-4, GPT-3.5-turbo) and o200k_base (GPT-4o, GPT-4.1, o-series) vocabulary families.
  • New TokenCounter.loadVocab(TiktokenVocabLoader) — supply the raw .tiktoken file bytes from any source (Flutter assets, local file, in-memory bytes) to switch the counter into exact mode.
  • New TiktokenVocabLoader abstract class + BytesVocabLoader implementation.
  • New TiktokenVocabParser for parsing the .tiktoken line format.
  • New TiktokenSpecialTokens constants for cl100k_base and o200k_base.
  • New TokenCounter.isExact property distinguishes heuristic mode from exact BPE mode.
  • 13 new unit tests for the BPE algorithm (merge correctness, special tokens, pre-tokenization, loadVocab API).
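
To make the .tiktoken line format and the BPE merge step more concrete, the following is a rough sketch under stated assumptions. It does not use the package's TiktokenVocabParser or encoder; parseTiktoken and bpeEncode are hypothetical names, and the five-token vocabulary is invented. It relies only on the publicly known file layout (one base64-encoded token followed by its integer rank per line) and skips regex pre-tokenization and special tokens entirely.

```dart
import 'dart:convert';

// Illustrative sketch, NOT the package's TiktokenVocabParser or encoder.

/// Parses `.tiktoken` text: one "<base64 token bytes> <rank>" pair per line.
/// Token bytes are rendered as Latin-1 strings so they can serve as map keys.
Map<String, int> parseTiktoken(String contents) {
  final ranks = <String, int>{};
  for (final line in const LineSplitter().convert(contents)) {
    final trimmed = line.trim();
    if (trimmed.isEmpty) continue;
    final parts = trimmed.split(' ');
    ranks[latin1.decode(base64.decode(parts[0]))] = int.parse(parts[1]);
  }
  return ranks;
}

/// Greedy BPE over one pre-tokenized chunk: repeatedly merge the adjacent
/// pair whose concatenation has the lowest rank, until no pair is mergeable.
/// Assumes every single byte of the input is itself present in [ranks].
List<int> bpeEncode(String chunk, Map<String, int> ranks) {
  final parts =
      utf8.encode(chunk).map((b) => String.fromCharCode(b)).toList();
  while (parts.length > 1) {
    int? bestIndex;
    int? bestRank;
    for (var i = 0; i < parts.length - 1; i++) {
      final rank = ranks[parts[i] + parts[i + 1]];
      if (rank != null && (bestRank == null || rank < bestRank)) {
        bestRank = rank;
        bestIndex = i;
      }
    }
    if (bestIndex == null) break; // nothing left to merge
    parts[bestIndex] = parts[bestIndex] + parts[bestIndex + 1];
    parts.removeAt(bestIndex + 1);
  }
  return [for (final p in parts) ranks[p]!];
}

void main() {
  // Build a tiny made-up vocabulary in .tiktoken format.
  final tokens = ['a', 'b', 'c', 'ab', 'abc'];
  final contents = [
    for (var i = 0; i < tokens.length; i++)
      '${base64.encode(utf8.encode(tokens[i]))} $i',
  ].join('\n');

  final ranks = parseTiktoken(contents);
  print(bpeEncode('abcab', ranks)); // [4, 3]
}
```

In the example, "abcab" first merges both "ab" pairs (rank 3), then merges "ab" + "c" into "abc" (rank 4), leaving two tokens.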

0.1.1

  • Rewrote README in English with accurate, runnable code examples.
  • Updated roadmap to mark v0.1 as complete.

0.1.0

Initial public release.

  • Pure-Dart heuristic token estimator for OpenAI (GPT-4o / GPT-4 / o-series), Anthropic Claude 3–4, Google Gemini 1.5 / 2, and Meta Llama 3 / 3.1 / 3.3.
  • Unicode script classifier covering Latin, CJK, Hiragana, Katakana, Hangul, Arabic, Cyrillic, Devanagari, Thai, emoji, and more.
  • TokenCounter.estimate, TokenCounter.forModel, countMessages, and estimateCost public API.
  • Bundled per-model pricing table for cost estimation.
  • 22 unit tests covering multilingual inputs, per-provider chat overhead, and cost math.
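
For readers curious how a script-based heuristic estimator might work in principle, here is a hypothetical sketch. The code-point ranges are standard Unicode blocks, but the Script enum, the classify and estimateTokens functions, and especially the characters-per-token ratios are placeholder assumptions for illustration, not the package's actual classifier or tuned values.

```dart
// Hypothetical sketch: NOT the package's estimator. The ratios below are
// placeholders (roughly "four Latin characters per token" as a common rule
// of thumb, one token per character elsewhere), not measured values.

enum Script { latin, cyrillic, cjk, kana, hangul, other }

/// Classifies a code point by the Unicode block it falls in.
Script classify(int rune) {
  if (rune >= 0x4E00 && rune <= 0x9FFF) return Script.cjk; // CJK Unified Ideographs
  if (rune >= 0x3040 && rune <= 0x30FF) return Script.kana; // Hiragana + Katakana
  if (rune >= 0xAC00 && rune <= 0xD7AF) return Script.hangul; // Hangul Syllables
  if (rune >= 0x0400 && rune <= 0x04FF) return Script.cyrillic; // Cyrillic
  if (rune <= 0x024F) return Script.latin; // ASCII through Latin Extended-B
  return Script.other;
}

/// Assumed average number of characters per token for each script.
const charsPerToken = <Script, double>{
  Script.latin: 4.0,
  Script.cyrillic: 2.0,
  Script.cjk: 1.0,
  Script.kana: 1.0,
  Script.hangul: 1.0,
  Script.other: 1.0,
};

/// Sums the fractional token cost of every code point and rounds up.
int estimateTokens(String text) {
  var total = 0.0;
  for (final rune in text.runes) {
    total += 1 / charsPerToken[classify(rune)]!;
  }
  return total.ceil();
}

void main() {
  print(estimateTokens('hello world')); // 3  (11 Latin chars / 4, rounded up)
  print(estimateTokens('日本語')); // 3  (one token per ideograph)
}
```

An estimate like this is only approximate; the exact modes added in later releases exist precisely because per-script ratios cannot reproduce a real tokenizer's output.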

Publisher

warashibetech.com (verified publisher)

Topics

#llm #tokenizer #openai #anthropic #gemini

License

unknown (license)
