utils/tokens/token_counter library
Token counting and estimation utilities ported from Neomage TypeScript.
Provides approximate tokenization compatible with cl100k_base (Neomage/GPT-4), token budgets, cost estimation, and context window management.
Classes
- Cl100kEncoder
- Approximate cl100k_base tokenizer using heuristic BPE-like splitting.
- ContextWindow
- Represents the token capacity of a model's context window.
- CostEstimate
- Estimated cost for a single API call.
- ModelPricing
- Per-token pricing for a single model in USD.
- ModelPricingTable
- Known model pricing constants (as of early 2025).
- TokenBudget
- Tracks a token budget with reservation support.
- TokenCounter
- High-level token counting, truncation, splitting, and cost estimation.
- TokenEncoder
- Abstract interface for text tokenizers.
Functions
- estimateTokens(String text) → int
- Quick heuristic token estimate: roughly 1 token per 4 characters for English text.
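
The 4-characters-per-token rule of thumb can be illustrated with a minimal sketch. The function name `approxTokens` is hypothetical and is not part of this library. Rounding up and returning 0 for empty input are assumptions, because the documentation above does not say how `estimateTokens` handles them.

```dart
/// Hypothetical stand-in illustrating the "1 token per 4 characters" rule.
/// This is not the library's estimateTokens; its exact rounding is not
/// documented here, so rounding up is an assumption.
int approxTokens(String text) {
  if (text.isEmpty) return 0;
  // String.length counts UTF-16 code units, which matches the character
  // count for plain English text.
  return (text.length / 4).ceil();
}

void main() {
  // 'Hello, world!' has 13 characters, so ceil(13 / 4) = 4.
  print(approxTokens('Hello, world!')); // prints 4
}
```

An estimate like this is cheap enough to call on every keystroke, but it drifts for code, non-English text, and whitespace-heavy input, where a tokenizer-based count such as the one Cl100kEncoder approximates is likely the better choice.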