dart_bert_tokenizer 1.0.2
dart_bert_tokenizer: ^1.0.2
A lightweight, pure Dart implementation of the BERT WordPiece tokenizer. 100% compatible with HuggingFace tokenizers.
Changelog
1.0.2
Added
- Project configuration files (.gitignore)
- Updated .pubignore for cleaner package distribution
1.0.1
Added
- Comprehensive dartdoc comments for all public APIs
- .pubignore for cleaner package distribution
1.0.0
- Initial release
- Pure Dart implementation of BERT WordPiece tokenizer
- 100% HuggingFace tokenizers compatibility
- Memory-efficient typed arrays (Int32List, Uint8List)
- Single text and sentence pair encoding
- Batch encoding (sequential and parallel with Isolates)
- Padding and truncation support
- Offset mapping (char-to-token, token-to-char, word-to-tokens)
- Vocabulary access and token conversion utilities
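For readers unfamiliar with the algorithm named in the 1.0.0 entry, the following is a minimal sketch of greedy longest-match-first WordPiece splitting for a single word. It is an illustration only and does not use or reproduce this package's API: the function name `wordPiece`, its parameters, and the toy vocabulary are all hypothetical. It also assumes the word has already been normalized and split on whitespace, and it omits the per-word length limit found in reference implementations.

```dart
// Hypothetical helper for illustration; NOT part of dart_bert_tokenizer.
// Splits one pre-normalized word into WordPiece tokens by repeatedly
// taking the longest vocabulary entry that matches at the current position.
List<String> wordPiece(
  String word,
  Set<String> vocab, {
  String unk = '[UNK]', // assumed unknown-token string
  String prefix = '##', // assumed continuation prefix
}) {
  final pieces = <String>[];
  var start = 0;
  while (start < word.length) {
    var end = word.length;
    String? match;
    // Shrink the candidate from the right until it is in the vocabulary.
    while (end > start) {
      final sub = word.substring(start, end);
      // Pieces that do not start the word carry the continuation prefix.
      final candidate = start == 0 ? sub : '$prefix$sub';
      if (vocab.contains(candidate)) {
        match = candidate;
        break;
      }
      end--;
    }
    // If nothing matches at this position, the whole word becomes [UNK].
    if (match == null) return [unk];
    pieces.add(match);
    start = end;
  }
  return pieces;
}

void main() {
  // Toy vocabulary, made up for this example.
  const vocab = {'un', '##aff', '##able', 'play', '##ing'};
  print(wordPiece('unaffable', vocab)); // [un, ##aff, ##able]
  print(wordPiece('playing', vocab)); // [play, ##ing]
  print(wordPiece('xyz', vocab)); // [[UNK]]
}
```

Note that `substring` here indexes UTF-16 code units, so the sketch is only safe for text without surrogate pairs; a full tokenizer would also handle normalization, punctuation splitting, and special tokens.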