bm25 2.2.1
A Dart implementation of the BM25 ranking algorithm for full-text search
Changelog
2.2.1
Bug Fixes
- Critical: Fixed ReceivePort memory leak that caused "no free native port" errors under heavy load
- _initPort is now properly closed after worker initialization
- Prevents resource exhaustion in long-running applications
- Critical: Fixed race condition between dispose() and search() operations
- Added tracking of active searches to ensure graceful shutdown
- dispose() now waits for in-flight searches to complete (with 5s timeout)
- Prevents "send on closed port" exceptions and hanging futures
- Added comprehensive test coverage for resource management and concurrent operations
Improvements
- Better lifecycle management with _isDisposed flag to prevent operations after disposal
- Multiple dispose() calls are now safe (idempotent)
- Enhanced error messages for disposal-related state errors
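
The disposal behaviour described for 2.2.1 can be pictured with a minimal sketch. It is not the package's code: DisposableSearcher, its fields, and the simulated 20 ms search are hypothetical stand-ins. Only the ideas come from the notes above: an _isDisposed flag, a count of active searches, and a bounded wait inside dispose().

```dart
import 'dart:async';

/// Hypothetical stand-in for a searcher backed by a worker isolate.
class DisposableSearcher {
  bool _isDisposed = false;
  int _activeSearches = 0;
  Completer<void>? _idle; // completed when the last in-flight search ends

  Future<List<String>> search(String query) async {
    if (_isDisposed) {
      throw StateError('search() called after dispose()');
    }
    _activeSearches++;
    try {
      // Placeholder for the real round trip to the worker isolate.
      await Future<void>.delayed(const Duration(milliseconds: 20));
      return ['result for $query'];
    } finally {
      _activeSearches--;
      if (_activeSearches == 0) _idle?.complete();
    }
  }

  Future<void> dispose({Duration grace = const Duration(seconds: 5)}) async {
    if (_isDisposed) return; // repeated calls are harmless
    _isDisposed = true; // from here on, new searches are rejected
    if (_activeSearches > 0) {
      _idle = Completer<void>();
      // Wait for in-flight searches, but give up after the grace period.
      await _idle!.future.timeout(grace, onTimeout: () {});
    }
    // A real implementation would close its ports and stop the isolate here.
  }
}
```

In this sketch a search started before dispose() still completes, a second dispose() returns immediately, and a search started afterwards fails with a StateError.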
2.1.0
New Features
- Ultra-fast BM25 implementation: Complete rewrite with significant performance improvements
- Cache-friendly design with gap-encoded postings in single Uint32List
- O(T) build time, O(#postings) query time with tight upper-bound loop
- Lock-free top-K selection using fixed-size min-heap
- Instance-scoped isolate for concurrent searches
- Native metadata filtering: Filter search results by arbitrary metadata fields
- Support for single value and multi-value filters
- Efficient field indexing for fast filtering
- Example: search('query', filter: {'filePath': 'docs/intro.md'})
- PartitionedBM25: New class for managing per-partition indices
- Create separate indices based on document attributes
- Search within specific partitions or across multiple partitions
- Ideal for large corpora with natural divisions (e.g., per-file indices)
- Improved document handling: BM25Document now includes metadata field
- Store arbitrary key-value pairs with documents
- Use metadata for filtering and partitioning
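
Two of the techniques named above, gap-encoded postings in a Uint32List and top-K selection with a fixed-size min-heap, can be illustrated with a short sketch. The function names encodeGaps, decodeGaps, and topK are hypothetical and the package's internal layout may differ; this only shows the general idea.

```dart
import 'dart:typed_data';

/// Stores sorted document IDs as gaps: the first ID, then successive differences.
Uint32List encodeGaps(List<int> sortedDocIds) {
  final gaps = Uint32List(sortedDocIds.length);
  var previous = 0;
  for (var i = 0; i < sortedDocIds.length; i++) {
    gaps[i] = sortedDocIds[i] - previous;
    previous = sortedDocIds[i];
  }
  return gaps;
}

/// Rebuilds the document IDs by summing the gaps.
List<int> decodeGaps(Uint32List gaps) {
  final ids = <int>[];
  var current = 0;
  for (final gap in gaps) {
    current += gap;
    ids.add(current);
  }
  return ids;
}

/// Returns the indices of the k highest scores, best first, keeping at most
/// k candidates in a min-heap whose root is the weakest kept candidate.
List<int> topK(List<double> scores, int k) {
  final heap = <int>[];
  bool less(int a, int b) => scores[a] < scores[b];
  void swap(int i, int j) {
    final tmp = heap[i];
    heap[i] = heap[j];
    heap[j] = tmp;
  }

  void siftUp(int i) {
    while (i > 0) {
      final parent = (i - 1) >> 1;
      if (!less(heap[i], heap[parent])) break;
      swap(i, parent);
      i = parent;
    }
  }

  void siftDown(int i) {
    while (true) {
      final left = 2 * i + 1;
      final right = left + 1;
      var smallest = i;
      if (left < heap.length && less(heap[left], heap[smallest])) smallest = left;
      if (right < heap.length && less(heap[right], heap[smallest])) smallest = right;
      if (smallest == i) break;
      swap(i, smallest);
      i = smallest;
    }
  }

  for (var doc = 0; doc < scores.length; doc++) {
    if (heap.length < k) {
      heap.add(doc);
      siftUp(heap.length - 1);
    } else if (k > 0 && scores[doc] > scores[heap[0]]) {
      heap[0] = doc; // evict the weakest kept candidate
      siftDown(0);
    }
  }
  heap.sort((a, b) => scores[b].compareTo(scores[a]));
  return heap;
}
```

For example, encodeGaps([3, 7, 8, 20]) stores the gaps 3, 4, 1, 12, and topK([0.5, 2.0, 1.0, 3.0], 2) returns [3, 1]. Small gaps keep the stored values compact, and the heap never grows beyond k entries regardless of how many documents are scored.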
Improvements
- Better memory efficiency with typed arrays
- Improved tokenization performance
- Enhanced concurrent search handling
API Changes
- BM25.build() now accepts indexFields parameter for metadata indexing
- search() method now accepts optional filter parameter
- New PartitionedBM25 class with searchIn() and searchMany() methods
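
To make the filter parameter more concrete, here is a small sketch of how single-value and multi-value filters could be matched against document metadata. The matchesFilter function is hypothetical, and the assumption that a list means "any of these values" is mine; the package's actual semantics are not spelled out in these notes.

```dart
/// Hypothetical matcher: every filter key must match the document's metadata.
/// A filter value is either a single value or a list of accepted values
/// (assumed here to mean "any of these").
bool matchesFilter(Map<String, Object?> metadata, Map<String, Object?> filter) {
  for (final entry in filter.entries) {
    final actual = metadata[entry.key];
    final wanted = entry.value;
    final ok = wanted is List ? wanted.contains(actual) : wanted == actual;
    if (!ok) return false;
  }
  return true;
}
```

Under these assumptions, a document with metadata {'filePath': 'docs/intro.md'} matches the filter {'filePath': 'docs/intro.md'} and also {'filePath': ['docs/intro.md', 'docs/api.md']}.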
2.0.0
Breaking Changes
- BREAKING: Renamed Document class to BM25Document to avoid naming conflicts with other libraries
- BREAKING: Renamed document.dart file to bm25_document.dart
Migration Guide
Update your imports and class references:
// Before
import 'package:bm25/src/document.dart';
Document doc = Document(...);
// After
import 'package:bm25/bm25.dart';
BM25Document doc = BM25Document(...);
1.0.0
- Initial release
- Implement BM25 ranking algorithm
- Support for document search and ranking
- Document chunking capabilities
- Configurable BM25 parameters (k1 and b)
- Comprehensive test coverage
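
For readers unfamiliar with the k1 and b parameters, the sketch below computes the score contribution of one query term for one document using a common Okapi BM25 formulation. The function name bm25TermScore, the IDF variant, and the default values 1.2 and 0.75 are assumptions chosen for illustration; the package may use different defaults or a different IDF.

```dart
import 'dart:math' as math;

/// Score contribution of a single term for a single document
/// (a common BM25 variant, not necessarily the one used by this package).
double bm25TermScore({
  required int termFrequency, // occurrences of the term in the document
  required int docLength, // document length in tokens
  required double avgDocLength, // average document length in the corpus
  required int docCount, // number of documents in the corpus
  required int docsWithTerm, // number of documents containing the term
  double k1 = 1.2, // term-frequency saturation
  double b = 0.75, // strength of length normalization (0 disables it)
}) {
  final idf =
      math.log(1 + (docCount - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
  final lengthNorm = 1 - b + b * docLength / avgDocLength;
  return idf * termFrequency * (k1 + 1) / (termFrequency + k1 * lengthNorm);
}
```

A larger k1 lets repeated occurrences of a term keep raising the score for longer before it saturates, while b controls how strongly documents longer than average are penalized.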