text_indexing 1.0.0
Dart library for creating an inverted index on a collection of text documents.
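For orientation, the idea of an inverted index can be sketched in a few lines of plain Dart. This is only an illustration under simplifying assumptions (lower-casing and splitting on non-alphanumeric characters); the function name `buildIndex` and the nested-map layout are hypothetical and are not taken from the `text_indexing` API.

```dart
// Plain-Dart sketch of an inverted positional index.
// None of these names come from text_indexing; they are illustrative only.
// Layout assumed here: term -> (document id -> positions of the term).
Map<String, Map<String, List<int>>> buildIndex(Map<String, String> docs) {
  final index = <String, Map<String, List<int>>>{};
  docs.forEach((docId, text) {
    // Naive tokenization: lower-case, then split on non-alphanumerics.
    final terms = text
        .toLowerCase()
        .split(RegExp(r'[^a-z0-9]+'))
        .where((t) => t.isNotEmpty)
        .toList();
    for (var position = 0; position < terms.length; position++) {
      final postings = index.putIfAbsent(terms[position], () => {});
      postings.putIfAbsent(docId, () => []).add(position);
    }
  });
  return index;
}

void main() {
  final index = buildIndex({
    'doc1': 'The quick brown fox',
    'doc2': 'The lazy dog and the fox',
  });
  print(index['fox']); // {doc1: [3], doc2: [5]}
}
```

Looking up a term returns the documents that contain it together with the term positions, which is the kind of structure the postings-related entries in this changelog refer to.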
1.0.0 #
- Stable release
0.23.0 #
BREAKING CHANGES
Breaking changes #
This is a major rework of the library with a significant simplification of the interfaces:
- Interface `TextTokenizer` removed. Use `TextAnalyzer.tokenize` and `TextAnalyzer.tokenizeJson` instead.
- Mixin `InvertedIndexMixin` removed.
- Instance method `InvertedIndex.getFtdPostings` removed, use static method `InvertedIndex.ftdPostingsFromPostings` instead.
- Instance method `InvertedIndex.getIdFtIndex` removed, use static method `InvertedIndex.idFtIndexFromDictionary` instead.
- Instance method `InvertedIndex.getTfIndex` removed, use static method `InvertedIndex.tfIndexFromPostings` instead.
- Instance method `InvertedIndex.` removed, use static method `InvertedIndex.` instead.
- Extension methods `Iterable<DftMapEntry>.sort` and `Iterable<DftMapEntry>.toList` removed.
- Property `InvertedIndex.tokenFilter` removed.
- Class `TextIndexerBase` removed.
- Interface method `TextIndexer.indexDocumentStream` added and implemented in `TextIndexerMixin`.
- Interface method `TextIndexer.indexCollectionStream` added and implemented in `TextIndexerMixin`.
- Factories `TextIndexer`, `TextIndexer.stream` and `TextIndexer.collectionStream` removed.
- Changed signatures of interface methods `TextIndexer.indexText`, `TextIndexer.indexJson` and `TextIndexer.indexCollection`.
- Interface method `TextIndexer.dispose()` added.
- Enum `TermSortStrategy` removed.
- Enum `TokenizingStrategy` removed.
- Interface `TextIndexer` implemented in `InvertedIndex`.
- Changed signature of `TextIndexer.indexText`.
- Changed signature of `TextIndexer.indexDocumentStream`.
- Changed signature of `TextIndexer.indexJson`.
- Changed signature of `TextIndexer.indexCollectionStream`.
New #
- Added `InMemoryIndexBase` and `AsyncCallbackIndexBase` to `text_indexing` library exports.
Bug fix #
- Fixed keyword postings in indexer.
Updated #
- Dependencies.
- Tests.
- Documentation.
- Examples.
0.23.0-5 #
0.23.0-3 #
0.23.0-2 #
0.23.0-1 #
BREAKING CHANGES
Breaking changes #
This is a major rework of the library with a significant simplification of the interfaces:
- Interface `TextTokenizer` removed. Use `TextAnalyzer.tokenize` and `TextAnalyzer.tokenizeJson` instead.
- Mixin `InvertedIndexMixin` removed.
- Instance method `InvertedIndex.getFtdPostings` removed, use static method `InvertedIndex.ftdPostingsFromPostings` instead.
- Instance method `InvertedIndex.getIdFtIndex` removed, use static method `InvertedIndex.idFtIndexFromDictionary` instead.
- Instance method `InvertedIndex.getTfIndex` removed, use static method `InvertedIndex.tfIndexFromPostings` instead.
- Instance method `InvertedIndex.` removed, use static method `InvertedIndex.` instead.
- Extension methods `Iterable<DftMapEntry>.sort` and `Iterable<DftMapEntry>.toList` removed.
- Property `InvertedIndex.tokenFilter` removed.
- Class `TextIndexerBase` removed.
- Interface method `TextIndexer.indexDocumentStream` added and implemented in `TextIndexerMixin`.
- Interface method `TextIndexer.indexCollectionStream` added and implemented in `TextIndexerMixin`.
- Factories `TextIndexer`, `TextIndexer.stream` and `TextIndexer.collectionStream` removed.
- Changed signatures of interface methods `TextIndexer.indexText`, `TextIndexer.indexJson` and `TextIndexer.indexCollection`.
- Interface method `TextIndexer.dispose()` added.
- Enum `TermSortStrategy` removed.
- Enum `TokenizingStrategy` removed.
- Interface `TextIndexer` implemented in `InvertedIndex`.
Updated #
- Dependencies.
- Tests.
- Documentation.
- Examples.
0.22.4+15 #
Deprecated #
- Interface `TextTokenizer` is deprecated and will be removed from the next stable version of the `text_analysis` library. At that time `text_indexing` will be updated to accommodate the change and issued as version 0.23.0.
0.22.4+13 #
Updated #
- Bumped dependency `text_analysis` to version `0.23.7+12`.
- Changed `InvertedIndex.nGramRange` to nullable.
0.22.4+12 #
Updated #
- Bumped dependency `text_analysis` to version `0.23.7+11`.
- Changed the algorithm for extension method `JSON.toSourceText`.
0.22.2 #
0.22.0 #
Breaking changes #
- Added method `InvertedIndex.getCollectionSize`.
- Implemented `InvertedIndex.getCollectionSize`.
- Implemented `AsyncCallbackIndex.getCollectionSize`.
- Renamed function definition `VocabularyLength` to `CollectionSizeCallback`.
- Changed signature of factory `InvertedIndex.inMemory`.
- Changed signature of unnamed factory constructor `InvertedIndex`.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.21.1 #
0.21.0 #
Breaking changes #
- Added field `TokenizingStrategy InvertedIndex.strategy`.
- Added field `InvertedIndex.keywordExtractor`.
- Added method `InvertedIndex.getKeywordPostings`.
- Added method `InvertedIndex.upsertKeywordPostings`.
- Changed signature of method `TextIndexer.updateIndexes`.
- Changed signature of default `InMemoryIndex` constructor.
- Changed signature of default `AsyncCallbackIndex` constructor.
- Changed signature of default `InvertedIndex` factory constructor.
- Changed signature of default `InvertedIndex.inMemory` factory constructor.
New #
- Added typedef `KeywordPostingsMap`.
- Added typedef `KeyWordPostings`.
- Added function definition `KeywordPostingsMapLoader`.
- Added function definition `KeywordPostingsMapUpdater`.
- Added base class `InMemoryIndexBase`.
- Implemented field `AsyncCallbackIndex.strategy`.
- Implemented field `InMemoryIndex.strategy`.
- Implemented `InMemoryIndex.getKeywordPostings`.
- Implemented `InMemoryIndex.upsertKeywordPostings`.
- Implemented `InMemoryIndex.keywordExtractor`.
- Implemented `AsyncCallbackIndex.getKeywordPostings`.
- Implemented `AsyncCallbackIndex.upsertKeywordPostings`.
- Implemented `AsyncCallbackIndex.keywordExtractor`.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.19.0 #
Breaking changes #
- Changed signature of `TextIndexer` default unnamed factory constructor.
- Removed field `TextIndexer.documentStream`.
- Removed field `TextIndexer.collectionStream`.
- Added field `InvertedIndex.nGramRange`.
- Changed the signature of `InvertedIndex` unnamed factory.
- Changed the signature of `InvertedIndex.inMemory` factory.
- Changed the signature of `AsyncCallbackIndex` default constructor.
- Changed the signature of `InMemoryIndex` default constructor.
- Removed field `InvertedIndex.phraseLength`.
New #
- Added factory constructor `TextIndexer.collectionStream`.
- Added factory constructor `TextIndexer.stream`.
- Changed `TextIndexer.indexText`
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.16.0 #
Breaking changes #
- Default k-gram length changed from k =2 to k = 2 in `AsyncCallbackIndex` and `InMemoryIndex` constructors and
New #
- Unnamed factory constructor `InvertedIndex` returns an `AsyncCallbackIndex` instance.
- Factory constructor `InvertedIndex.inMemory` returns an `InMemoryIndex` instance.
Updated #
- Dependencies.
- Tests.
- Documentation.
0.15.0 #
Breaking changes #
- Renamed the following typedefs:
  - `Dictionary` to `DftMap`;
  - `DictionaryEntry` to `DftMapEntry`;
  - `DictionaryLoader` to `DftMapLoader`;
  - `DictionaryUpdater` to `DftMapUpdater`;
  - `DictionaryLengthLoader` to `VocabularySize`;
  - `KGramIndex` to `KGramsMap`;
  - `KGramIndexLoader` to `KGramsMapLoader`;
  - `KGramIndexUpdater` to `KGramsMapUpdater`;
  - `Postings` to `PostingsMap`;
  - `PostingsEntry` to `PostingsMapEntry`;
  - `PostingsLoader` to `PostingsMapLoader`;
  - `PostingsUpdater` to `PostingsMapUpdater`;
  - `FieldPostingsEntry` to `ZonePostingsMapEntry`;
  - `ZonePostings` to `ZonePostingsMap`;
  - `DocumentPostingsEntry` to `DocPostingsMapEntry`; and
  - `DocumentPostings` to `DocPostingsMap`.
- Removed `HiveIndex` from the `test` folder.
- Removed `_asyncIndexerExample` from the `example` folder.
- Renamed the `text_indexing_extensions` mini-library to `extensions`.
- Renamed the `text_indexing_type_definitions` mini-library to `type_definitions`.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.14.7 #
0.14.0 #
Breaking changes #
- Removed class `TextSource`.
- Removed class `Sentence`.
- Removed class `TermPair`.
- Removed `TextAnalyzerConfiguration.sentenceSplitter` from `TextAnalyzerConfiguration` interface.
- Changed `TextTokenizer.tokenize` return value to `List<Token>`.
- Changed `TextTokenizer.tokenizeJson` return value to `List<Token>`.
- Restructured codebase.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.13.0 #
0.12.0+1 #
Updated dependencies and documentation.
0.12.0 #
BREAKING CHANGES
Breaking changes #
- Added method `InvertedIndex.getKGramIndex` to `InvertedIndex` interface.
- Added method `InvertedIndex.upsertKGramIndex` to `InvertedIndex` interface.
- Added field `InvertedIndex.k` to `InvertedIndex` interface.
- Removed field `TextIndexer.postingsStream`.
- Renamed method `TextIndexer.emit` to `TextIndexer.updateIndexes`.
- Added `AsyncIndex.k`, `AsyncIndex.kGramIndexLoader` and `AsyncIndex.kGramIndexUpdater` final fields and parameters to `AsyncIndex` class.
- Added `InMemoryIndex.k` and `InMemoryIndex.kGramIndex` final fields and parameters to `InMemoryIndex` class.
New: #
- Type alias `KGramIndex`.
- Type alias `KGramIndexLoader`.
- Type alias `KGramIndexUpdater`.
- Extension method `void KGramIndex.addTermKGrams(Term term, Iterable<KGram> kGrams)`.
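As a rough illustration of what a k-gram index holds, the sketch below maps each k-gram to the set of terms that contain it. The typedefs are local stand-ins declared only for this example and may not match the library's actual definitions; the `$` padding and the helper `kGramsOf` are assumptions, and the extension only mirrors the shape of the `addTermKGrams` signature listed above.

```dart
// Hypothetical local stand-ins for the library's type aliases;
// the real definitions in text_indexing may differ.
typedef Term = String;
typedef KGram = String;
typedef KGramIndex = Map<KGram, Set<Term>>;

/// Splits [term] into overlapping k-grams, padding both ends with '$'.
/// The padding convention is an assumption made for this sketch.
Set<KGram> kGramsOf(Term term, {int k = 2}) {
  final padded = '\$$term\$';
  final kGrams = <KGram>{};
  for (var i = 0; i + k <= padded.length; i++) {
    kGrams.add(padded.substring(i, i + k));
  }
  return kGrams;
}

/// Illustrative extension with the same shape as `addTermKGrams`.
extension KGramIndexSketch on KGramIndex {
  void addTermKGrams(Term term, Iterable<KGram> kGrams) {
    for (final kGram in kGrams) {
      // Create the term set on first use, then record the term.
      (this[kGram] ??= <Term>{}).add(term);
    }
  }
}

void main() {
  final KGramIndex index = {};
  for (final term in ['tree', 'trek']) {
    index.addTermKGrams(term, kGramsOf(term));
  }
  print(index['tr']); // {tree, trek}
}
```

A lookup by k-gram then yields candidate terms, which is the usual basis for spelling correction and wildcard matching.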
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.11.0 #
0.10.0 #
Breaking changes #
- `TextIndexerBase` default generative constructor is no longer marked `const` as it has a method body that initializes listeners to `TextIndexer.documentStream` and `TextIndexer.collectionStream`.
New: #
- Input stream fields `TextIndexer.documentStream` and `TextIndexer.collectionStream` added to `TextIndexer` interface.
- Optional named parameter `Stream<Map<String, Map<String, dynamic>>>? collectionStream` added to `TextIndexer.async`, `TextIndexer.inMemory` and `TextIndexer.index` factory constructors.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.9.0 #
Breaking changes #
- Renamed `InvertedPositionalZoneIndex` interface to `InvertedIndex`.
- Renamed `TextIndexer.instance` factory to `TextIndexer.index`.
- Parameter `dictionaryLengthLoader` added to `AsynCallbackIndex` constructor.
- Parameter `dictionaryLengthLoader` added to `AsyncIndexer` constructor.
- Parameter `dictionaryLengthLoader` added to `TextIndexer.async` factory constructor.
- Removed class `InMemoryIndexer`, use factory constructor `TextIndexer.inMemory` instead.
- Removed class `AsyncIndexer`, use factory constructor `TextIndexer.async` instead.
New: #
- Type definition `FtdPostings`.
- Type definition `IdFtIndex`.
- Type definition `IdFt`.
- Type definition `ZoneWeightMap`.
- Field getter `Future<int> InvertedIndex.vocabularyLength`.
- Field getter `Future<int> Function() AsynCallbackIndex.dictionaryLengthLoader`.
- Field getter `int InvertedIndex.phraseLength`.
- Field getter `ZoneWeightMap InvertedIndex.zones`.
- Optional named parameter `ZoneWeightMap zones` added to `TextIndexer.async` factory.
- Optional named parameter `ZoneWeightMap zones` added to `TextIndexer.inMemory` factory.
- Method `Future<FtdPostings> InvertedIndex.getFtdPostings(Iterable<Term>, int)`.
- Method `Future<IdFtIndex> InvertedIndex.getIdFtIndex(Iterable<Term>)`.
- Method `Future<Dictionary> InvertedIndex.getTfIndex(Iterable<Term>)`.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.8.0+1 #
Updated dependencies
0.8.0 #
0.7.2+1 #
Updated dependencies
0.7.2 #
Updated dependencies
0.7.1 #
Updated dependencies
0.7.0 #
0.6.0 #
BREAKING CHANGES
Breaking changes #
- Changed signature of extension method `Postings.termPostingsList(Term)` to `Postings.termPostingsList([Iterable<Term>?])`.
- Removed field `InMemoryIndexer.dictionary`. Use `InMemoryIndexer.index.dictionary` instead.
- Removed field `InMemoryIndexer.postings`. Use `InMemoryIndexer.index.postings` instead.
- Removed method `TextIndexer.upsertDictionary`. Use `TextIndexer.index.upsertDictionary` instead.
- Removed method `TextIndexer.getDictionary`. Use `TextIndexer.index.getDictionary` instead.
- Removed method `TextIndexer.getPostings`. Use `TextIndexer.index.getPostings` instead.
- Removed method `TextIndexer.upsertPostings`. Use `TextIndexer.index.upsertPostings` instead.
- Removed field `InMemoryIndexer.dictionary`. Use `index.dictionary` instead.
- Removed field `InMemoryIndexer.postings`. Use `index.postings` instead.
- Added new field `InvertedIndex.analyzer`, changing the signatures of factory constructors `TextIndexer.inMemory` and `TextIndexer.async`.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.6.0-2 #
0.5.0 #
BREAKING CHANGES
Deprecated:
- Field `InMemoryIndexer.dictionary` is deprecated. Use `index.dictionary` instead.
- Field `InMemoryIndexer.postings` is deprecated. Use `index.postings` instead.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.4.0 #
BREAKING CHANGES
Breaking changes #
- Renamed method `TextIndexer.index` to `TextIndexer.indexText`.
- Renamed class `PersistedIndexer` to `AsyncIndexer`.
New: #
- `InvertedIndex` interface and implementation.
- `TextIndexer.index` field getter.
- `TextIndexer.index` factory constructor.
- `TextIndexer.async` factory constructor.
- `TextIndexer.inMemory` factory constructor.
Deprecated:
- Method `TextIndexer.upsertDictionary` is deprecated. Use `TextIndexer.index.upsertDictionary` instead.
- Method `TextIndexer.getDictionary` is deprecated. Use `TextIndexer.index.getDictionary` instead.
- Method `TextIndexer.getPostings` is deprecated. Use `TextIndexer.index.getPostings` instead.
- Method `TextIndexer.upsertPostings` is deprecated. Use `TextIndexer.index.upsertPostings` instead.
- Field `InMemoryIndexer.dictionary` is deprecated. Use `index.dictionary` instead.
- Field `InMemoryIndexer.postings` is deprecated. Use `index.postings` instead.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.3.2 #
0.3.1 #
0.2.0 #
BREAKING CHANGES
New: #
- `ZonePostings`, `DocumentPostings` and `FieldPostingsEntry` type definitions.
- `Ft`, `Pt`, `TermPositions` and `DocId` type aliases.
- Interface `Document`.
Breaking changes #
- Replaced object-model class `PostingsEntry` with typedef `PostingsEntry`.
- Replaced object-model class `DocumentPostingsEntry` with typedef `DocumentPostingsEntry`.
- Replaced object-model class `DictionaryEntry` with typedef `DictionaryEntry`.
Restructured and simplified the codebase.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.1.0 #
0.0.2 #
0.0.1+6 #
0.0.1 #
BREAKING CHANGES
Interfaces finalized (see breaking changes)
Breaking changes #
- `TermDictionary` renamed to `Dictionary`.
- `DocumentPostingsEntry` renamed to `Postings`.
- `PostingsMapEntry` renamed to `PostingsEntry`.
- `Term` renamed to `DictionaryEntry`.
- `TermPositions` renamed to `DocumentPostingsEntry`.
- `AsyncIndexer` implementation.
- `TextIndexerBase` implementation.
- `InMemoryIndexer` implementation.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.0.1-beta.3 #
0.0.1-beta.2 #
0.0.1-beta.1 #
Initial version.