text_indexing #

Dart library for creating an inverted index on a collection of text documents.

THIS PACKAGE IS PRE-RELEASE AND SUBJECT TO DAILY BREAKING CHANGES.

Install #

In the pubspec.yaml of your Flutter project, add the following dependency:

dependencies:
  text_indexing: ^0.0.1-beta.4

In your code file add the following import:

import 'package:text_indexing/text_indexing.dart';

Usage #

The text indexing classes (indexers) in this library are intended for information retrieval software applications. The implementation is consistent with information retrieval theory.

The objective is to build and maintain:

  • a term dictionary that holds the vocabulary of terms and the frequency of occurrence for each term in the corpus; and
  • a postings map that holds the references to the documents for each term. In this implementation, our postings include the positions of the term in the document to allow search algorithms to derive relevance.
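The two structures described above can be sketched with plain Dart maps. This is a minimal illustration, not the package's implementation: the `TermDictionary` and `Postings` aliases, the `indexDocument` function, and the naive tokenizer are all hypothetical names chosen for this example, and the sketch assumes "frequency" means the total number of occurrences of a term across the corpus.

```dart
// Hypothetical type aliases for illustration; the package's own types may differ.
// term -> number of occurrences in the corpus
typedef TermDictionary = Map<String, int>;
// term -> document id -> positions of the term in that document
typedef Postings = Map<String, Map<String, List<int>>>;

/// Adds the terms of one document to [dictionary] and [postings].
void indexDocument(
  String docId,
  String text,
  TermDictionary dictionary,
  Postings postings,
) {
  // Naive tokenizer (an assumption): lower-case, split on non-alphanumerics.
  final terms = text
      .toLowerCase()
      .split(RegExp(r'[^a-z0-9]+'))
      .where((t) => t.isNotEmpty)
      .toList();
  for (var position = 0; position < terms.length; position++) {
    final term = terms[position];
    // Count every occurrence of the term in the corpus.
    dictionary[term] = (dictionary[term] ?? 0) + 1;
    // Record where the term occurs in this document.
    postings
        .putIfAbsent(term, () => <String, List<int>>{})
        .putIfAbsent(docId, () => <int>[])
        .add(position);
  }
}

void main() {
  final dictionary = <String, int>{};
  final postings = <String, Map<String, List<int>>>{};
  indexDocument('doc1', 'The quick brown fox', dictionary, postings);
  indexDocument('doc2', 'The lazy dog and the fox', dictionary, postings);
  print(dictionary['the']); // 3
  print(postings['fox']); // {doc1: [3], doc2: [5]}
}
```

Keeping positions in the postings lets a search algorithm check, for example, whether two query terms appear next to each other in a document.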

Indexer Class #

The Indexer is an abstract base class that provides an Indexer.index method that indexes a document, adding a list of term positions to the Indexer.postingsStream for the document. Subclasses of Indexer may override the Indexer.emit method to perform additional actions whenever a document is indexed.
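The pattern described above (an `index` method that ends by calling an overridable `emit`, which in turn feeds a stream) might look roughly like the sketch below. It does not use the package: `SketchIndexer`, `CountingIndexer`, `DocumentPostings`, the method signatures, and the tokenizer are assumptions made for illustration, and the real Indexer API may differ.

```dart
import 'dart:async';

/// Hypothetical postings for one document: term -> positions.
typedef DocumentPostings = Map<String, List<int>>;

/// Illustrative stand-in for an abstract indexer base class.
abstract class SketchIndexer {
  final _controller = StreamController<DocumentPostings>.broadcast();

  /// Emits the postings of each indexed document.
  Stream<DocumentPostings> get postingsStream => _controller.stream;

  /// Tokenizes [text], collects term positions, then calls [emit].
  Future<void> index(String docId, String text) async {
    final postings = <String, List<int>>{};
    final terms = text
        .toLowerCase()
        .split(RegExp(r'[^a-z0-9]+'))
        .where((t) => t.isNotEmpty)
        .toList();
    for (var i = 0; i < terms.length; i++) {
      postings.putIfAbsent(terms[i], () => <int>[]).add(i);
    }
    await emit(docId, postings);
  }

  /// Adds [postings] to the stream; subclasses can override to do more.
  Future<void> emit(String docId, DocumentPostings postings) async {
    _controller.add(postings);
  }

  /// Closes the underlying stream controller.
  Future<void> close() => _controller.close();
}

/// Example subclass that counts documents in addition to emitting them.
class CountingIndexer extends SketchIndexer {
  int documentsIndexed = 0;

  @override
  Future<void> emit(String docId, DocumentPostings postings) async {
    documentsIndexed++;
    await super.emit(docId, postings);
  }
}

Future<void> main() async {
  final indexer = CountingIndexer();
  final received = <DocumentPostings>[];
  final subscription = indexer.postingsStream.listen(received.add);
  await indexer.index('doc1', 'to be or not to be');
  await indexer.close();
  await subscription.cancel();
  print(indexer.documentsIndexed); // 1
  print(received.single['to']); // [0, 4]
}
```

Calling `super.emit` in the override keeps the stream behaviour intact while adding the extra action.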

InMemoryIndexer Class #

The InMemoryIndexer is a subclass of Indexer that builds and maintains in-memory TermDictionary and PostingMap hashmaps. These hashmaps are updated whenever InMemoryIndexer.emit is called at the end of the InMemoryIndexer.index method, so awaiting a call to InMemoryIndexer.index will provide access to the updated InMemoryIndexer.dictionary and InMemoryIndexer.postings collections. The InMemoryIndexer is suitable for indexing and searching smaller collections. An example of the use of InMemoryIndexer is included in the package examples.

PersistedIndexer Class #

The PersistedIndexer is a subclass of Indexer that asynchronously reads and writes term dictionary and postings map data sources. These data sources are asynchronously updated whenever PersistedIndexer.emit is called by the PersistedIndexer.index method. The PersistedIndexer is suitable for indexing and searching large collections but may incur some latency penalty and processing overhead. An example of the use of PersistedIndexer is included in the package examples.
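One way to picture an asynchronous update against an external data source is a read-merge-write step driven by callbacks. The sketch below is an assumption about the general approach, not the PersistedIndexer API: `DictionaryLoader`, `DictionaryUpdater`, and `updateDictionary` are hypothetical names, and a plain map stands in for the database or file store.

```dart
// Hypothetical callback signatures for an asynchronous dictionary store.
typedef DictionaryLoader = Future<Map<String, int>> Function(
    Iterable<String> terms);
typedef DictionaryUpdater = Future<void> Function(Map<String, int> entries);

/// Increments the stored frequency of each term in [termCounts].
Future<void> updateDictionary(
  Map<String, int> termCounts,
  DictionaryLoader load,
  DictionaryUpdater save,
) async {
  // Read only the entries that this document touches.
  final stored = await load(termCounts.keys);
  final updated = <String, int>{
    for (final entry in termCounts.entries)
      entry.key: (stored[entry.key] ?? 0) + entry.value,
  };
  // Write the merged counts back to the data source.
  await save(updated);
}

Future<void> main() async {
  // A plain map stands in for a database or file-backed store.
  final store = <String, int>{'fox': 2};

  Future<Map<String, int>> load(Iterable<String> terms) async => {
        for (final t in terms)
          if (store.containsKey(t)) t: store[t]!,
      };
  Future<void> save(Map<String, int> entries) async => store.addAll(entries);

  await updateDictionary({'fox': 1, 'dog': 3}, load, save);
  print(store); // {fox: 3, dog: 3}
}
```

Each indexed document costs at least one read and one write round trip in this sketch, which is where the latency and processing overhead mentioned above would come from.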

Issues #

If you find a bug, please file an issue.

This project is a supporting package for a revenue project that has priority call on resources, so please be patient if we don't respond immediately to issues or pull requests.

Publisher

gmconsult.com.au (verified publisher)

License

unknown (license)

Dependencies

meta, porter_2_stemmer, rxdart, text_analysis
