text_indexing library
Dart library for creating an inverted index on a collection of text documents.
Classes
- AsyncCallbackIndex
- The AsyncCallbackIndex is an InvertedIndex implementation class that extends AsyncCallbackIndexBase.
- AsyncCallbackIndexBase
- Base class implementation of InvertedIndex with AsyncCallbackIndexMixin.
- AsyncCallbackIndexMixin
- A mixin class that implements InvertedIndex. The mixin exposes five callback function fields that must be overridden.
- English
- A TextAnalyzer implementation for English language analysis.
- InMemoryIndex
- The InMemoryIndex is an implementation of the InvertedIndex interface that extends InMemoryIndexBase.
- InMemoryIndexBase
- Base class implementation of InvertedIndex with InMemoryIndexMixin.
- InMemoryIndexMixin
- A mixin class that implements InvertedIndex. The mixin exposes in-memory dictionary and postings fields that must be overridden.
- InvertedIndex
- An interface that exposes methods for working with an inverted, positional zoned index on a collection of documents.
- LatinLanguageAnalyzer
- A TextAnalyzer implementation for the analysis of Latin languages.
- NGramRange
- Enumerates a range of N-gram sizes (minimum and maximum length).
- Porter2Stemmer
- Dart implementation of the Porter Stemming Algorithm (see https://snowballstem.org/algorithms/), used for reducing a word to its stem, base, or root form.
- SimilarityIndex
- Object model for a suggested alternative to a term. Used in spelling correction and term expansion.
- TermCoOccurrenceGraph
- A RAKE co-occurrence graph for evaluating the score of keywords extracted from text.
- TermCoOccurrenceGraphBase
- Base class that implements TermCoOccurrenceGraph and mixes in TermCoOccurrenceGraphMixin.
- TermSimilarity
- A static/abstract class that exposes methods for computing similarity of terms.
- TextAnalyzer
- An interface that exposes language-specific properties and methods used in text analysis.
- TextDocument
- The TextDocument object model enumerates properties for analysing a text document.
- TextIndexer
- Interface for classes that construct and maintain an InvertedIndex for a collection of documents (corpus).
- TextIndexerMixin
- Mixin class implementation of the TextIndexer interface.
- Token
- A Token represents a term (word) present in a text source.
Enums
- PartOfSpeech
- In grammar, a part-of-speech is a category of words that have similar grammatical properties.
- PoSTag
- Part-of-speech tags are used in natural language processing as part of part-of-speech tagging.
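The InvertedIndex entry above describes an inverted, positional zoned index. As a rough illustration of that data structure only, the sketch below maps each term to the documents, zones, and word positions where it occurs. It does not use the text_indexing API: `Postings`, `buildIndex`, the sample corpus, and the naive tokenizer are all hypothetical names and assumptions made for this example.

```dart
// Hypothetical sketch: none of these names come from the text_indexing API.

/// Postings for one term: document id -> zone name -> term positions.
typedef Postings = Map<String, Map<String, List<int>>>;

/// Builds a positional, zoned inverted index from a corpus given as
/// document id -> zone name -> zone text.
Map<String, Postings> buildIndex(Map<String, Map<String, String>> corpus) {
  final index = <String, Postings>{};
  corpus.forEach((docId, zones) {
    zones.forEach((zone, text) {
      // Naive tokenizer (an assumption): lower-case the text and split it
      // on runs of non-word characters.
      final terms = text
          .toLowerCase()
          .split(RegExp(r'\W+'))
          .where((t) => t.isNotEmpty)
          .toList();
      for (var position = 0; position < terms.length; position++) {
        index
            .putIfAbsent(
                terms[position], () => <String, Map<String, List<int>>>{})
            .putIfAbsent(docId, () => <String, List<int>>{})
            .putIfAbsent(zone, () => <int>[])
            .add(position);
      }
    });
  });
  return index;
}

void main() {
  final index = buildIndex({
    'doc1': {
      'title': 'Inverted index',
      'body': 'An index maps terms to documents',
    },
    'doc2': {'body': 'Documents contain terms'},
  });
  print(index['index']); // {doc1: {title: [1], body: [1]}}
  print(index['terms']); // {doc1: {body: [3]}, doc2: {body: [2]}}
}
```

Storing positions per zone is what would allow phrase queries and zone-weighted scoring. The library's own dictionary and postings types may be organised differently.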