Searchlight
Searchlight is an independent pure Dart reimplementation of Orama's in-memory search and indexing model for Dart and Flutter apps. It gives you schema-based indexing, scoring, filtering, facets, persistence, and tokenizer control without requiring a server.
Searchlight is especially useful when your app already has content available locally or can download and cache it, and you want fast in-app search over that data.
Status
searchlight is the core package: indexing, querying, persistence,
tokenizer configuration, and a limited create-time extension surface.
Current extension support includes:
- ordered SearchlightPlugin registration
- lifecycle hooks via top-level SearchlightPlugin fields
- component replacement via SearchlightComponents
- restore-time validation that a persisted snapshot is loaded with a compatible plugin/component graph
It does not currently include:
- PDF parsing or rendering
- Flutter UI widgets
The extension API is intentionally narrow today: Searchlight does not yet expose async plugin initialization or every declared hook dispatch path, though it supports more than index/sorter replacement.
Companion Packages
Current companion packages:
- searchlight_highlight for text highlighting, excerpts, HTML <mark> output, and Position match ranges
- searchlight_parsedoc for HTML and Markdown extraction plus population helpers
PDF extraction, viewer integration, and other source-format-specific ingestion still belong in your app or in future companion packages above the core library.
Platform Support
searchlight is a pure Dart package. It works anywhere Dart runs, including
Flutter mobile, desktop, and web. The core package does not include
platform-channel code or platform-specific subpackages.
Start Here
- Read doc/app-integration.md for the recommended app architecture.
- Read doc/validation-workflow.md for the canonical repository validation sequence.
- Open example/README.md for the Flutter validation app.
What It Provides
- Full-text indexing for structured documents
- BM25, QPS, and PT15 ranking algorithms
- Typed filters, sorting, grouping, and facets
- JSON and CBOR persistence for cached indexes
- Standalone tokenizer utilities with language support, stemming, and optional stop words
- A create-time extension API for lifecycle hooks and component replacement
Install companion packages when you need them:
- searchlight_highlight for snippets, marked ranges, and HTML <mark> output
- searchlight_parsedoc for Markdown and HTML extraction before indexing
Searchlight.create() also exposes tokenizer-related configuration for the
built-in database tokenizer, including stemming, stemmer, stopWords,
useDefaultStopWords, allowDuplicates, tokenizeSkipProperties, and
stemmerSkipProperties.
By default, stemming is off. Built-in tokenizer settings round-trip through
persistence. Injected Tokenizer instances and custom stemmer callbacks do not
serialize.
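As a rough illustration of these options, the sketch below enables stemming and the default stop-word list at create time. The parameter names come from the list above; treating stemming and useDefaultStopWords as boolean flags is an assumption, as is the helper name createStemmedIndex, so check the API reference for the exact types before relying on it.

```dart
import 'package:searchlight/searchlight.dart';

/// Hypothetical helper: creates a database with built-in tokenizer options.
Searchlight createStemmedIndex() {
  return Searchlight.create(
    schema: Schema({
      'title': const TypedField(SchemaType.string),
      'content': const TypedField(SchemaType.string),
    }),
    // Assumption: both options are boolean flags.
    // Stemming is off unless it is enabled here.
    stemming: true,
    useDefaultStopWords: true,
  );
}
```

Because only built-in settings are used, a database created this way should remain eligible for persistence, unlike one created with an injected Tokenizer or a custom stemmer callback.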
Installation
dart pub add searchlight
# or from a Flutter app
flutter pub add searchlight
Quick Start
import 'package:searchlight/searchlight.dart';

Future<void> main() async {
  final db = Searchlight.create(
    schema: Schema({
      'url': const TypedField(SchemaType.string),
      'title': const TypedField(SchemaType.string),
      'content': const TypedField(SchemaType.string),
      'type': const TypedField(SchemaType.enumType),
    }),
  );

  db.insert({
    'id': 'ember-lance',
    'url': '/spells/ember-lance',
    'title': 'Ember Lance',
    'content': 'A focused lance of heat that ignites dry brush.',
    'type': 'spell',
  });

  db.insert({
    'id': 'iron-boar',
    'url': '/creatures/iron-boar',
    'title': 'Iron Boar',
    'content': 'A plated beast known for explosive charges.',
    'type': 'monster',
  });

  final results = db.search(
    term: 'ember',
    properties: const ['title', 'content'],
  );

  for (final hit in results.hits) {
    print('${hit.score.toStringAsFixed(2)} ${hit.document.getString('title')}');
  }

  await db.dispose();
}
Core Workflow
Searchlight does not extract your source data for you. Your app or tooling is responsible for turning content into records, and Searchlight handles the indexing and querying.
The common integration flow is:
- Read or receive source content.
- Convert it into structured records.
- Insert those records into a Searchlight database.
- Persist the built index if you want fast startup later.
- Restore the persisted index and query it at runtime.
This applies equally to:
- App-bundled JSON or markdown content
- Remote content downloaded and cached on device
- User-imported files such as PDFs after text extraction
If your app needs reusable extraction, keep that conversion layer in your app or in a companion package. For small integrations, simple record-conversion functions are often enough.
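A record-conversion function can be as small as the sketch below. It is illustrative only: SourcePage and pageToRecord are hypothetical names, the title rule (first line starting with "# ") is an assumption about the source content, and the output keys simply mirror the Quick Start schema.

```dart
/// Hypothetical source shape: one Markdown page with its path and raw body.
class SourcePage {
  const SourcePage({required this.path, required this.body});

  final String path;
  final String body;
}

/// Converts a page into a flat record with the Quick Start fields.
/// The title comes from the first "# " heading, falling back to the path.
Map<String, Object?> pageToRecord(SourcePage page, {required String type}) {
  final lines = page.body.split('\n');
  final headingIndex = lines.indexWhere((line) => line.startsWith('# '));
  final title = headingIndex == -1
      ? page.path
      : lines[headingIndex].substring(2).trim();

  // Join every non-empty line except the heading into one content string.
  final content = [
    for (var i = 0; i < lines.length; i++)
      if (i != headingIndex && lines[i].trim().isNotEmpty) lines[i].trim(),
  ].join(' ');

  return {
    'id': page.path,
    'url': page.path,
    'title': title,
    'content': content,
    'type': type,
  };
}
```

Each returned map can then be passed to insert(). Keeping the conversion separate from indexing makes it easy to test without creating a database.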
What Searchlight Can Index
Searchlight indexes schema-shaped records, not raw files.
That means the core package directly supports:
- Map<String, Object?> records inserted with insert()
- persisted snapshots restored with restore() or fromJson()
- any source format that your app converts into those records first
The core package does not currently include built-in parsers for:
- Markdown files
- HTML files
- PDF files
- CSV, XML, or other file formats
If you insert raw HTML or Markdown into a string field yourself, Searchlight
will tokenize that raw text. It will not strip tags, ignore attributes, or
understand Markdown structure automatically. In practice, that means markup
tokens and link-destination fragments can become searchable unless you clean or
extract the text first.
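To make the cleaning step concrete, here is a deliberately naive sketch that strips HTML tags before a string is inserted. It is not a real HTML parser: stripHtml is a hypothetical helper, it does not decode entities such as &amp;, and it does not handle Markdown syntax. For anything beyond simple content, the searchlight_parsedoc companion package is the intended route.

```dart
/// Naive HTML-to-text sketch. Removes script/style blocks, drops all
/// remaining tags (including their attributes), and collapses whitespace.
String stripHtml(String html) {
  // Remove <script>...</script> and <style>...</style> with their contents.
  final withoutBlocks = html.replaceAll(
    RegExp(
      r'<(script|style)\b[^>]*>.*?</\1>',
      caseSensitive: false,
      dotAll: true,
    ),
    ' ',
  );

  // Replace every remaining tag with a space so adjacent words stay apart.
  final withoutTags = withoutBlocks.replaceAll(RegExp(r'<[^>]+>'), ' ');

  // Collapse runs of whitespace into single spaces.
  return withoutTags.replaceAll(RegExp(r'\s+'), ' ').trim();
}
```

Running the cleaned string through insert() keeps tag names and href values out of the index.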
In this repository specifically:
- the core package accepts records and snapshots only
- the validation example's live folder mode currently reads .md files only
- the validation assets are JSON corpus and JSON snapshot files
Choose the Right Runtime Pattern
There are two common integration modes:
- Build in memory from records
  - best for tests, small corpora, and validation
  - create Searchlight, insert records, search immediately
- Restore from a persisted snapshot
  - best for production apps with a non-trivial corpus
  - build once, persist, then restore on future launches
The package supports both paths directly.
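The two modes can be combined into a single startup routine: restore when a snapshot file exists, otherwise build from records and persist. The sketch below uses only calls that appear elsewhere in this README (FileStorage, persist(), restore(), insert()). The helper name openIndex is hypothetical, and it assumes that FileStorage is exported from package:searchlight/searchlight.dart and that restore() returns a Future<Searchlight>.

```dart
import 'dart:io';

import 'package:searchlight/searchlight.dart';

/// Hypothetical helper: restores the index from [path] when a snapshot
/// exists; otherwise builds it from [records] and persists it.
Future<Searchlight> openIndex({
  required String path,
  required Schema schema,
  required Iterable<Map<String, Object?>> records,
}) async {
  final storage = FileStorage(path: path);

  // Fast path: a snapshot from an earlier launch is already on disk.
  if (await File(path).exists()) {
    return await Searchlight.restore(storage: storage);
  }

  // Slow path: build in memory, then save for the next launch.
  final db = Searchlight.create(schema: schema);
  for (final record in records) {
    db.insert(record);
  }
  await db.persist(storage: storage);
  return db;
}
```

This sketch does not detect a stale snapshot. An app whose corpus changes would need its own versioning or invalidation rule before taking the fast path.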
The repository validation workflow exercises both:
- public fixture corpus -> build in memory -> search
- generated local corpus -> build in memory -> search
- generated local snapshot -> restore persisted index -> search
For the exact command sequence, see doc/validation-workflow.md.
Document writes are available through:
- insert() / insertMultiple()
- update() / updateMultiple()
- upsert() / upsertMultiple()
- patch()
- remove() / removeMultiple()
Extensions
Searchlight exposes a Dart-native create-time extension surface:
- SearchlightPlugin is the registration unit
- lifecycle hooks register through top-level SearchlightPlugin fields such as beforeInsert, afterSearch, and afterCreate
- SearchlightComponents can replace the active tokenizer, index, sorter, documentsStore, or pinning, and can override validateSchema, getDocumentIndexId, getDocumentProperties, and formatElapsedTime
This is enough to prove real component replacement. The test suite includes
plugin-driven index swaps that force PT15 and QPS behavior through the plugin
path rather than through the top-level algorithm flag alone.
Current limits to know before depending on extensions heavily:
- registration unit: SearchlightPlugin
- supported replacement surface: tokenizer, index, sorter, documentsStore, pinning, validateSchema, getDocumentIndexId, getDocumentProperties, and formatElapsedTime
- hooks are sync-only in core operations; async hooks fail fast
- restore contract: extension-backed snapshots must be restored with matching plugin order and compatible component IDs
- conflicting component registrations now fail fast instead of using last-writer-wins resolution
- hook coverage is intentionally limited to the documented SearchlightPlugin fields; there is no broader async initialization surface
Deeper parity notes live in
docs/research/searchlight-extension-status.md.
Defining a Schema
Every database is created from a schema. String fields are searchable by full text. Other field types support filtering, grouping, sorting, or geosearch.
| SchemaType | Dart type | Primary use |
|---|---|---|
| string | String | Full-text search |
| number | num | Range filters and sorting |
| boolean | bool | Boolean filters |
| enumType | String or num | Facets and exact-match filters |
| geopoint | GeoPoint | Geo radius and polygon filters |
| stringArray | List<String> | Full-text search over multiple values |
| numberArray | List<num> | Numeric filtering |
| booleanArray | List<bool> | Boolean filtering |
| enumArray | List<String> or List<num> | Facets and filters |
| NestedField | nested object | Dot-path access such as meta.rating |
Searching
Searchlight supports full-text search with optional filters and result shaping.
final result = db.search(
  term: 'ember lance',
  properties: const ['title', 'content'],
  tolerance: 1,
  limit: 10,
  offset: 0,
  where: {
    'type': eq('spell'),
  },
  sortBy: const SortBy(field: 'title', order: SortOrder.asc),
);
Useful search options:
- properties: limit search to specific string fields
- where: apply typed filters
- tolerance: allow fuzzy term matches
- exact: require whole-word matches after scoring
- limit and offset: paginate
- sortBy: sort on sortable fields
- facets: collect counts for enum and numeric fields
- groupBy: group matching hits by one or more fields
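For pagination, limit is the page size and offset is the number of hits to skip. The small helper below shows the arithmetic for zero-based page numbers. The name offsetForPage is hypothetical and not part of the package.

```dart
/// Converts a zero-based [page] number into the offset for pages of
/// [pageSize] hits. Throws an ArgumentError for out-of-range input.
int offsetForPage(int page, int pageSize) {
  if (page < 0) {
    throw ArgumentError.value(page, 'page', 'must be >= 0');
  }
  if (pageSize <= 0) {
    throw ArgumentError.value(pageSize, 'pageSize', 'must be > 0');
  }
  return page * pageSize;
}

// Usage with the search call shown above (third page of ten hits):
//   db.search(term: 'ember', limit: 10, offset: offsetForPage(2, 10));
```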
Choosing a Search Algorithm
Searchlight supports three ranking algorithms:
- SearchAlgorithm.bm25: default general-purpose relevance ranking
- SearchAlgorithm.qps: proximity-aware scoring optimized for faster search and smaller indexes
- SearchAlgorithm.pt15: position-aware scoring that can work well when term order and early-token placement matter
Choose the algorithm when creating the database:
final db = Searchlight.create(
  schema: schema,
  algorithm: SearchAlgorithm.qps,
);
Or rebuild an existing database with a different algorithm:
final qpsDb = db.reindex(algorithm: SearchAlgorithm.qps);
PT15 has important query limitations:
- tolerance is not supported
- exact is not supported
- string-field where filters are not supported
If you need the broadest query feature support, stay with bm25.
Filtering, Facets, and Grouping
final result = db.search(
  term: 'boar',
  where: {
    'type': eq('monster'),
  },
  facets: {
    'type': const FacetConfig(),
  },
  groupBy: const GroupBy(field: 'type', limit: 5),
);
Supported filters include eq, gt, gte, lt, lte, between,
inFilter, ninFilter, filterContainsAll, filterContainsAny,
geoRadius, geoPolygon, and, or, and not.
Persistence
If you have a non-trivial corpus, build the index once and persist it. Restoring a saved index is usually the right runtime path for production apps.
Future<void> example(Searchlight db) async {
  final storage = FileStorage(path: 'search-index.cbor');
  await db.persist(storage: storage);

  final restored = await Searchlight.restore(storage: storage);
  final result = restored.search(term: 'ember');
  await restored.dispose();
}
FileStorage is intended for dart:io platforms. If you want persisted JSON
instead of CBOR, pass format: PersistenceFormat.json to both persist() and
restore(). On web or in a custom app storage layer, implement your own
SearchlightStorage or use toJson() and fromJson() directly.
Persistence supports reconstructible Searchlight.create() tokenizer settings
such as stemming toggles, stop words, duplicate handling, and skip-property
sets. Databases created with an injected Tokenizer or custom stemmer callback
must be rebuilt instead of serialized.
If a snapshot was created with plugins or replacement components, restore it with the same plugin order and compatible component IDs. Searchlight stores extension compatibility metadata in the snapshot and rejects mismatched restore graphs instead of silently loading into the wrong runtime shape.
You can also work directly with JSON-compatible maps:
void example(Searchlight db) {
  final json = db.toJson();
  final restored = Searchlight.fromJson(json);
  restored.dispose();
}
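On web, or with any string-based store, the map from toJson() can be turned into a string with dart:convert. The sketch below assumes that the map is directly encodable by jsonEncode (the README describes it as JSON-compatible) and that fromJson() accepts the decoded Map<String, dynamic>. The helper names encodeIndex and decodeIndex are hypothetical.

```dart
import 'dart:convert';

import 'package:searchlight/searchlight.dart';

/// Serializes the index to a JSON string for a key-value or browser store.
String encodeIndex(Searchlight db) => jsonEncode(db.toJson());

/// Rebuilds an index from a string produced by [encodeIndex].
Searchlight decodeIndex(String encoded) {
  // Assumption: fromJson() accepts the plain decoded map.
  final map = jsonDecode(encoded) as Map<String, dynamic>;
  return Searchlight.fromJson(map);
}
```

The same restrictions apply as for persist(): a database created with an injected Tokenizer or a custom stemmer callback has to be rebuilt rather than serialized.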
Highlighting and Excerpts
Use the companion package searchlight_highlight after search to build excerpts or render marked matches. It does not change how documents are indexed.
import 'package:searchlight_highlight/searchlight_highlight.dart';

String buildExcerpt(SearchHit hit) {
  final highlighter = Highlight();
  final text = hit.document.getString('content');
  final highlight = highlighter.highlight(text, 'ember');
  return highlight.trim(160);
}
This is a good fit for:
- Search result snippets
- Inline <mark> or TextSpan rendering
- Page-level excerpt generation in Flutter UI
App Integration Pattern
For most apps, you will want a small indexing layer that sits above Searchlight.
Example pattern:
- Define the record shape your app will search.
- Convert your content into that shape.
- Build or restore the index in a repository/service.
- Query from your UI layer.
- Use searchlight_highlight to render excerpts.
The package includes a practical reference implementation:
- example/ shows a Flutter validation app for fixture, snapshot, and desktop-folder indexing flows
- example/tool/build_validation_assets.dart shows a simple extraction-to-index flow used by the example
For a fuller walkthrough, see doc/app-integration.md.
Validation Example
The package includes a validation workflow with:
- Public-safe fixture data under test/fixtures/
- An example-owned local-only .local/ corpus flow for private validation
- A Flutter example app that can load either raw records or a persisted snapshot
License
Apache License 2.0. See LICENSE.
Searchlight is an independent pure Dart reimplementation of Orama. It is not affiliated with or endorsed by the Orama project. See NOTICE for attribution.