mobile_rag_engine 0.18.3
A high-performance, on-device RAG (Retrieval-Augmented Generation) engine for Flutter. Run semantic search completely offline on iOS, Android, and macOS with HNSW vector indexing.
Mobile RAG Engine #
Production-ready, fully local RAG (Retrieval-Augmented Generation) engine for Flutter.
Powered by a Rust core, it runs vector search and embedding generation directly on the device. No servers, no API costs, no network round-trips.
Why this package? #
No Rust Installation Required #
You do NOT need to install Rust, Cargo, or the Android NDK.
This package includes pre-compiled binaries for iOS, Android, and macOS. Just pub add and run.
Performance #
| Feature | Pure Dart | Mobile RAG Engine (Rust) |
|---|---|---|
| Tokenization | Slow | HuggingFace tokenizers (Rust) |
| Vector Search | O(n) | HNSW Index — sub-linear retrieval |
| Memory Usage | High | Copy-minimized Rust core, Float32List zero-copy transport |
Numbers vary by device and corpus. See `benchmark_service` and the 0.18.0 retrieval-hot-path notes in CHANGELOG.md for measured deltas on your own hardware.
100% Offline & Private #
Data never leaves the user's device. Perfect for privacy-focused apps (journals, secure chats, enterprise tools).
Features #
End-to-End RAG Pipeline #
One package, complete pipeline. From any document format to LLM-ready context.
Key Features #
| Category | Features |
|---|---|
| Document Input | PDF, DOCX, Markdown, Plain Text with smart dehyphenation; file-path and UTF-8 ingest fast paths |
| Chunking | Plain-text paragraph/line chunking with heading-aware split and tokenizer hard guard; Markdown structure-aware chunking with header-path metadata |
| Search | HNSW vector + BM25 keyword hybrid search with RRF fusion; metadata-first search with explicit context/chunk hydration |
| Storage | SQLite persistence, HNSW Index persistence (fast startup), connection pooling, resumable indexing |
| Collections | Collection-scoped ingest/search/rebuild via inCollection('id') |
| Performance | Rust core, 10x faster tokenization, thread control, memory optimized |
| Context | Engine-tokenizer exact context budget, adjacent chunk expansion, single source mode |
Requirements #
| Platform | Minimum Version |
|---|---|
| iOS | 13.0+ |
| Android | API 21+ (Android 5.0 Lollipop) |
| macOS | 10.15+ (Catalina) |
ONNX Runtime is bundled automatically via the `onnxruntime` plugin. No additional native setup required.
Installation #
1. Add the dependency #
dependencies:
  mobile_rag_engine: ^0.18.3
2. Download Model Files #
# Create assets folder
mkdir -p assets && cd assets
# Download all-MiniLM-L6-v2 model (INT8 quantized for ARM64, ~23MB)
curl -L -o model.onnx "https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/onnx/model_qint8_arm64.onnx"
curl -L -o tokenizer.json "https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer.json"
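The downloaded files must also be declared as Flutter assets so they ship inside the app bundle. A minimal `pubspec.yaml` fragment matching the `assets/` paths used in Minimal Setup below:

```yaml
flutter:
  assets:
    - assets/model.onnx
    - assets/tokenizer.json
```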
Need multilingual (Korean, CJK, etc.)? See Model Setup Guide for BGE-m3 and other model options.
Quick Index #
Features #
- Adjacent Chunk Retrieval - Fetch surrounding context.
- Index Management - Stats, persistence, and recovery.
- Markdown Chunker - Structure-aware text splitting.
- Multi-Collection - Isolate ingest/search/rebuild by category.
- Prompt Compression - Reduce token usage.
- Search by Source - Filter results by document.
- Search Strategies - Tune ranking and retrieval.
Guides #
- Quick Start - Setup in 5 minutes.
- Model Setup - Choosing and downloading models.
- Release Build - Bundle size optimization for production.
- Troubleshooting - Common fixes.
- FAQ - Frequently asked questions.
Testing #
- Unit Testing - Mocking for isolated tests.
Usage #
Minimal Setup #
Initialize the engine once in your main() function. See the Quick Start Guide for the full parameter table.
await MobileRag.initialize(
tokenizerAsset: 'assets/tokenizer.json',
modelAsset: 'assets/model.onnx',
deferIndexWarmup: true,
);
// Before first search, wait for BM25/HNSW warmup if you deferred it:
if (!MobileRag.instance.isIndexReady) {
await MobileRag.instance.warmupFuture;
}
Adding Documents and Searching #
import 'dart:io';

import 'package:flutter/material.dart';
import 'package:mobile_rag_engine/mobile_rag_engine.dart';

class MySearchScreen extends StatelessWidget {
Future<void> _search() async {
// 1. Add Documents (auto-chunked & embedded). Indexing is auto-managed
// (debounced ~500ms) — only call rebuildIndex() if you need it now.
await MobileRag.instance.addDocument(
'Flutter is a UI toolkit for building apps.',
);
// File / UTF-8 fast paths are useful for large local documents.
await MobileRag.instance.addDocumentFromFile('/path/to/manual.pdf');
final noteBytes = await File('/path/to/notes.md').readAsBytes();
await MobileRag.instance.addDocumentUtf8(noteBytes, name: 'notes.md');
// 2. Search with LLM-ready context
final result = await MobileRag.instance.search(
'What is Flutter?',
tokenBudget: 2000,
);
print(result.context.text); // Ready to send to LLM
  }

  @override
  Widget build(BuildContext context) =>
      ElevatedButton(onPressed: _search, child: const Text('Search'));
}
Handling File Picker Fallback #
addDocumentFromFile is the fastest path because the Rust core reads and chunks the file directly. Some platform pickers (cloud-backed pickers, content URIs without a stable local path, etc.) return data that is not exposed as a real filesystem path. In those cases, fall back to UTF-8 or parsed-text ingestion:
try {
await MobileRag.instance.addDocumentFromFile(path, name: fileName);
} on RagError {
final bytes = await File(path).readAsBytes();
final lower = fileName.toLowerCase();
if (lower.endsWith('.txt') ||
lower.endsWith('.md') ||
lower.endsWith('.markdown')) {
await MobileRag.instance.addDocumentUtf8(bytes, name: fileName);
} else {
final text = await DocumentParser.parse(bytes);
await MobileRag.instance.addDocument(text, name: fileName);
}
}
Metadata-First Search #
Use searchMeta when you want lightweight search metadata first, then explicitly assemble context or hydrate only the chunks you need.
final meta = await MobileRag.instance.searchMeta(
'What is Flutter?',
topK: 10,
);
try {
final context = await MobileRag.instance.assembleContext(
searchHandle: meta.handle,
tokenBudget: 2000,
);
final chunkIds = meta.hits.map((hit) => hit.chunkId.toInt()).toList();
final chunks = await MobileRag.instance.hydrateChunks(
searchHandle: meta.handle,
chunkIds: chunkIds,
);
final excerpts = await MobileRag.instance.getChunkExcerpts(
searchHandle: meta.handle,
chunkIds: chunkIds,
maxBytes: 256,
);
print(context.text);
print('hydrated=${chunks.length}, excerpts=${excerpts.length}');
} finally {
await meta.handle.dispose();
}
Multi-Collection (v1) #
Use collection scopes when you want independent rebuild boundaries per category.
final business = MobileRag.instance.inCollection('business');
final travel = MobileRag.instance.inCollection('travel');
await business.addDocument('Quarterly planning memo...');
await travel.addDocument('Kyoto itinerary...');
if (!travel.isIndexReady) {
await travel.warmupFuture;
}
final travelHits = await travel.searchHybrid('itinerary');
print(travelHits.length);
If you do not specify a collection, the engine uses the default `__default__` collection for backward compatibility.
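Because collections isolate ingest/search/rebuild, documents added through the plain facade land in the default collection and should not surface in a named collection's results. A minimal sketch using only the calls shown above (the isolation behavior is inferred from the collection guarantee, not a verified transcript):

```dart
// Facade calls (no inCollection) write to the default collection.
await MobileRag.instance.addDocument('Release checklist for Q3...');

// A named collection has its own ingest/search/rebuild boundary,
// so it should not see documents from the default collection.
final travel = MobileRag.instance.inCollection('travel');
if (!travel.isIndexReady) {
  await travel.warmupFuture;
}
final hits = await travel.searchHybrid('release checklist');
// Expect no hits here: 'travel' is isolated from __default__.
```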
Advanced Usage: For fine-grained control, use the high-level metadata lane (`searchMeta`, `assembleContext`, `hydrateChunks`, `getChunkExcerpts`) and the public API reference. Most apps should stay on the `MobileRag` facade.
Sample App #
Check out the example application using this package. This desktop app demonstrates full RAG pipeline integration with an LLM (Gemma 2B) running locally on-device.
Contributing #
Bug reports, feature requests, and PRs are all welcome!
License #
This project is licensed under the MIT License.