mobile_rag_engine 0.10.0
mobile_rag_engine: ^0.10.0 copied to clipboard
A high-performance, on-device RAG (Retrieval-Augmented Generation) engine for Flutter. Run semantic search completely offline on iOS and Android with HNSW vector indexing.
Changelog #
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
0.10.0 #
- Testing Support: Added
mocktaildev dependency and fixed mock testing utilities. - Internal Refactoring: Improved
SourceRagServiceand internal APIs. - Documentation: Updated guides and examples.
0.9.3 #
- Core Engine Update: Incorporates
rag_engine_flutterv0.9.2 improvements.- Enhanced markdown chunking with structure preservation for code blocks and tables.
- Added linking metadata for split code blocks to support context reconstruction.
- Improved handling of large tables with automatic header repetition.
0.9.2 #
Changed #
- README Overhaul: Redesigned Features section with End-to-End RAG Pipeline diagram and Key Features table.
- Documentation: Streamlined structure by consolidating Architecture and Benchmarks into Features section.
- Visual Improvements: Updated all images to consistent width (860px) for better presentation.
0.9.1 #
0.9.0 #
Added #
- ThreadUseLevel API: New high-level thread configuration with
ThreadUseLevel.low(~20%),medium(~40%), andhigh(~80%) options for easier CPU usage control. - Memory Optimization: ONNX model is now loaded from file instead of memory buffer, reducing Dart heap usage by ~20-50MB (model size).
Changed #
- Documentation: Updated README and Quick Start guide with
threadLevelparameter and full parameter table. - Architecture Section: Updated to reflect Hybrid Search (HNSW + BM25 with RRF fusion).
Fixed #
- API Consistency:
threadLevelandembeddingIntraOpNumThreadsare now mutually exclusive (throwsAssertionErrorif both set).
0.8.0 #
Added #
- Independent Source Search (Exact Scan): When filtering by
sourceIds, the engine now switches to a brute-force scan of ALL chunks in that source. This guarantees perfect recall within a specific document, bypassing global index limitations. - Advanced Documentation: Expanded "Quick Start" guide with "Advanced Features" (Cached Index, LLM Context, usage of
searchHybridWithContext). - API Improvements: Exported
SourceStatstype for easier usage ofgetStats().
Fixed #
- PDF Text Extraction: Improved "Smart Dehyphenation" to correctly handle broken newlines in Korean text (joining words split by line breaks incorrectly).
- Example App: Fixed crash when deleting sources caused by incorrect
BuildContext.
0.7.11 #
Fixed #
- Reverted Model Integration: Reverted changes related to
ko-srobertaintegration due to ONNX runtime compatibility issues (Invalid Feed Input Name:token_type_ids). - Stability: Restored original embedding logic compatible with standard models (e.g.,
bge-m3,all-MiniLM-L6-v2).
0.7.10 (Withdrawn) #
- Attempted
ko-srobertaintegration (caused runtime errors).
0.7.9 #
Fixed #
- Library Exports: Added missing exports for
BenchmarkService,QualityTestService,PromptCompressor, andSemanticChunktypes. Now all services are accessible via the main library import. - Example App: Fixed internal import paths in example code to use the public API.
0.7.8 #
Changed #
- API Clean-up: Refactored global functions into namespaced classes for better DX.
extractTextFrom*→DocumentParser.*parseUserIntent→IntentParser.classify
- Error Handling: Exported
RagErrorclass for proper error catching. - Documentation:
- Updated
quick_start.mdandexample/example.mdto match the new API. - Updated
mobile_rag_engine.dartAPI usage examples.
- Updated
0.7.6 #
Fixed #
- Duplicate Logs: Fixed issue where Rust logs were printed twice (both to console and Dart stream).
- Log Format: Simplified log format from
[Rust] [INFO] messageto[INFO] message.
Changed #
- Logger: Rust logger now only uses
println!when Dart stream is not connected, avoiding duplicate output.
0.7.5 #
Fixed #
- BM25 Search: Fixed critical bug where BM25 index was never built for Source RAG, causing Hybrid Search to only use Vector search.
- Hybrid Search Accuracy: BM25 keyword matching now works correctly alongside Vector similarity search.
Changed #
- Initialization: Both HNSW and BM25 indexes are now rebuilt on app startup to ensure Hybrid Search works immediately.
- Index Rebuild:
rebuildIndex()now rebuilds both HNSW (vector) and BM25 (keyword) indexes.
Added #
rebuildChunkBm25Index(): New low-level API for manually rebuilding BM25 index (internal use).
0.7.1 #
Documentation #
- README: Added "Sample App" section with screenshot and link to
mobile-ondevice-rag-desktopexample app.
0.7.0 #
Added #
- Hybrid Search API: New
searchHybrid()combining Vector and BM25 search for better accuracy. - Context Assembly: New
searchHybridWithContext()generates optimized prompts for LLMs. - Metadata Support:
addDocument()now acceptsmetadata(e.g., filenames, page numbers), which is preserved in search results.
Changed #
- Prompt Format: Converted LLM context format to use XML tags (
<document>...) instead of text headers for better parsing by modern LLMs. - Internal: Updated to use
rag_engine_flutter0.7.0 with schema changes.
0.6.0 #
Added #
- DB Connection Pool: Implemented
r2d2based connection pooling - Performance: Search operations are now 50-90% faster (100ms -> 11ms)
- Automatic Initialization:
RagEnginenow automatically manages connection pool lifecycle
Changed #
- Internal: Refactored database operations to share connections efficiently
- API: Internal Rust API no longer requires
db_pathfor every operation - README Quick Start: Updated to showcase new simplified
RagEngineAPI - Documentation: Rewrote
docs/guides/quick_start.mdwithRagEngineexamples - Example app: Refactored
main.dartto useRagEngineinstead of manual initialization - Library exports: Updated
mobile_rag_engine.dartwith new Quick Start example in docstring
Migration Guide #
Before (0.4.x):
final dir = await getApplicationDocumentsDirectory();
await _copyAssetToFile('assets/tokenizer.json', tokenizerPath);
await initTokenizer(tokenizerPath: tokenizerPath);
final modelBytes = await rootBundle.load('assets/model.onnx');
await EmbeddingService.init(modelBytes.buffer.asUint8List());
_ragService = SourceRagService(dbPath: dbPath);
await _ragService!.init();
After (0.5.0):
final rag = await RagEngine.initialize(
config: RagConfig.fromAssets(
tokenizerAsset: 'assets/tokenizer.json',
modelAsset: 'assets/model.onnx',
),
);
0.5.3 #
Added #
- Singleton Pattern: Introduced
MobileRagclass for simplified, global access to the engineMobileRag.initialize(): Single-line initialization that handles Rust FFI, Config, and DatabaseMobileRag.instance: Static accessor for using the engine anywhere in the app
- Auto-Initialization: Eliminated the need to manually call
RustLib.init()
Changed #
- API Exports: Hides low-level Rust API by default to improve IDE auto-completion relevance
- Documentation: Updated all guides and examples to use the new
MobileRagsingleton pattern
0.5.0 #
Added #
RagEngineclass: New unified API that simplifies initialization from ~40 lines to ~3 linesRagEngine.initialize()handles tokenizer, ONNX model, and database setup automaticallyRagEngine.initialize()handles tokenizer, ONNX model, and database setup automaticallyRagConfig.fromAssets()for convenient asset-based configuration- Delegates to
SourceRagServiceinternally, maintaining full functionality
RagConfigclass: Configuration object forRagEnginewith chunking and database options- Progress callback:
onProgressparameter inRagEngine.initialize()for status updates
0.4.4 #
Added #
- Documentation: Added comprehensive guides in
docs/guides/:quick_start.md- Get started in 5 minutesmodel_setup.md- Model selection, download, deployment strategiesfaq.md- Frequently asked questionstroubleshooting.md- Problem solving guide
- README: Added Requirements section with platform minimum versions (iOS 13.0+, Android API 21+, macOS 10.15+)
- README: Added Documentation section with links to all guides
Changed #
- README: Enhanced Model Options table with dimensions, max tokens, and language support info
- README: Updated all doc links to absolute GitHub URLs for pub.dev compatibility
0.4.3 #
Added #
- PDF/DOCX Text Extraction: New
extractTextFromPdf(),extractTextFromDocx(), andextractTextFromDocument()functions - Markdown Structure-Aware Chunking: New
markdownChunk()function with header path inheritance, code block/table preservation - API: Added
removeSource(id)toSourceRagServicefor deleting documents - Smart Dehyphenation: Automatically rejoins words split by line breaks and page boundaries
- Page Number Removal: Strips standalone page numbers from PDF text extraction
- macOS Entitlements: Added file read permissions for macOS file picker support
- Documentation: Enhanced
example/example.mdwith PDF/DOCX handling and document management examples
Changed #
- Project Structure Cleanup: Removed duplicate
/rust/directory; consolidated Rust source torust_builder/rust/only - Flutter Rust Bridge Config: Updated
rust_rootpath influtter_rust_bridge.yaml - Rust Core: Added
pdf-extract,docx-lite, andregexcrates for document processing
Fixed #
- PDF Text Extraction: Fixed issue where paragraph breaks were removed during text normalization
- Safety: Added 50MB limit for document extraction to prevent OOM
0.4.0 #
Changed #
- README Cleanup: Removed all emojis and unnecessary sections for cleaner documentation
0.3.8 #
Changed #
- ONNX Runtime: Reverted to
onnxruntime ^1.4.1for CocoaPods compatibility (1.23.2 not yet available) - README: Added benchmark result screenshots (iOS/Android) and architecture diagram
- Platform Support: Removed Linux/Windows from publish (no pre-compiled binaries available)
Removed #
- ChunkingTestScreen: Removed unnecessary test screen from example app
Added #
- Android Platform: Added Android support to example app
0.3.7 #
Changed #
- ONNX Runtime Upgrade: Migrated from
onnxruntimetoonnxruntime_v2(v1.23.2) with optional GPU acceleration support - README Remake: Completely redesigned README with "No Rust Installation Required" emphasis, accurate benchmarks, and Mermaid architecture diagram
- Benchmark UI Overhaul: Visual separation of Rust-powered (fast) vs ONNX (standard) operations with category headers and icons
Added #
- GPU Acceleration Option:
EmbeddingService.init()now acceptsuseGpuAccelerationparameter (CoreML/NNAPI support, disabled by default) - macOS Support for Example App: Example app now supports macOS platform
- Benchmark Categories: Results now grouped by
BenchmarkCategory.rustandBenchmarkCategory.onnx
Fixed #
- Pub Point Warning: Removed non-existent
assets/directory reference from pubspec.yaml - Static Analysis: Fixed all lint issues (unnecessary imports, avoid_print, curly braces)
0.3.5 #
- Globalization: Removed all Korean text and logic, replaced with English.
- Updated prompt builder and semantic chunker for better international support.
- Updated default language settings to English.
0.3.4 #
- Fix model download URLs in README (use correct Teradata/bge-m3 and BAAI/bge-m3 sources)
- Add production model deployment strategies guide
0.3.3 #
- Improve README with Quick Start guide and model download instructions
- Update to pub.dev dependency instead of git
0.3.2 #
- Update
rag_engine_flutterdependency to^0.3.0(fixes platform directory issue)
0.3.1 - 2026-01-08 #
Fixed #
- Package structure fix: Update
rag_engine_flutterdependency to v0.2.0 which includes rust/ source
0.3.0 - 2026-01-08 #
0.2.0 - 2024-12-08 #
Added #
- LLM-Optimized Chunking: Introduced
ChunkingServicewith Recursive Character Splitting and Overlap support. - Improved Data Model: Separated storage into
Source(original document) andChunk(searchable parts). - Context Assembly: Added
ContextBuilderto intelligently assemble LLM context within a token budget. - High-Level API: New
SourceRagServicefor automated chunking, embedding, and indexing pipeline. - Context Strategies: Support for
relevanceFirst,diverseSources, andchronologicalcontext assembly strategies.
0.1.0 - 2024-12-08 #
Added #
- Initial release
- On-device semantic search with HNSW vector indexing
- Rust-powered tokenization via HuggingFace tokenizers
- ONNX Runtime integration for embedding generation
- SQLite-based vector storage with content deduplication
- Batch embedding support with progress callback
- Cross-platform support (iOS and Android)
Features #
initDb()- Initialize SQLite databaseaddDocument()- Add documents with SHA256 deduplicationsearchSimilar()- HNSW-based semantic searchrebuildHnswIndex()- Manual index rebuildEmbeddingService.embed()- Generate embeddingsEmbeddingService.embedBatch()- Batch embedding
Performance #
- HNSW search: O(log n) complexity
- Tokenization: ~0.8ms for short text
- Embedding: ~4ms for short text, ~36ms for long text
- Search (100 docs): ~1ms