mobile_rag_engine 0.4.0
mobile_rag_engine: ^0.4.0
A high-performance, on-device RAG (Retrieval-Augmented Generation) engine for Flutter. Run semantic search completely offline on iOS and Android with HNSW vector indexing.
# Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
## 0.4.0
### Changed
- README Cleanup: Removed all emojis and unnecessary sections for cleaner documentation
## 0.3.8
### Changed
- ONNX Runtime: Reverted to `onnxruntime ^1.4.1` for CocoaPods compatibility (1.23.2 not yet available)
- README: Added benchmark result screenshots (iOS/Android) and architecture diagram
- Platform Support: Removed Linux/Windows from the published platforms (no pre-compiled binaries available)
### Removed
- ChunkingTestScreen: Removed unnecessary test screen from the example app
### Added
- Android Platform: Added Android support to the example app
## 0.3.7
### Changed
- ONNX Runtime Upgrade: Migrated from `onnxruntime` to `onnxruntime_v2` (v1.23.2) with optional GPU acceleration support
- README Remake: Completely redesigned README with "No Rust Installation Required" emphasis, accurate benchmarks, and Mermaid architecture diagram
- Benchmark UI Overhaul: Visual separation of Rust-powered (fast) vs ONNX (standard) operations with category headers and icons
### Added
- GPU Acceleration Option: `EmbeddingService.init()` now accepts a `useGpuAcceleration` parameter (CoreML/NNAPI support, disabled by default); see the sketch after this list
- macOS Support for Example App: The example app now supports the macOS platform
- Benchmark Categories: Results now grouped by `BenchmarkCategory.rust` and `BenchmarkCategory.onnx`
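A minimal sketch of turning on the optional GPU path, assuming only the `useGpuAcceleration` parameter documented above; any other `EmbeddingService.init()` arguments (model or tokenizer paths, for example) are omitted because this changelog does not confirm them:

```dart
import 'package:mobile_rag_engine/mobile_rag_engine.dart';

Future<void> initEmbeddingsWithGpu() async {
  // useGpuAcceleration was added in 0.3.7: CoreML on iOS, NNAPI on Android.
  // It defaults to false, so the standard CPU path is unchanged.
  // Other required init() arguments (if any) are intentionally not shown here.
  await EmbeddingService.init(
    useGpuAcceleration: true,
  );
}
```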
### Fixed
- Pub Points Warning: Removed non-existent `assets/` directory reference from pubspec.yaml
- Static Analysis: Fixed all lint issues (unnecessary imports, avoid_print, curly braces)
## 0.3.5
- Globalization: Removed all Korean text and Korean-specific logic, replacing them with English.
- Updated prompt builder and semantic chunker for better international support.
- Updated default language settings to English.
## 0.3.4
- Fix model download URLs in README (use correct Teradata/bge-m3 and BAAI/bge-m3 sources)
- Add production model deployment strategies guide
## 0.3.3
- Improve README with Quick Start guide and model download instructions
- Update to pub.dev dependency instead of git
## 0.3.2
- Update `rag_engine_flutter` dependency to `^0.3.0` (fixes platform directory issue)
## 0.3.1 - 2026-01-08
### Fixed
- Package structure fix: Update `rag_engine_flutter` dependency to v0.2.0, which includes the `rust/` source
## 0.3.0 - 2026-01-08
## 0.2.0 - 2024-12-08
### Added
- LLM-Optimized Chunking: Introduced `ChunkingService` with Recursive Character Splitting and Overlap support.
- Improved Data Model: Separated storage into `Source` (original document) and `Chunk` (searchable parts).
- Context Assembly: Added `ContextBuilder` to intelligently assemble LLM context within a token budget.
- High-Level API: New `SourceRagService` for an automated chunking, embedding, and indexing pipeline (see the sketch after this list).
- Context Strategies: Support for `relevanceFirst`, `diverseSources`, and `chronological` context assembly strategies.
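A minimal sketch of how these pieces could fit together; only the class names (`SourceRagService`, `ContextBuilder`) and the strategy identifiers come from the entries above, while every method name, parameter, and the `ContextStrategy` enum below are illustrative assumptions, not the package's confirmed API:

```dart
import 'package:mobile_rag_engine/mobile_rag_engine.dart';

Future<void> buildContextForQuery(String query) async {
  // High-level pipeline: chunk -> embed -> index.
  // addSource() and its named arguments are assumed names for illustration.
  final rag = SourceRagService();
  await rag.addSource(
    title: 'user_manual.md',
    content: '...long document text...',
  );

  // Retrieve relevant chunks, then assemble LLM context within a token budget.
  // search(), topK, maxTokens, strategy, and ContextStrategy are assumptions.
  final chunks = await rag.search(query, topK: 8);
  final builder = ContextBuilder(
    maxTokens: 2048,
    strategy: ContextStrategy.relevanceFirst, // or diverseSources / chronological
  );
  final context = builder.build(chunks);
  print(context);
}
```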
## 0.1.0 - 2024-12-08
### Added
- Initial release
- On-device semantic search with HNSW vector indexing
- Rust-powered tokenization via HuggingFace tokenizers
- ONNX Runtime integration for embedding generation
- SQLite-based vector storage with content deduplication
- Batch embedding support with progress callback
- Cross-platform support (iOS and Android)
### Features
- `initDb()` - Initialize SQLite database
- `addDocument()` - Add documents with SHA256 deduplication
- `searchSimilar()` - HNSW-based semantic search (see the sketch after this list)
- `rebuildHnswIndex()` - Manual index rebuild
- `EmbeddingService.embed()` - Generate embeddings
- `EmbeddingService.embedBatch()` - Batch embedding
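A minimal end-to-end sketch of this initial API; the function names come from the feature list above, but the argument shapes (document string, query string, `topK`) are assumptions rather than confirmed signatures:

```dart
import 'package:mobile_rag_engine/mobile_rag_engine.dart';

Future<void> quickStart() async {
  // 1. Create/open the SQLite-backed vector store.
  await initDb();

  // 2. Add a document; duplicate content is skipped via SHA256 deduplication.
  //    The single-string argument is an assumption for illustration.
  await addDocument('Flutter is a UI toolkit for building natively compiled apps.');

  // 3. Semantic search over the HNSW index (topK is an assumed parameter name).
  final results = await searchSimilar('cross-platform UI framework', topK: 5);
  for (final result in results) {
    print(result);
  }

  // 4. Optionally rebuild the index after bulk inserts.
  await rebuildHnswIndex();
}
```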
### Performance
- HNSW search: O(log n) complexity
- Tokenization: ~0.8ms for short text
- Embedding: ~4ms for short text, ~36ms for long text
- Search (100 docs): ~1ms