flutter_gemma_embedder 0.0.1-dev.1
Flutter plugin for on-device embedder, inspired by EmbeddingGemma. Generate text embeddings locally for semantic search and similarity tasks.
# flutter_gemma_embedder

A Flutter plugin for running EmbeddingGemma models locally on mobile devices. Generate high-quality text embeddings for semantic search, retrieval, and similarity tasks using Google's EmbeddingGemma 300M model.
## Features
- 🧠 On-Device AI: Run EmbeddingGemma 300M locally without internet
- 🔍 Semantic Search: Generate embeddings for text similarity and search
- 📱 Cross-Platform: Android, iOS, and Web support
- ⚡ Multiple Variants: Support for different sequence lengths (256, 512, 1024, 2048 tokens)
- 🎯 Task-Specific: Optimized for retrieval tasks with proper prompting
- 🚀 GPU/CPU Backend: Choose optimal backend for your device
- 🔒 Privacy-First: All processing happens locally on device
## Supported Models

| Model | Sequence Length | Model Size | Use Case |
|---|---|---|---|
| EmbeddingGemma 300M (256) | 256 tokens | 179 MB | Mobile & Real-time (~200 words) |
| EmbeddingGemma 300M (512) | 512 tokens | 187 MB | General Purpose (~400 words) |
| EmbeddingGemma 300M (1024) | 1024 tokens | 191 MB | Content Analysis (~800 words) |
| EmbeddingGemma 300M (2048) | 2048 tokens | 196 MB | Research & Documents (~1600 words) |
All models generate 768-dimensional embeddings optimized for retrieval tasks.
## Platform Support

| Platform | Status | Notes |
|---|---|---|
| Android | ✅ Full | GPU and CPU backends |
| iOS | ✅ Full | GPU and CPU backends |
| Web | 🔶 Partial | CPU backend only |
## Installation

Add this to your `pubspec.yaml`:

```yaml
dependencies:
  flutter_gemma_embedder: ^0.0.1-dev.1
```
## Quick Start

### 1. Initialize the Plugin

```dart
import 'package:flutter_gemma_embedder/flutter_gemma_embedder.dart';

final embedder = FlutterGemmaEmbedder.instance;
```
### 2. Create and Load a Model

```dart
// Create a model instance
final model = await embedder.createModel(
  modelPath: '/path/to/model.tflite',
  modelType: EmbeddingModelType.embeddingGemma300M,
  dimensions: 768,
  taskType: EmbeddingTaskType.retrieval,
  backend: PreferredBackend.gpu,
);

// Initialize the model
await model.initialize();
```
### 3. Generate Embeddings

```dart
// Generate an embedding for a single text
final embedding = await model.encode('Your text here');
print('Embedding dimensions: ${embedding.length}');

// Generate embeddings for multiple texts
final embeddings = await model.batchEncode([
  'First document',
  'Second document',
  'Third document',
]);
```
### 4. Calculate Similarity

```dart
final text1 = 'Flutter is a UI toolkit';
final text2 = 'Flutter helps build mobile apps';

final embedding1 = await model.encode(text1);
final embedding2 = await model.encode(text2);

final similarity = model.cosineSimilarity(embedding1, embedding2);
print('Similarity: ${similarity.toStringAsFixed(4)}');
```
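Under the hood, cosine similarity is the dot product of the two vectors divided by the product of their norms. A minimal standalone sketch in plain Dart (independent of the plugin, shown only to illustrate the math):

```dart
import 'dart:math' as math;

/// Cosine similarity between two equal-length vectors:
/// dot(a, b) / (||a|| * ||b||). Result lies in [-1, 1].
double cosineSimilarity(List<double> a, List<double> b) {
  assert(a.length == b.length, 'vectors must have the same length');
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (math.sqrt(normA) * math.sqrt(normB));
}

void main() {
  // Orthogonal vectors score 0; parallel vectors score ~1
  print(cosineSimilarity([1.0, 0.0], [0.0, 1.0])); // 0.0
  print(cosineSimilarity([1.0, 2.0], [2.0, 4.0])); // ~1.0
}
```

In practice, prefer the plugin's built-in `model.cosineSimilarity`, which operates on the embeddings it produced.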
## Advanced Usage

### Task-Specific Prompting

The plugin automatically applies task-specific prompts for optimal embeddings:

```dart
// For retrieval tasks (default)
final model = await embedder.createModel(
  // ... other params
  taskType: EmbeddingTaskType.retrieval,
);

// The plugin automatically formats your text as:
// "Represent this sentence for searching relevant passages: YOUR_TEXT"
final embedding = await model.encode('machine learning algorithms');
```
### Matryoshka Embeddings

Reduce embedding dimensions for faster downstream processing:

```dart
// Use only the first 512 dimensions instead of the full 768
final embedding = await model.encode(
  'Your text here',
  outputDimensionality: 512,
);
```
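One caveat worth knowing: a truncated Matryoshka embedding is generally no longer unit-length. If your downstream index or similarity code assumes unit vectors, re-normalize after truncation. A plain-Dart sketch (assuming the truncated vector is returned as-is, without renormalization):

```dart
import 'dart:math' as math;

/// Keep the first [dims] values of an embedding and L2-normalize
/// the result, so downstream code can assume unit-length vectors.
List<double> truncateAndNormalize(List<double> embedding, int dims) {
  final truncated = embedding.sublist(0, dims);
  final norm = math.sqrt(truncated.fold<double>(0, (s, v) => s + v * v));
  if (norm == 0) return truncated; // avoid division by zero
  return truncated.map((v) => v / norm).toList();
}
```

Cosine similarity itself divides by the norms, so it is unaffected; the renormalization matters mainly for dot-product-based vector stores.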
### Batch Processing

Process multiple texts efficiently:

```dart
final documents = [
  'Document 1 content',
  'Document 2 content',
  'Document 3 content',
];

final embeddings = await model.batchEncode(documents);

// Find the document most similar to a query
final query = 'search query';
final queryEmbedding = await model.encode(query);

double bestSimilarity = -1;
int bestIndex = -1;
for (int i = 0; i < embeddings.length; i++) {
  final similarity = model.cosineSimilarity(queryEmbedding, embeddings[i]);
  if (similarity > bestSimilarity) {
    bestSimilarity = similarity;
    bestIndex = i;
  }
}

print('Most similar document: ${documents[bestIndex]}');
print('Similarity score: ${bestSimilarity.toStringAsFixed(4)}');
```
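The single-best-match loop generalizes to top-k ranking by sorting index/score pairs. A small plain-Dart helper (hypothetical, not part of the plugin API) that takes a precomputed score per document:

```dart
/// Return the indices of the [k] highest scores, best first.
/// [scores] maps candidate index -> similarity score.
List<int> topK(List<double> scores, int k) {
  final indices = List<int>.generate(scores.length, (i) => i);
  indices.sort((a, b) => scores[b].compareTo(scores[a])); // descending
  return indices.take(k).toList();
}
```

Usage: compute `scores[i] = model.cosineSimilarity(queryEmbedding, embeddings[i])` for each document, then `topK(scores, 3)` yields the three best document indices.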
## Model Management

### Download Models

Models must be downloaded and stored locally. The example app ships configuration helpers for model management:

```dart
// Note: this config lives in the example app, not the plugin itself
import 'package:flutter_gemma_embedder_example/models/embedding_model_config.dart';

// Choose a model configuration
final config = EmbeddingModelConfig.embeddingGemma300M_seq512;

// Download and store the model
// (implementation depends on your download strategy)
```
### Model File Structure

Store your `.tflite` model files in the app's documents directory:

```
Documents/
└── embeddinggemma-300M_seq512_mixed-precision.tflite
```
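At runtime, that directory is usually resolved via the `path_provider` package (an assumption here; the plugin itself does not prescribe a storage mechanism, and the helper below is hypothetical):

```dart
import 'dart:io';

import 'package:path_provider/path_provider.dart';

/// Resolve the on-disk path for a model file, assuming it was
/// downloaded into the app's documents directory.
Future<String> resolveModelPath(String fileName) async {
  final dir = await getApplicationDocumentsDirectory();
  final path = '${dir.path}/$fileName';
  if (!File(path).existsSync()) {
    throw StateError('Model not found at $path - download it first');
  }
  return path;
}
```

The resulting path can be passed as `modelPath` to `embedder.createModel`.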
## Performance Tips

### Backend Selection

- **GPU backend**: faster inference, higher memory usage
- **CPU backend**: lower memory usage, slower inference

```dart
// For better performance on newer devices
final model = await embedder.createModel(
  // ... other params
  backend: PreferredBackend.gpu,
);

// For memory-constrained devices
final model = await embedder.createModel(
  // ... other params
  backend: PreferredBackend.cpu,
);
```
### Model Selection

Choose the right model variant for your use case:

- **256 tokens**: fast inference for short texts (tweets, titles)
- **512 tokens**: balanced performance for medium texts (paragraphs)
- **1024 tokens**: high capacity for long texts (articles)
- **2048 tokens**: maximum capacity for very long texts (documents)
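Variant selection can be automated from an approximate word count, using the rough word capacities from the table above (the helper is hypothetical and the thresholds are estimates, not plugin constants):

```dart
/// Pick a sequence-length variant from an approximate word count,
/// based on the rough capacities listed in the model table.
int pickSequenceLength(int approxWords) {
  if (approxWords <= 200) return 256;
  if (approxWords <= 400) return 512;
  if (approxWords <= 800) return 1024;
  return 2048;
}
```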
## Example App

The plugin includes a complete example app demonstrating:

- Model selection with filtering and sorting
- Model download with progress tracking
- Text embedding generation
- Similarity comparison
- Real-time inference

Run the example:

```shell
cd example
flutter run
```
## API Reference

### FlutterGemmaEmbedder

The main plugin singleton for creating embedding models.

```dart
class FlutterGemmaEmbedder {
  static FlutterGemmaEmbedder get instance;

  Future<EmbeddingModel> createModel({
    required String modelPath,
    required EmbeddingModelType modelType,
    required int dimensions,
    required EmbeddingTaskType taskType,
    required PreferredBackend backend,
  });
}
```
### EmbeddingModel

The core class for text embedding operations.

```dart
class EmbeddingModel {
  Future<void> initialize();

  Future<List<double>> encode(String text, {int? outputDimensionality});

  Future<List<List<double>>> batchEncode(
    List<String> texts, {
    int? outputDimensionality,
  });

  double cosineSimilarity(List<double> a, List<double> b);

  void dispose();
}
```
### Enums

```dart
enum EmbeddingModelType {
  embeddingGemma300M,
}

enum EmbeddingTaskType {
  retrieval,
  // More task types may be added in future versions
}

enum PreferredBackend {
  cpu,
  gpu,
}
```
## Requirements

### Android

- Minimum SDK: 21 (Android 5.0)
- Target SDK: 34 (Android 14)
- NDK support for TensorFlow Lite

### iOS

- iOS 12.0+
- TensorFlow Lite Swift framework

### Web

- Modern browsers with WebAssembly support
- CPU backend only
## Troubleshooting

### Common Issues

**Model loading fails**

- Ensure the model file exists at the specified path
- Check file permissions
- Verify the model file is not corrupted

**Out-of-memory errors**

- Use the CPU backend instead of GPU
- Choose a smaller model variant (256 or 512 tokens)
- Process texts in smaller batches

**Slow inference**

- Use the GPU backend on supported devices
- Choose an appropriate model size for your use case
- Enable device GPU acceleration

**Web platform limitations**

- Only the CPU backend is supported
- Large models may cause memory issues
- Consider using smaller model variants
## License

This project is licensed under the BSD 3-Clause License; see the LICENSE file for details.
## Acknowledgments
- Google AI for the EmbeddingGemma models
- TensorFlow Lite for on-device inference
- Flutter team for the excellent plugin architecture
Built with ❤️ by the Flutter community