model2vec 1.2.0
A high-performance Dart wrapper for model2vec-rs using Rust FFI. Generate fast, local, and static text embeddings with minimal memory footprint using Native Assets.
model2vec #
High-performance, local text embeddings for Dart and Flutter. A Dart wrapper around model2vec-rs using Rust FFI and Native Assets. Model2Vec creates small, fast, and effective text embeddings by distilling knowledge from large language models into a simple vocabulary-based look-up table.
Key Features #
- Extreme Performance: Built on a highly optimized Rust engine. Up to 1.7x faster than the official Python implementation, generating an embedding in a few hundred microseconds.
- Compact & Quantized: The smaller models are 25MB to 100MB on disk, which suits edge computing; the largest recommended model is about 500MB.
- Massive Streaming: Built-in `generateEmbeddingStream` for processing millions of rows without blocking the Event Loop or overflowing RAM.
- Hugging Face Integration: Automatically downloads and caches models directly from the Hugging Face Hub.
- Zero-Stutter Async: Transparently runs heavy tokenization and math in background Dart Isolates using `Async` methods.
- Vector Utilities: Ships with high-performance mathematical tools (`cosineSimilarity`, `quantizeToInt8`, `similaritySearch`, etc.).
Recommended Models #
Model2Vec provides a variety of pre-trained models optimized for different use cases. These can be loaded directly via their Hugging Face model ID.
| Model ID | Language | Distilled From | Params | Dimension | Size |
|---|---|---|---|---|---|
| `minishlab/potion-base-32M` | English | bge-base-en-v1.5 | 32.3M | 512 | ~150MB |
| `minishlab/potion-multilingual-128M` | Multi | bge-m3 | 128M | 768 | ~500MB |
| `minishlab/potion-retrieval-32M` | English | bge-base-en-v1.5 | 32.3M | 512 | ~150MB |
| `minishlab/potion-code-16M` | Code | CodeRankEmbed | 16M | 384 | ~80MB |
| `minishlab/potion-base-8M` | English | bge-base-en-v1.5 | 7.5M | 256 | ~50MB |
| `minishlab/potion-base-4M` | English | bge-base-en-v1.5 | 3.7M | 128 | ~30MB |
| `minishlab/potion-base-2M` | English | bge-base-en-v1.5 | 1.8M | 64 | ~25MB |
Installation #
Add model2vec to your pubspec.yaml:

    dependencies:
      model2vec: ^1.2.0

Or add it using the command line:

    dart pub add model2vec

Requires Dart SDK 3.10.0+ and Rust toolchain 1.86.0+ (the native library is built via Native Assets).
Quick Start #

    import 'package:model2vec/model2vec.dart';

    void main() {
      final m2v = Model2Vec.instance;

      // Initialize with a model from Hugging Face
      m2v.initEmbedder('minishlab/potion-base-2M');

      // Generate an embedding
      final embedding = m2v.generateEmbedding('Dart FFI is blazingly fast 🚀');

      print('Vector dimension: ${m2v.embeddingDimension}');
      print('Vocabulary size: ${m2v.vocabularySize}');
    }

Recipes & Patterns #
1. Advanced Batch Processing #
Process multiple strings at once for maximum hardware utilization. You can control sequence truncation and batch sizes.

    final texts = ['Dart', 'Rust', 'Flutter'];
    final embeddings = m2v.generateBatchEmbeddings(
      texts,
      maxLength: 256, // Truncate strings longer than 256 tokens
      batchSize: 1024, // Internal chunks sent to the FFI layer
    );

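As a rough illustration of what a `batchSize` parameter implies, the sketch below splits a list into fixed-size chunks in pure Dart. The `chunked` helper is hypothetical and not part of model2vec; how the library slices batches internally is not documented here.

```dart
/// Hypothetical helper (not part of model2vec): splits [items] into
/// consecutive chunks of at most [size] elements.
List<List<T>> chunked<T>(List<T> items, int size) {
  if (size <= 0) {
    throw ArgumentError.value(size, 'size', 'must be positive');
  }
  final chunks = <List<T>>[];
  for (var start = 0; start < items.length; start += size) {
    final end = start + size < items.length ? start + size : items.length;
    chunks.add(items.sublist(start, end));
  }
  return chunks;
}

void main() {
  final texts = ['Dart', 'Rust', 'Flutter', 'FFI', 'Isolate'];
  // With size 2, the last chunk holds the single leftover element.
  print(chunked(texts, 2)); // [[Dart, Rust], [Flutter, FFI], [Isolate]]
}
```

Larger chunks mean fewer crossings of the FFI boundary at the cost of more memory held per call, which is presumably why the parameter is exposed.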
2. Massive Data Streaming #
When reading gigabytes of text from files or databases, loading everything into memory at once can exhaust RAM and crash the app. Use the streaming API to process the data in chunks automatically.

    import 'dart:convert';
    import 'dart:io';

    // Assumes `m2v` is the initialized Model2Vec instance from Quick Start.
    Future<void> processHugeFile() async {
      final fileStream = File('massive_dataset.txt')
          .openRead()
          .transform(utf8.decoder)
          .transform(const LineSplitter());

      // Converts a Stream<String> into a Stream<Float32List>
      final embeddingStream = m2v.generateEmbeddingStream(
        fileStream,
        batchSize: 500, // Process 500 strings at a time
        useIsolate: true, // Run the math in a background isolate
      );

      await for (final embedding in embeddingStream) {
        // saveToDb is your own persistence function; handling one vector
        // at a time keeps memory use bounded.
        saveToDb(embedding);
      }
    }

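To make the chunking idea concrete, here is a minimal pure-Dart sketch of grouping a `Stream<String>` into batches with an `async*` generator. The `batchStream` helper is hypothetical and only illustrates the general technique; it is not the implementation behind `generateEmbeddingStream`.

```dart
/// Hypothetical helper (not part of model2vec): groups a stream of items
/// into lists of at most [size] elements, emitting the remainder at the end.
Stream<List<T>> batchStream<T>(Stream<T> source, int size) async* {
  if (size <= 0) {
    throw ArgumentError.value(size, 'size', 'must be positive');
  }
  var buffer = <T>[];
  await for (final item in source) {
    buffer.add(item);
    if (buffer.length == size) {
      yield buffer;
      buffer = <T>[]; // Fresh list so already-emitted batches are not mutated.
    }
  }
  if (buffer.isNotEmpty) {
    yield buffer;
  }
}

Future<void> main() async {
  final lines = Stream.fromIterable(['a', 'b', 'c', 'd', 'e']);
  await for (final batch in batchStream(lines, 2)) {
    print(batch); // [a, b], then [c, d], then [e]
  }
}
```

Because only one batch is buffered at a time, peak memory depends on the batch size rather than on the size of the input.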
3. Asynchronous Isolate Execution #
Never block the main thread. If you are building a Flutter app, always use the Async variants to perform generation in a background Isolate.

    final embedding = await m2v.generateEmbeddingAsync('A very long text...');
    final batch = await m2v.generateBatchEmbeddingsAsync(['A', 'B', 'C']);

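For readers unfamiliar with isolates, the following sketch shows the general pattern of offloading CPU-bound work with `Isolate.run` from `dart:isolate`. The `expensiveWork` function is a hypothetical stand-in for heavy computation; whether the library's `Async` methods use `Isolate.run` or a long-lived worker isolate is an assumption not confirmed by this document.

```dart
import 'dart:isolate';
import 'dart:typed_data';

/// Stand-in for CPU-heavy work (hypothetical; not the library's code):
/// fills a small vector with a deterministic pattern derived from the text.
Float32List expensiveWork(String text) {
  final out = Float32List(8);
  for (var i = 0; i < text.length; i++) {
    out[i % out.length] += text.codeUnitAt(i) / 255.0;
  }
  return out;
}

Future<void> main() async {
  // Isolate.run spawns a short-lived isolate, runs the closure there, and
  // completes with its result, so the calling isolate stays responsive.
  final vector = await Isolate.run(() => expensiveWork('A very long text...'));
  print(vector.length); // 8
}
```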
4. Vector Math & Quantization #
The library ships with Model2VecUtils, a suite of math operations tuned for embeddings.

    final query = m2v.generateEmbedding('cat');
    final candidates = [
      m2v.generateEmbedding('dog'),
      m2v.generateEmbedding('space'),
    ];

    // 1. Semantic similarity (cosine)
    final sim = Model2VecUtils.cosineSimilarity(query, candidates[0]);

    // 2. Threshold search (find all candidates with similarity above 0.8)
    final matches = Model2VecUtils.similaritySearchWithThreshold(
      query, candidates, threshold: 0.8,
    );

    // 3. Scalar quantization (compress Float32 to Int8, using 4x less RAM)
    final compressed = Model2VecUtils.quantizeToInt8(query);

    // 4. Mean pooling (average multiple vectors into one)
    final sentenceVector = Model2VecUtils.meanPooling(candidates);

    // 5. DB serialization
    final base64String = Model2VecUtils.toBase64(query);

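To show what the similarity calls compute, here is a self-contained reference sketch of cosine similarity and a threshold search in pure Dart. The names `cosine` and `searchAbove` are hypothetical, and returning 0.0 for zero-length vectors is a convention chosen for this sketch; the library's own implementation may differ.

```dart
import 'dart:math' as math;
import 'dart:typed_data';

/// Reference cosine similarity: dot(a, b) / (|a| * |b|).
/// Returns 0.0 when either vector has zero norm (a convention chosen here).
double cosine(Float32List a, Float32List b) {
  if (a.length != b.length) {
    throw ArgumentError('Vectors must have the same dimension');
  }
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA == 0.0 || normB == 0.0) return 0.0;
  return dot / (math.sqrt(normA) * math.sqrt(normB));
}

/// Indices of all docs whose similarity to [query] exceeds [threshold].
List<int> searchAbove(
    Float32List query, List<Float32List> docs, double threshold) {
  return [
    for (var i = 0; i < docs.length; i++)
      if (cosine(query, docs[i]) > threshold) i,
  ];
}

void main() {
  final query = Float32List.fromList([1.0, 0.0]);
  final docs = [
    Float32List.fromList([2.0, 0.0]), // Same direction: similarity 1.0
    Float32List.fromList([0.0, 3.0]), // Orthogonal: similarity 0.0
  ];
  print(searchAbove(query, docs, 0.8)); // [0]
}
```

Cosine similarity ignores vector length, which is why the first document matches even though its magnitude differs from the query's.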
API Reference #
Core Methods (Model2Vec class) #
| Method / Property | Description |
|---|---|
| `initEmbedder(path)` | Initializes the model from a Hugging Face repo ID or local path. |
| `initEmbedderAdvanced(...)` | Advanced initialization with custom cacheDirectory, hfToken, or normalize overrides. |
| `initEmbedderFromBytes(...)` | Initializes the model directly from raw Uint8List bytes (model.safetensors, tokenizer.json, etc). |
| `getRecommendedModels()` | Returns a list of officially supported models. |
| `tokenize(text)` | Runs the internal BPE tokenizer and returns a List<String>. |
| `generateEmbedding(text)` | Synchronously generates a Float32List embedding vector. |
| `generateBatchEmbeddings(texts)` | Synchronously generates embeddings for a List<String> using Rust SIMD. |
| `generateEmbeddingAsync(text)` | Asynchronously generates an embedding in a background Isolate. |
| `generateEmbeddingStream(stream)` | Processes a huge Stream<String> into a Stream<Float32List> in batches. |
| `embeddingDimension` | Property returning the vector size (e.g., 256, 384, 512). |
| `vocabularySize` | Property returning the number of tokens in the model's vocabulary. |
Math Utilities (Model2VecUtils class) #
| Method | Description |
|---|---|
| `cosineSimilarity(a, b)` | Calculates cosine similarity (-1.0 to 1.0) between two vectors. |
| `cosineDistance(a, b)` | Calculates cosine distance (0.0 to 2.0). |
| `euclideanDistance(a, b)` | Calculates Euclidean (L2) distance. |
| `similaritySearch(query, docs)` | Returns the indices of the Top-K most similar vectors in a database. |
| `similaritySearchWithThreshold` | Returns all indices with similarity above a given threshold. |
| `quantizeToInt8(vector)` | Compresses a Float32List into an Int8List (4x memory savings). |
| `normalize(vector)` | Applies L2 normalization to a vector. |
| `meanPooling(vectors)` | Averages multiple vectors into a single vector. |
| `toBase64` / `fromBase64` | Serializes/Deserializes a vector to/from a Base64 string for DB storage. |
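The sketch below illustrates two of these operations in pure Dart: symmetric scalar quantization to int8 and a Base64 round trip over the raw float bytes. The scheme shown (scaling by the largest magnitude, host byte order) is one common approach and is an assumption here; the names `quantize`, `encode`, and `decode` are hypothetical, and the library's actual quantization scheme and byte layout may differ.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Symmetric scalar quantization (one possible scheme, assumed here):
/// maps values into [-127, 127] using the vector's largest magnitude.
Int8List quantize(Float32List v) {
  var maxAbs = 0.0;
  for (final x in v) {
    if (x.abs() > maxAbs) maxAbs = x.abs();
  }
  final out = Int8List(v.length);
  if (maxAbs == 0.0) return out; // All zeros: nothing to scale.
  for (var i = 0; i < v.length; i++) {
    out[i] = (v[i] / maxAbs * 127).round();
  }
  return out;
}

/// Encodes the raw bytes of a Float32List as Base64 (host byte order).
String encode(Float32List v) =>
    base64Encode(v.buffer.asUint8List(v.offsetInBytes, v.lengthInBytes));

/// Decodes a Base64 string produced by [encode] back into a Float32List.
Float32List decode(String s) {
  final bytes = base64Decode(s);
  // Copy into a fresh buffer so the float view starts at offset 0.
  return Uint8List.fromList(bytes).buffer.asFloat32List();
}

void main() {
  final v = Float32List.fromList([0.5, -1.0, 0.25]);
  print(quantize(v)); // [64, -127, 32]
  print(decode(encode(v))); // [0.5, -1.0, 0.25]
}
```

Each float32 value occupies 4 bytes and each int8 value 1 byte, which is where the 4x memory saving comes from. To recover approximate floats from such a scheme, the scale factor (here `maxAbs / 127`) has to be stored alongside the quantized vector.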
Performance #
model2vec uses highly optimized FFI bindings. Single-vector math on embeddings runs in pure Dart, which avoids FFI call overhead, while batch generation leverages Rust's SIMD auto-vectorization.
Here is a performance benchmark run on a typical machine (AOT compiled):
| Model | Load Time (Cache) | Single Embedding | Batch (32) |
|---|---|---|---|
| `minishlab/potion-base-2M` | ~40 ms | 372.9 μs | 3.85 ms |
| `minishlab/potion-base-4M` | ~40 ms | 363.7 μs | 4.19 ms |
| `minishlab/potion-base-8M` | ~40 ms | 382.1 μs | 5.60 ms |
| `minishlab/potion-base-32M` | ~120 ms | 452.6 μs | 6.79 ms |
| `minishlab/potion-multilingual-128M` | ~1050 ms | 416.1 μs | 5.38 ms |
Note: Initial load times may vary slightly with disk speed. In this benchmark, generating a single embedding takes roughly 360 to 450 μs, and batching 32 strings brings the cost down to about 120 to 210 μs per string.
`similaritySearch` over 100,000 vectors takes <100ms in pure Dart.
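To get a rough feel for brute-force search cost on your own machine, a sketch like the following can help. It is not the benchmark that produced the numbers above: `topK` is a hypothetical helper that ranks by dot product over random 64-dimensional vectors, and timings will vary with hardware and with JIT versus AOT compilation.

```dart
import 'dart:math';
import 'dart:typed_data';

/// Brute-force top-k by dot product (hypothetical helper). For
/// L2-normalized vectors this ranking matches cosine similarity.
List<int> topK(Float32List query, List<Float32List> docs, int k) {
  final scores = Float64List(docs.length);
  for (var i = 0; i < docs.length; i++) {
    final d = docs[i];
    var dot = 0.0;
    for (var j = 0; j < query.length; j++) {
      dot += query[j] * d[j];
    }
    scores[i] = dot;
  }
  // Sort indices by descending score, then keep the first k.
  final order = List<int>.generate(docs.length, (i) => i)
    ..sort((a, b) => scores[b].compareTo(scores[a]));
  return order.take(k).toList();
}

void main() {
  final rng = Random(42);
  Float32List randomVector(int dim) =>
      Float32List.fromList(List.generate(dim, (_) => rng.nextDouble() - 0.5));

  final docs = List.generate(100000, (_) => randomVector(64));
  final query = randomVector(64);

  final watch = Stopwatch()..start();
  final best = topK(query, docs, 5);
  watch.stop();
  print('Top 5: $best in ${watch.elapsedMilliseconds} ms');
}
```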
Development & Contributing #
The library uses Dart Native Assets, meaning cargo build is invoked automatically when running Dart code.
To manually re-build bindings if you modify the Rust C-API (native/src/lib.rs):

    dart run ffigen

To run the test suite:

    dart test

License #
This project is licensed under the MIT License - see the LICENSE file for details.