liquid_ai #
Run powerful on-device AI models in your Flutter apps with the LEAP SDK. Supports text generation, streaming chat, structured JSON output, function calling, and vision models - all running locally on iOS and Android.
Features #
- On-Device Inference - Run AI models locally without internet connectivity
- Streaming Responses - Real-time token-by-token text generation
- Structured Output - Constrain model output to JSON schemas with automatic validation
- Function Calling - Define tools the model can invoke with typed parameters
- Vision Models - Analyze images with multimodal vision-language models
- Model Catalog - Browse and filter 20+ optimized models for different tasks
- Progress Tracking - Monitor download and loading progress with detailed events
- Resource Management - Efficient memory handling with explicit lifecycle control
Platform Support #
| Platform | Supported | Notes |
|---|---|---|
| iOS | Yes | iOS 17.0+, SPM (default) or CocoaPods |
| Android | Yes | API 31+ (Android 12) |
| macOS | No | Not yet supported |
| Web | No | Native inference only |
Quick Start #
Installation #
Add liquid_ai to your pubspec.yaml:
dependencies:
liquid_ai: ^1.2.2
iOS Setup #
Swift Package Manager (default, recommended):
SPM is enabled by default in Flutter 3.24+. No additional setup required.
CocoaPods (alternative):
If you need to use CocoaPods, add the LEAP SDK git source to your ios/Podfile:
target 'Runner' do
# Add LEAP SDK from git (required for v0.9.x)
pod 'Leap-SDK', :git => 'https://github.com/Liquid4All/leap-ios.git', :tag => 'v0.9.2'
pod 'Leap-Model-Downloader', :git => 'https://github.com/Liquid4All/leap-ios.git', :tag => 'v0.9.2'
# ... rest of your Podfile
end
Then disable SPM in your pubspec.yaml:
flutter:
config:
enable-swift-package-manager: false
Basic Usage #
import 'package:liquid_ai/liquid_ai.dart';
// Initialize the SDK
final liquidAi = LiquidAi();
// Find a model from the catalog
final model = ModelCatalog.findBySlug('LFM2.5-1.2B-Instruct')!;
const quantization = ModelQuantization.q4KM;
// Load the model (downloads if needed)
ModelRunner? runner;
await for (final event in liquidAi.loadModel(model.slug, quantization.slug)) {
if (event is LoadCompleteEvent) {
runner = event.runner;
}
}
// Create a conversation and generate text
final conversation = await runner!.createConversation(
systemPrompt: 'You are a helpful assistant.',
);
final response = await conversation.generateText('Hello!');
print(response);
// Clean up
await conversation.dispose();
await runner!.dispose();
Model Loading #
Models are downloaded automatically on first use and cached locally. Track progress with load events:
// Use the catalog and enums for type safety
final model = ModelCatalog.findBySlug('LFM2.5-1.2B-Instruct')!;
const quantization = ModelQuantization.q4KM;
// Or use the model's default quantization
final defaultQuant = model.defaultQuantization;
await for (final event in liquidAi.loadModel(model.slug, quantization.slug)) {
switch (event) {
case LoadStartedEvent():
print('Starting download...');
case LoadProgressEvent(:final progress):
print('${(progress.progress * 100).toStringAsFixed(1)}%');
if (progress.speed != null) {
print('Speed: ${progress.speed! ~/ 1024} KB/s');
}
case LoadCompleteEvent(:final runner):
print('Ready!');
// Use runner to create conversations
case LoadErrorEvent(:final error):
print('Failed: $error');
case LoadCancelledEvent():
print('Cancelled');
}
}
Load Options #
Configure the inference engine when loading models:
await for (final event in liquidAi.loadModel(
model.slug,
quantization.slug,
options: LoadOptions(
contextSize: 4096, // Maximum context window
batchSize: 512, // Batch size for prompt processing
threads: 4, // Number of CPU threads
gpuLayers: 32, // Layers to offload to GPU (if available)
),
)) {
// Handle events...
}
Load from Local File #
Load a model directly from a file path (useful for custom or bundled models):
await for (final event in liquidAi.loadModelFromPath(
'/path/to/model.gguf',
options: LoadOptions(contextSize: 2048),
)) {
if (event is LoadCompleteEvent) {
runner = event.runner;
}
}
Model Status #
// Check if already downloaded
final downloaded = await liquidAi.isModelDownloaded(model.slug, quantization.slug);
// Get detailed status
final status = await liquidAi.getModelStatus(model.slug, quantization.slug);
// Delete to free storage
await liquidAi.deleteModel(model.slug, quantization.slug);
Download from URL (Hugging Face Support) #
Download models directly from any URL, including Hugging Face:
await for (final event in liquidAi.downloadModelFromUrl(
url: 'https://huggingface.co/user/model/resolve/main/model.gguf?download=true',
modelId: 'my-custom-model',
quantization: 'Q4_K_M', // Optional, defaults to 'custom'
)) {
switch (event) {
case DownloadProgressEvent(:final progress):
print('${(progress.progress * 100).toStringAsFixed(1)}%');
case DownloadCompleteEvent():
print('Download complete!');
case DownloadErrorEvent(:final error):
print('Error: $error');
default:
break;
}
}
// Then load the downloaded model
await for (final event in liquidAi.loadModel('my-custom-model', 'Q4_K_M')) {
// Handle load events...
}
Cache Management #
List and manage cached models:
// Check if a specific model is cached (useful for URL-downloaded models)
final isCached = await liquidAi.isModelCached('my-custom-model');
if (!isCached) {
// Download the model...
}
// List all cached models
final cachedModels = await liquidAi.getCachedModels();
for (final manifest in cachedModels) {
print('${manifest.modelSlug} (${manifest.quantizationSlug})');
print(' Path: ${manifest.localModelPath}');
}
// Delete all cached models to free storage
await liquidAi.deleteAllModels();
Model Manifest #
When a model is loaded, you can access extended metadata through the ModelManifest:
await for (final event in liquidAi.loadModel(model.slug, quantization.slug)) {
if (event is LoadCompleteEvent) {
final manifest = event.manifest;
if (manifest != null) {
print('Model: ${manifest.modelSlug}');
print('Quantization: ${manifest.quantizationSlug}');
print('Path: ${manifest.localModelPath}');
// Access via runner as well
print('Runner manifest: ${event.runner.manifest}');
}
}
}
Text Generation #
Simple Generation #
final response = await conversation.generateText('What is the capital of France?');
print(response); // "The capital of France is Paris."
Streaming Generation #
Stream tokens as they're generated for real-time display:
final message = ChatMessage.user('Tell me a story.');
await for (final event in conversation.generateResponse(message)) {
switch (event) {
case GenerationChunkEvent(:final chunk):
stdout.write(chunk); // Print token immediately
case GenerationCompleteEvent(:final stats):
print('\n${stats?.tokensPerSecond?.toStringAsFixed(1)} tokens/sec');
case GenerationErrorEvent(:final error):
print('Error: $error');
default:
break;
}
}
Generation Options #
Fine-tune generation with sampling parameters:
final options = GenerationOptions(
temperature: 0.7, // Creativity (0.0-2.0)
topP: 0.9, // Nucleus sampling
topK: 40, // Top-K sampling
maxTokens: 256, // Maximum output length
);
final response = await conversation.generateText(
'Write a haiku.',
options: options,
);
Structured Output #
Generate JSON that conforms to a schema with automatic validation:
// Define the expected output structure
final recipeSchema = JsonSchema.object('A cooking recipe')
.addString('name', 'The recipe name')
.addArray('ingredients', 'List of ingredients',
items: StringProperty(description: 'An ingredient'))
.addInt('prepTime', 'Preparation time in minutes', minimum: 1)
.addInt('cookTime', 'Cooking time in minutes', minimum: 0)
.addObject('nutrition', 'Nutritional information',
configureNested: (b) => b
.addInt('calories', 'Calories per serving')
.addNumber('protein', 'Protein in grams'))
.build();
// Generate structured output
final message = ChatMessage.user('Give me a recipe for chocolate chip cookies.');
await for (final event in conversation.generateStructured(
message,
schema: recipeSchema,
fromJson: Recipe.fromJson,
)) {
switch (event) {
case StructuredProgressEvent(:final tokenCount):
print('Generating... ($tokenCount tokens)');
case StructuredCompleteEvent<Recipe>(:final result):
print('Recipe: ${result.name}');
print('Ingredients: ${result.ingredients.join(", ")}');
print('Calories: ${result.nutrition.calories}');
case StructuredErrorEvent(:final error, :final rawResponse):
print('Failed: $error');
}
}
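The example above parses the model's JSON into a Recipe object via Recipe.fromJson. That class is your own code, not part of the SDK; a minimal sketch matching the schema defined above, assuming fromJson receives a decoded Map<String, dynamic>, might look like this:
// Plain data classes whose field names mirror the keys declared in recipeSchema.
class Recipe {
  final String name;
  final List<String> ingredients;
  final int prepTime;
  final int cookTime;
  final Nutrition nutrition;

  Recipe({
    required this.name,
    required this.ingredients,
    required this.prepTime,
    required this.cookTime,
    required this.nutrition,
  });

  factory Recipe.fromJson(Map<String, dynamic> json) => Recipe(
        name: json['name'] as String,
        ingredients: (json['ingredients'] as List).cast<String>(),
        prepTime: json['prepTime'] as int,
        cookTime: json['cookTime'] as int,
        nutrition: Nutrition.fromJson(json['nutrition'] as Map<String, dynamic>),
      );
}

class Nutrition {
  final int calories;
  final double protein;

  Nutrition({required this.calories, required this.protein});

  factory Nutrition.fromJson(Map<String, dynamic> json) => Nutrition(
        calories: json['calories'] as int,
        protein: (json['protein'] as num).toDouble(),
      );
}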
Schema Types #
The schema builder supports these property types:
| Method | JSON Type | Options |
|---|---|---|
| addString | string | enumValues, minLength, maxLength |
| addInt | integer | minimum, maximum |
| addNumber | number | minimum, maximum |
| addBool | boolean | - |
| addArray | array | items, minItems, maxItems |
| addObject | object | configureNested |
Function Calling #
Define tools the model can invoke to extend its capabilities:
// Define a function with typed parameters
final searchFunction = LeapFunction.withSchema(
name: 'search_web',
description: 'Search the web for current information',
schema: JsonSchema.object('Search parameters')
.addString('query', 'The search query')
.addInt('limit', 'Maximum results', required: false, minimum: 1, maximum: 10)
.build(),
);
// Register with the conversation
await conversation.registerFunction(searchFunction);
// Handle function calls during generation
await for (final event in conversation.generateResponse(message)) {
switch (event) {
case GenerationFunctionCallEvent(:final functionCalls):
for (final call in functionCalls) {
print('Calling ${call.name} with ${call.arguments}');
// Execute your function
final result = await executeSearch(call.arguments);
// Return the result to continue generation
await conversation.provideFunctionResult(
LeapFunctionResult(callId: call.id, result: result),
);
}
case GenerationChunkEvent(:final chunk):
stdout.write(chunk);
default:
break;
}
}
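The executeSearch call above stands in for your own tool implementation. A hypothetical stub, assuming call.arguments arrives as a decoded Map<String, dynamic>, could look like:
// Hypothetical tool implementation - replace with a real search backend.
Future<String> executeSearch(Map<String, dynamic> arguments) async {
  final query = arguments['query'] as String;
  final limit = (arguments['limit'] as int?) ?? 5;
  // Return a plain string so the model can read the result and continue generating.
  return 'Top $limit results for "$query": ...';
}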
Vision Models #
Analyze images with multimodal vision-language models:
// Load a vision model from the catalog
final visionModel = ModelCatalog.findBySlug('LFM2.5-VL-1.6B')!;
await for (final event in liquidAi.loadModel(
visionModel.slug,
visionModel.defaultQuantization.slug, // Q8_0 for vision models
)) {
if (event is LoadCompleteEvent) {
runner = event.runner;
}
}
// Create a conversation and send an image
final conversation = await runner!.createConversation();
// Load image as JPEG bytes
final imageBytes = await File('photo.jpg').readAsBytes();
final message = ChatMessage(
role: ChatMessageRole.user,
content: [
ImageContent(data: imageBytes),
TextContent(text: 'Describe what you see in this image.'),
],
);
await for (final event in conversation.generateResponse(message)) {
if (event is GenerationChunkEvent) {
stdout.write(event.chunk);
}
}
Model Catalog #
Browse available models programmatically:
// All available (non-deprecated) models
final models = ModelCatalog.available;
// Filter by capability
final visionModels = ModelCatalog.visionModels;
final reasoningModels = ModelCatalog.byTask(ModelTask.reasoning);
final japaneseModels = ModelCatalog.byLanguage('ja');
// Find a specific model
final model = ModelCatalog.findBySlug('LFM2.5-1.2B-Instruct');
if (model != null) {
print('${model.name} - ${model.parameters} parameters');
print('Context: ${model.contextLength} tokens');
// Access available quantizations
for (final quant in model.quantizations) {
print(' ${quant.quantization.name}: ${quant.slug}');
}
// Get the recommended default quantization
print('Default: ${model.defaultQuantization.slug}');
}
Available Models #
| Model | Parameters | Task | Modalities |
|---|---|---|---|
| LFM2.5-1.2B-Instruct | 1.2B | General | Text |
| LFM2.5-1.2B-Thinking | 1.2B | Reasoning | Text |
| LFM2.5-VL-1.6B | 1.6B | General | Text, Image |
| LFM2-2.6B | 2.6B | General | Text |
| LFM2-2.6B-Exp | 2.6B | Reasoning | Text |
| LFM2-VL-3B | 3B | General | Text, Image |
| LFM2-350M | 350M | General | Text |
| LFM2-700M | 700M | General | Text |
See ModelCatalog.all for the complete list including specialized models for extraction, translation, and summarization.
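For a quick inventory, you can iterate the full catalog; a minimal sketch using the model fields shown above:
// Print every catalog entry, including specialized and deprecated models
for (final model in ModelCatalog.all) {
  print('${model.name}: ${model.parameters} parameters, ${model.contextLength} tokens');
}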
Quantization Options #
Models are available in multiple quantization levels via the ModelQuantization enum:
| Enum | Slug | Size | Quality | Use Case |
|---|---|---|---|---|
| ModelQuantization.q4_0 | Q4_0 | Smallest | Good | Mobile devices, fast inference |
| ModelQuantization.q4KM | Q4_K_M | Small | Better | Balanced quality and size |
| ModelQuantization.q5KM | Q5_K_M | Medium | High | Quality-focused applications |
| ModelQuantization.q8_0 | Q8_0 | Large | Highest | Maximum quality |
| ModelQuantization.f16 | F16 | Largest | Reference | Vision models only |
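For example, to prefer the highest-quality build a model publishes while falling back to its recommended default, you can filter its quantization list. A sketch (whether a given model ships a Q8_0 build is an assumption):
// Pick the Q8_0 build if the model ships one, otherwise use its default.
final quant = model.quantizations.firstWhere(
  (q) => q.quantization == ModelQuantization.q8_0,
  orElse: () => model.defaultQuantization,
);
await for (final event in liquidAi.loadModel(model.slug, quant.slug)) {
  // Handle load events...
}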
Error Handling #
Handle errors gracefully with typed exceptions:
try {
final response = await conversation.generateText('...');
} on LiquidAiException catch (e) {
print('SDK error: ${e.message}');
} on StateError catch (e) {
print('Invalid state: ${e.message}'); // e.g., disposed conversation
}
Common error scenarios:
- Model not found - Invalid model slug or quantization
- Download failed - Network issues during model download
- Out of memory - Model too large for device
- Context exceeded - Conversation history too long
- Generation cancelled - User or timeout cancellation
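Load-time failures such as a failed download are reported as a LoadErrorEvent on the load stream (see Model Loading above). A minimal sketch that retries a failed load once (the loadWithRetry helper is hypothetical, not part of the SDK):
// Retry a model load once, e.g. after a transient network failure
// during the initial download.
Future<ModelRunner?> loadWithRetry(
  LiquidAi liquidAi,
  String modelSlug,
  String quantSlug,
) async {
  for (var attempt = 1; attempt <= 2; attempt++) {
    await for (final event in liquidAi.loadModel(modelSlug, quantSlug)) {
      if (event is LoadCompleteEvent) {
        return event.runner;
      }
      if (event is LoadErrorEvent) {
        print('Load attempt $attempt failed: ${event.error}');
        break; // move on to the next attempt
      }
    }
  }
  return null;
}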
Conversation Management #
System Prompts #
Set context for the conversation:
final conversation = await runner.createConversation(
systemPrompt: 'You are a helpful coding assistant. Respond concisely.',
);
Conversation History #
Access and restore conversation state:
// Get current history
final history = await conversation.getHistory();
// Export conversation
final json = await conversation.export();
// Create from existing history
final restored = await runner.createConversationFromHistory(history);
Clear History #
Reset the conversation while keeping it active:
// Clear history but keep the system prompt
await conversation.clearHistory();
// Clear everything including system prompt
await conversation.clearHistory(keepSystemPrompt: false);
Fork Conversations #
Create independent copies for exploring different conversation branches:
// Create a checkpoint before trying something
final checkpoint = await conversation.fork();
// Try something in the original conversation
await conversation.generateText('Tell me about quantum physics');
// Use the checkpoint to explore a different path
await checkpoint.generateText('Tell me about biology');
// Both conversations now have different histories
// Don't forget to dispose the forked conversation when done
await checkpoint.dispose();
Token Counting #
Monitor context usage (iOS only):
final tokens = await conversation.getTokenCount();
if (tokens > 4000) {
print('Warning: Approaching context limit');
}
Resource Management #
Basic Cleanup #
Always dispose of resources when done:
// Dispose in reverse order of creation
await conversation.dispose();
await runner.dispose();
// Or use try/finally (declare the conversation outside the try block
// so it is still in scope in finally)
Conversation? conversation;
try {
conversation = await runner.createConversation();
// Use conversation...
} finally {
await conversation?.dispose();
}
ModelManager for Single-Model Apps #
For apps that load only one model at a time, use ModelManager to automatically manage model lifecycle:
final manager = ModelManager.instance;
// Load a model (automatically unloads any previous model)
final runner = await manager.loadModelAsync('LFM2.5-1.2B-Instruct', 'Q4_K_M');
// Check what's loaded
print('Loaded: ${manager.currentModelSlug}');
print('Has model: ${manager.hasLoadedModel}');
// Load a different model (previous one is automatically unloaded first)
final newRunner = await manager.loadModelAsync('LFM2-2.6B', 'Q4_K_M');
// Explicitly unload when done
await manager.unloadCurrentModel();
Hot-Reload Recovery #
During a Flutter hot restart, Dart state is reset but native state persists. Use syncWithNative() to recover the loaded model state:
// In your app initialization
Future<void> initializeApp() async {
final manager = ModelManager.instance;
// Sync Dart state with native state
final wasModelLoaded = await manager.syncWithNative();
if (wasModelLoaded) {
print('Recovered loaded model: ${manager.currentModelSlug}');
// The runner is available at manager.currentRunner
}
}
This is especially important for state management solutions like Provider:
class AppState extends ChangeNotifier {
final _modelManager = ModelManager.instance;
Future<void> initialize() async {
// Recover model state after hot-reload
await _modelManager.syncWithNative();
if (_modelManager.hasLoadedModel) {
// Update UI state to reflect loaded model
notifyListeners();
}
}
}
Loading Models from Local Paths #
Load models from custom locations (useful for bundled or custom models):
// Using ModelManager
final runner = await ModelManager.instance.loadModelFromPathAsync(
'/path/to/model.gguf',
options: LoadOptions(contextSize: 2048),
);
// Check if loaded from path
if (ModelManager.instance.isCurrentModelPathLoaded) {
print('Model path: ${ModelManager.instance.currentPath}');
}
API Reference #
For complete API documentation, see the API Reference.
Key classes:
- LiquidAi - Main entry point for model management
- ModelRunner - A loaded model ready for inference
- ModelManager - Singleton for single-model lifecycle management
- ModelManifest - Extended metadata for loaded models
- Conversation - Chat session with history
- JsonSchema - Schema builder for structured output
- LeapFunction - Function definition for tool use
- ModelCatalog - Model discovery and filtering
Examples #
For a comprehensive example covering all features, see example/example.dart.
The example/ directory also contains a full Flutter demo app demonstrating:
- Model selection and downloading
- Chat interface with streaming
- Structured output demos
- Function calling examples
- Settings and configuration
Contributing #
Contributions are welcome! Please read our contributing guidelines before submitting a pull request.
License #
MIT License - see the LICENSE file for details.