llm_runner 0.1.0
LLM runner for mobile applications
# LLM Runner - Run AI Models in Flutter with Rust
LLM Runner is a Rust-powered library for running local AI models (like TinyLlama, Phi-1.5) in Flutter apps. It handles model downloading, loading, and inference with a simple API.
## Features
- Multiple Models: Support for TinyLlama, Phi-1.5, and more
- Automatic Downloads: Models are downloaded automatically when needed
- Local Execution: All processing happens on device
- Memory Efficient: Models are loaded/unloaded as needed
- Simple API: Just a few lines to get started
## Installation
Add to your `pubspec.yaml`:

```yaml
dependencies:
  llm_runner:
    git: https://github.com/yourusername/rust_llm_runner.git
```
## Quick Start
```dart
import 'package:llm_runner/llm_runner.dart';

// Use a pre-configured model
final response = await LlmRunner.generateText(
  model: Models.tinyllama, // Small, fast model
  prompt: "Tell me a joke",
);

// Switch to a more powerful model
final mathResponse = await LlmRunner.generateText(
  model: Models.mistral7b, // Better at complex tasks
  prompt: "Explain quantum computing",
);

// Use your own custom model
final customModel = Models.custom(
  name: 'deepseek-ai/deepseek-math-7b-instruct',
  minRamMb: 8192,
  description: 'Specialized for mathematics',
);

final mathResult = await LlmRunner.generateText(
  model: customModel,
  prompt: "Solve: ∫x²dx",
);
```
## Available Models
### Small Models (4GB+ RAM)
- `Models.tinyllama` - Fast, lightweight
- `Models.phi2` - Good at coding
- `Models.gemma2b` - Google's efficient model
### Medium Models (6GB+ RAM)
- `Models.llama32_3b` - Latest Llama 3.2
- `Models.mistral7b` - Powerful open-source model
### Large Models (8GB+ RAM)
- `Models.qwen7b` - High-quality multilingual model
### Custom Models
Use any compatible model:
```dart
final myModel = Models.custom(
  name: 'organization/model-name',
  minRamMb: 6144,
  description: 'My custom model',
  metadata: {
    'type': 'instruct',
    'language': 'multilingual',
  },
);
```
## Model Compatibility
Models should be:
- GGUF format compatible
- Within device memory constraints
- Properly structured (tokenizer, weights, etc.)
See MODELS.md for a full list of tested models.
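Because every model (including custom ones) declares a `minRamMb`, you can guard against loading a model that exceeds the device's memory before calling `generateText`. The sketch below assumes a model type exposing `name` and `minRamMb` as shown above; the `deviceRamMb()` helper is hypothetical — llm_runner does not provide it, so you would obtain device memory via a platform channel or a device-info plugin:

```dart
// Sketch: refuse to run models whose declared minimum RAM exceeds the device's.
// `deviceRamMb()` is a placeholder, not part of llm_runner.
Future<String> safeGenerate(model, String prompt) async {
  final availableMb = await deviceRamMb(); // hypothetical helper
  if (model.minRamMb > availableMb) {
    throw StateError(
        '${model.name} needs ${model.minRamMb} MB RAM, only $availableMb MB available');
  }
  return LlmRunner.generateText(model: model, prompt: prompt);
}
```

Checking up front avoids a failed (and potentially slow) model download and load on devices that cannot run the model anyway.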
## Advanced Usage
### Model Switching
Models are automatically downloaded and loaded as needed:
```dart
// Use TinyLlama
var response = await LlmRunner.generateText(
  model: Models.tinyllama,
  prompt: "Tell me a story",
);

// Switch to Phi-1.5
response = await LlmRunner.generateText(
  model: Models.phi15,
  prompt: "Solve: x^2 = 16",
);
```
### Error Handling
```dart
try {
  final response = await LlmRunner.generateText(
    model: Models.tinyllama,
    prompt: "Hello!",
  );
  print(response);
} catch (e) {
  print('Error: $e');
}
```
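One pattern this enables, assuming `generateText` throws when a model cannot be loaded (for example, from insufficient memory), is falling back to a smaller model on failure. The function below is a sketch using only the constants shown earlier:

```dart
// Sketch: try a larger model first, fall back to TinyLlama if it fails.
Future<String> generateWithFallback(String prompt) async {
  try {
    return await LlmRunner.generateText(
      model: Models.mistral7b,
      prompt: prompt,
    );
  } catch (e) {
    // Fall back to the smallest bundled model.
    return LlmRunner.generateText(
      model: Models.tinyllama,
      prompt: prompt,
    );
  }
}
```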
## How It Works
1. **Model Management**: The library automatically handles:
   - Model downloading
   - Loading into memory
   - Efficient switching between models
   - Memory cleanup

2. **Performance**:
   - ~50ms per token generation (~20 tokens per second), so a 100-token response takes roughly 5 seconds
   - Automatic memory management
## Requirements
- Flutter 3.0 or higher
- iOS 11+ or Android API level 21+ (Android 5.0)
- ~500MB free storage per model
- RAM according to the chosen model's requirements (4GB+ for small models; see Available Models)
## Contributing
Contributions welcome! See CONTRIBUTING.md for guidelines.
## License
MIT License - see LICENSE for details