Xybrid Flutter SDK
Run ML models on-device or in the cloud with intelligent hybrid routing and streaming inference.
Installation
flutter pub add xybrid_flutter
Or add to your pubspec.yaml:
dependencies:
  xybrid_flutter: ^0.1.0
Alternative installation (git / local path)
From git (unreleased changes):
dependencies:
  xybrid_flutter:
    git:
      url: https://github.com/xybrid-ai/xybrid.git
      ref: main
      path: bindings/flutter
Quick Start
import 'package:flutter/widgets.dart'; // provides WidgetsFlutterBinding
import 'package:xybrid_flutter/xybrid_flutter.dart';

Future<void> main() async {
  WidgetsFlutterBinding.ensureInitialized();
  await Xybrid.init();

  // Load a TTS model from the registry
  final model = await XybridModelLoader.fromRegistry('kokoro-82m').load();

  // Run text-to-speech
  final result = await model.run(XybridEnvelope.text('Hello from Xybrid!'));
  print('Audio: ${result.audioBytes?.length} bytes');
}
Features
Model Loading
Load models from the Xybrid registry or local bundles:
// From registry (downloads and caches automatically)
final registryModel = await XybridModelLoader.fromRegistry('kokoro-82m').load();

// From local bundle
final bundleModel = await XybridModelLoader.fromBundle('path/to/model.xyb').load();

// Check if already cached
if (Xybrid.isModelCached('kokoro-82m')) {
  print('Model ready, no download needed');
}
Download Progress
Track model downloads with progress events:
final loader = XybridModelLoader.fromRegistry('kokoro-82m');
await for (final event in loader.loadWithProgress()) {
  switch (event) {
    case LoadProgress(:final progress):
      print('Downloading: ${(progress * 100).toInt()}%');
    case LoadComplete():
      print('Model ready!');
    case LoadError(:final message):
      print('Error: $message');
  }
}
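Progress events can arrive far more often than a UI or log needs to refresh. The sketch below reports a whole-number percentage only when it changes. `PercentThrottle` is a hypothetical helper written for this example, not part of the SDK, and it assumes `progress` is a fraction between 0.0 and 1.0, as the `progress * 100` expression above suggests.

```dart
/// Hypothetical helper (not part of xybrid_flutter): turns a stream of
/// fractional progress values into whole-percent updates without repeats.
class PercentThrottle {
  int? _last;

  /// Returns the percentage (0-100) when it differs from the last value
  /// returned, otherwise null. Out-of-range input is clamped to 0.0..1.0.
  int? update(double fraction) {
    if (fraction.isNaN) return null; // ignore malformed progress values
    final percent = (fraction.clamp(0.0, 1.0) * 100).floor();
    if (percent == _last) return null;
    _last = percent;
    return percent;
  }
}
```

Inside the `LoadProgress` case, call `update(progress)` on a `PercentThrottle` created before the loop and print only when the result is non-null.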
Input Envelopes
Type-safe inputs for different model types:
// Text (for TTS or LLM)
final textInput = XybridEnvelope.text('Hello world');
// Text with TTS voice selection
final ttsInput = XybridEnvelope.text('Hello', voiceId: 'af_heart', speed: 1.0);
// Audio (for ASR / speech-to-text)
final audioInput = XybridEnvelope.audio(
  bytes: wavBytes,
  sampleRate: 16000,
  channels: 1,
);
// Embedding vector
final embeddingInput = XybridEnvelope.embedding([0.1, 0.2, 0.3]);
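The audio example above takes `wavBytes` as given. If the recorder delivers raw 16-bit PCM samples instead, one way to obtain WAV bytes is to prepend a standard 44-byte RIFF/WAVE header. This is a minimal sketch: `pcm16ToWav` is a hypothetical helper, and it assumes the envelope expects a WAV container rather than raw PCM, which the variable name implies but this README does not state.

```dart
import 'dart:typed_data';

/// Hypothetical helper: wraps 16-bit PCM samples in a minimal RIFF/WAVE
/// container (44-byte header followed by little-endian sample data).
Uint8List pcm16ToWav(
  Int16List samples, {
  int sampleRate = 16000,
  int channels = 1,
}) {
  const headerSize = 44;
  const bytesPerSample = 2;
  final dataSize = samples.length * bytesPerSample;
  final bytes = Uint8List(headerSize + dataSize);
  final view = ByteData.sublistView(bytes);

  // Writes a four-character ASCII chunk tag at the given offset.
  void writeTag(int offset, String tag) {
    for (var i = 0; i < tag.length; i++) {
      bytes[offset + i] = tag.codeUnitAt(i);
    }
  }

  writeTag(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, Endian.little); // remaining file size
  writeTag(8, 'WAVE');
  writeTag(12, 'fmt ');
  view.setUint32(16, 16, Endian.little); // fmt chunk size
  view.setUint16(20, 1, Endian.little); // format 1 = uncompressed PCM
  view.setUint16(22, channels, Endian.little);
  view.setUint32(24, sampleRate, Endian.little);
  view.setUint32(28, sampleRate * channels * bytesPerSample, Endian.little);
  view.setUint16(32, channels * bytesPerSample, Endian.little); // block align
  view.setUint16(34, 16, Endian.little); // bits per sample
  writeTag(36, 'data');
  view.setUint32(40, dataSize, Endian.little);

  for (var i = 0; i < samples.length; i++) {
    view.setInt16(headerSize + i * bytesPerSample, samples[i], Endian.little);
  }
  return bytes;
}
```

The result can then be passed as `bytes:` with matching `sampleRate` and `channels` values.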
Inference Results
final result = await model.run(envelope);
if (result.success) {
  // Text output (ASR transcription or LLM response)
  print(result.text);

  // Audio output (TTS): get as WAV for playback
  final wav = result.audioAsWav(sampleRate: 24000, channels: 1);

  // Embedding output
  print(result.embedding);

  // Inference timing
  print('Latency: ${result.latencyMs}ms');
}
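An embedding is usually compared against other embeddings rather than printed. The sketch below computes cosine similarity between two vectors. It is plain Dart and independent of the SDK; it assumes `result.embedding` can be treated as a `List<double>`, which this README does not confirm.

```dart
import 'dart:math' as math;

/// Cosine similarity of two equal-length vectors, in the range -1.0..1.0.
/// Returns 0.0 when either vector has zero magnitude.
double cosineSimilarity(List<double> a, List<double> b) {
  if (a.length != b.length) {
    throw ArgumentError('Vectors must have the same length');
  }
  var dot = 0.0;
  var normA = 0.0;
  var normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA == 0 || normB == 0) return 0.0;
  return dot / (math.sqrt(normA) * math.sqrt(normB));
}
```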
LLM Streaming
Stream tokens in real-time as the LLM generates:
final model = await XybridModelLoader.fromRegistry('qwen-2.5-0.5b').load();
// stdout is provided by dart:io
await for (final token in model.runStreaming(XybridEnvelope.text('What is ML?'))) {
  stdout.write(token.token);
  if (token.isFinal) {
    print('\n--- Done (${token.finishReason}) ---');
  }
}
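Callers often need the complete response as well as the live tokens, for example to store it in a chat history. A minimal sketch of that accumulation follows. `collectText` is a hypothetical helper that works on any `Stream<String>`, so it makes no assumption about the SDK beyond `token.token` being a string.

```dart
/// Hypothetical helper: concatenates streamed text pieces and returns the
/// full text once the stream closes. [onUpdate], if given, receives the
/// text accumulated so far after every piece (useful for refreshing a UI).
Future<String> collectText(
  Stream<String> pieces, {
  void Function(String soFar)? onUpdate,
}) async {
  final buffer = StringBuffer();
  await for (final piece in pieces) {
    buffer.write(piece);
    onUpdate?.call(buffer.toString());
  }
  return buffer.toString();
}
```

With the streaming call above, the input would be the token stream mapped to its text, for example `model.runStreaming(input).map((t) => t.token)`.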
Conversation Memory
Multi-turn LLM conversations with automatic history management:
final model = await XybridModelLoader.fromRegistry('qwen-2.5-0.5b').load();
final context = ConversationContext();
context.setSystem('You are a helpful assistant.');
// Turn 1
context.pushText('What is Rust?', MessageRole.user);
final result = await model.runWithContext(
  XybridEnvelope.text('What is Rust?'),
  context,
);
context.pushText(result.text!, MessageRole.assistant);

// Turn 2: the model remembers Turn 1
context.pushText('How does it compare to Go?', MessageRole.user);
final result2 = await model.runWithContext(
  XybridEnvelope.text('How does it compare to Go?'),
  context,
);
Streaming with context is also supported:
await for (final token in model.runStreamingWithContext(envelope, context)) {
  stdout.write(token.token);
}
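The API reference describes `ConversationContext` as keeping history with FIFO pruning. To illustrate the general idea only, the sketch below keeps a system prompt fixed and drops the oldest messages once a limit is reached. `BoundedHistory`, `Role`, `Message`, and `maxMessages` are hypothetical names for this example; the SDK's actual pruning policy and limits are not documented here and may differ.

```dart
enum Role { system, user, assistant }

class Message {
  const Message(this.role, this.text);
  final Role role;
  final String text;
}

/// Illustrative only: a history that retains the system prompt and evicts
/// the oldest non-system messages first (FIFO) once [maxMessages] is exceeded.
class BoundedHistory {
  BoundedHistory({required this.maxMessages}) {
    if (maxMessages <= 0) {
      throw ArgumentError.value(maxMessages, 'maxMessages', 'must be positive');
    }
  }

  final int maxMessages;
  Message? _system;
  final List<Message> _messages = [];

  void setSystem(String text) => _system = Message(Role.system, text);

  void push(Role role, String text) {
    _messages.add(Message(role, text));
    while (_messages.length > maxMessages) {
      _messages.removeAt(0); // drop the oldest message first
    }
  }

  /// System prompt (if any) followed by the retained messages, oldest first.
  List<Message> get messages {
    final system = _system;
    return [if (system != null) system, ..._messages];
  }
}
```

Keeping the system prompt outside the pruned list means long conversations lose early turns but not their instructions.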
Pipeline Execution
Run multi-stage ML pipelines from YAML:
final pipeline = XybridPipeline.fromYaml('''
name: "Speech-to-Text"
stages:
- "whisper-tiny@1.0"
''');
print('Pipeline: ${pipeline.name}, ${pipeline.stageCount} stages');
final result = await pipeline.run(XybridEnvelope.audio(
  bytes: audioBytes,
  sampleRate: 16000,
));
print('Transcription: ${result.text}');
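The stage entry `whisper-tiny@1.0` reads as a model name followed by a version. When building pipeline YAML programmatically, it can help to split such references for logging or validation. `parseStageRef` below is a hypothetical helper, and the `name@version` interpretation is inferred from the example above; whether the SDK accepts unversioned references is not stated.

```dart
/// Hypothetical helper: splits a stage reference such as "whisper-tiny@1.0"
/// into its name and optional version. References without a usable "@"
/// separator are returned as a name with a null version.
({String name, String? version}) parseStageRef(String ref) {
  final at = ref.lastIndexOf('@');
  if (at <= 0) return (name: ref, version: null);
  final version = ref.substring(at + 1);
  return (
    name: ref.substring(0, at),
    version: version.isEmpty ? null : version,
  );
}
```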
Platform Support
| Platform | ONNX Runtime | Candle | LLM (llama.cpp) | Notes |
|---|---|---|---|---|
| macOS | ✅ | ✅ Metal | ✅ | Apple Silicon only (M1+) |
| iOS | ✅ CoreML | ✅ Metal | ✅ | arm64, downloads ORT from HuggingFace |
| Android | ✅ | — | ✅ | arm64-v8a, x86_64; ORT from Maven Central |
| Linux | ✅ | ✅ CPU | ✅ | x86_64 |
| Windows | ✅ | ✅ CPU | ✅ | x86_64 |
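Because Candle is not listed for Android in the table above, an app that prefers a Candle-based model may want a runtime check before choosing one. The sketch below encodes that single column as a lookup. The function names are hypothetical, and the set mirrors the table at the time of writing rather than anything the SDK reports.

```dart
import 'dart:io' show Platform;

// Platforms with a Candle entry in the support table above.
const _candlePlatforms = {'macos', 'ios', 'linux', 'windows'};

/// True when the table lists Candle for [operatingSystem], using the
/// lowercase names returned by Platform.operatingSystem.
bool candleListedFor(String operatingSystem) =>
    _candlePlatforms.contains(operatingSystem);

/// Convenience wrapper for the platform the app is currently running on.
bool candleListedForCurrentPlatform() =>
    candleListedFor(Platform.operatingSystem);
```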
Model Support
| Model | Type | All Platforms |
|---|---|---|
| Kokoro-82M | TTS | ✅ |
| KittenTTS Nano | TTS | ✅ |
| Whisper Tiny (Candle) | ASR | ✅ |
| Wav2Vec2 (ONNX) | ASR | ✅ |
| Qwen 2.5 0.5B | LLM | ✅ |
Platform Requirements
- macOS: 13.3+, Xcode 15+, Apple Silicon
- iOS: 13.0+, Xcode 15+
- Android: minSdk 21, NDK r25+, 64-bit only
- Linux/Windows: x86_64
Native Libraries
Native ML runtimes are resolved automatically at build time:
- Android: ONNX Runtime pulled from Maven Central (com.microsoft.onnxruntime:onnxruntime-android)
- iOS: ONNX Runtime xcframework downloaded from HuggingFace and cached at ~/.xybrid/cache/ort-ios/
- macOS/Linux/Windows: ONNX Runtime downloaded by the ort Rust crate at compile time
Precompiled Rust binaries are available for all platforms via cargokit, so most users do not need a Rust toolchain.
Example App
A full example app with 8 demo screens (TTS, ASR, LLM chat, pipelines, device info) is available:
https://github.com/xybrid-ai/xybrid/tree/main/examples/flutter
API Reference
| Class | Purpose |
|---|---|
| Xybrid | SDK initialization, cache checking, factory methods |
| XybridModelLoader | Load models from registry or local bundle |
| XybridModel | Run inference (batch, streaming, with context) |
| XybridEnvelope | Type-safe inputs: audio, text, embedding |
| XybridResult | Inference output: text, audio, embedding, latency |
| StreamToken | Individual LLM token during streaming |
| ConversationContext | Multi-turn conversation history with FIFO pruning |
| XybridPipeline | Multi-stage pipeline execution from YAML |
| MessageRole | Enum: system, user, assistant |
| LoadEvent | Download progress events: LoadProgress, LoadComplete, LoadError |
Full API documentation: pub.dev/documentation/xybrid_flutter
License
Apache 2.0 — see LICENSE