xybrid_flutter 0.1.0-alpha5
Xybrid Flutter SDK — run ML models on-device or in the cloud with intelligent hybrid routing and streaming inference.
Xybrid Flutter SDK #
Flutter SDK for the Xybrid universal AI inference runtime. Execute AI pipelines on-device with intelligent local/cloud routing.
Installation #
From pub.dev (recommended) #
flutter pub add xybrid_flutter
Or add to your pubspec.yaml:
dependencies:
  xybrid_flutter: ^0.1.0-alpha5
From Git (unreleased changes) #
dependencies:
  xybrid_flutter:
    git:
      url: https://github.com/xybrid-ai/xybrid.git
      ref: main
      path: bindings/flutter
From local path (development) #
dependencies:
  xybrid_flutter:
    path: ../../bindings/flutter
Features #
- Pipeline Execution: Run AI inference pipelines defined in YAML
- Audio Processing: Record audio and run speech-to-text models
- Real-time Streaming: Real-time ASR with XybridStreamer
- Real-time Telemetry: Stream pipeline execution events for monitoring
- Envelope System: Type-safe data passing (Audio, Text, Embedding)
- Device Intelligence: Hardware capabilities detection (GPU, NPU, memory, thermal)
- Cross-platform: Supports iOS, macOS, Android, Linux, and Windows
Platform Requirements #
macOS
- macOS 13.3+ (for ONNX Runtime CoreML)
- Xcode 15+
- Apple Silicon only (M1/M2/M3/M4) - Intel Macs are not supported
- Supports: ONNX Runtime + Candle (Metal acceleration)
iOS
- iOS 13.0+
- Xcode 15+
- Supports: Candle only (Metal acceleration)
- Note: ONNX Runtime requires manual xcframework bundling (see PLAN-IOS-ONNX.md)
Android
- NDK r25+ (r27 recommended)
- minSdkVersion 21
- 64-bit only: arm64-v8a, x86_64 (32-bit excluded for APK size)
- Supports: ONNX Runtime (bundled .so from Microsoft AAR)
Linux / Windows
- Supports: ONNX Runtime (download-binaries)
Rust Toolchain #
# Install Rust if not already installed
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Install required targets
# macOS (Apple Silicon only)
rustup target add aarch64-apple-darwin
# iOS
rustup target add aarch64-apple-ios aarch64-apple-ios-sim
# Android
rustup target add aarch64-linux-android x86_64-linux-android
# Linux (if cross-compiling)
rustup target add x86_64-unknown-linux-gnu
# Windows (if cross-compiling)
rustup target add x86_64-pc-windows-msvc
Quick Start #
Initialize the SDK #
import 'package:flutter/material.dart';
import 'package:xybrid_flutter/xybrid_flutter.dart';

void main() async {
  WidgetsFlutterBinding.ensureInitialized();
  await RustLib.init();

  // Initialize Rust logging - logs will appear in:
  // - Android: Logcat with tag "xybrid"
  // - iOS: Xcode console
  // - Desktop: stderr (terminal)
  initLogging();

  runApp(MyApp());
}
Run a Pipeline #
import 'dart:convert';
import 'package:flutter/services.dart' show rootBundle;

// Load pipeline YAML from assets
final yamlContent = await rootBundle.loadString('assets/pipelines/speech-to-text.yaml');
// Run without input
final result = runPipelineFromYaml(yamlContent: yamlContent, envelope: null);
// Parse result (JSON)
final data = jsonDecode(result);
print('Pipeline: ${data['name']}');
print('Total latency: ${data['total_latency_ms']}ms');
Run with Audio Input #
// Create audio envelope from recorded bytes
final audioBytes = await readAudioFile(recordingPath);
final envelope = Envelope.audio(
  audioBytes: audioBytes,
  sampleRate: 16000,
  channels: 1,
);

// Run pipeline with audio input
final result = runPipelineFromYaml(
  yamlContent: yamlContent,
  envelope: envelope,
);
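Here `readAudioFile` stands in for your own loading code. A minimal sketch, assuming the recording is stored as headerless PCM16 bytes on disk (for WAV files with headers, prefer `AudioEnvelope.fromWav` below):

import 'dart:io';
import 'dart:typed_data';

// Hypothetical helper: loads raw PCM bytes from a file on disk.
// Assumes headerless PCM16 data matching the Envelope.audio parameters above.
Future<Uint8List> readAudioFile(String path) => File(path).readAsBytes();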
AudioEnvelope API (v0.0.7) #
The AudioEnvelope class provides a high-level API for audio processing with automatic format conversion, resampling, and WAV parsing.
From WAV File
import 'dart:io';
import 'package:xybrid_flutter/xybrid_flutter.dart';
// Load WAV file
final wavBytes = await File('recording.wav').readAsBytes();
final audio = AudioEnvelope.fromWav(wavBytes: wavBytes);
// Check audio properties
print('Sample rate: ${audio.sampleRate}Hz');
print('Channels: ${audio.channels}');
print('Duration: ${audio.durationMs}ms');
// Convert to pipeline Envelope and run inference
final envelope = audio.toEnvelope();
final result = runInferenceFromBundle(
  bundlePath: 'models/wav2vec2.xyb',
  envelope: envelope,
);
From Raw Recording (PCM16)
// From mobile audio recorder (e.g., record package)
final audio = AudioEnvelope.fromRecording(
  pcmBytes: recorder.getBytes(), // Raw PCM16 bytes
  sampleRate: 44100,             // Your recording's sample rate
  channels: 2,                   // Stereo recording
);

// Prepare for ASR (converts to 16kHz mono float32)
final prepared = audio.prepareForAsr();

// Run inference
final result = runInferenceFromBundle(
  bundlePath: 'models/wav2vec2.xyb',
  envelope: prepared.toEnvelope(),
);
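If your recorder delivers audio as a byte stream rather than a single buffer, the chunks can be accumulated before building the envelope. A sketch assuming any `Stream<Uint8List>` source of PCM16 chunks (the stream setup itself depends on your recording plugin):

import 'dart:typed_data';
import 'package:xybrid_flutter/xybrid_flutter.dart';

// Collect streamed PCM16 chunks into a single buffer for AudioEnvelope.
Future<AudioEnvelope> envelopeFromPcmStream(
  Stream<Uint8List> pcmChunks, {
  required int sampleRate,
  required int channels,
}) async {
  final builder = BytesBuilder(copy: false);
  await for (final chunk in pcmChunks) {
    builder.add(chunk);
  }
  return AudioEnvelope.fromRecording(
    pcmBytes: builder.takeBytes(),
    sampleRate: sampleRate,
    channels: channels,
  );
}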
From Float32 Samples
// If you already have normalized float samples (-1.0 to 1.0)
final audio = AudioEnvelope.fromSamples(
  samples: floatSamples,
  sampleRate: 16000,
  channels: 1,
);
Audio Transformations
// Convert stereo to mono
final mono = audio.toMono();
// Resample to different rate
final resampled = audio.resample(targetRate: 16000);
// Prepare for ASR (convenience: mono + 16kHz)
final prepared = audio.prepareForAsr();
// Get raw samples for custom processing
final Float32List samples = audio.samples();
// Get format info
final AudioFormat format = audio.format();
print('Format: ${format.formatType}'); // "float32", "pcm16", etc.
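Because samples() exposes the raw float32 data, custom analysis is easy to bolt on. A small sketch (our own helper, not an SDK API) that computes peak and RMS levels, e.g. to skip silent recordings before running inference:

import 'dart:math' as math;
import 'dart:typed_data';

// Compute peak and RMS levels from an AudioEnvelope's float32 samples.
({double peak, double rms}) audioLevels(AudioEnvelope audio) {
  final Float32List s = audio.samples();
  if (s.isEmpty) return (peak: 0, rms: 0);
  var peak = 0.0;
  var sumSquares = 0.0;
  for (final v in s) {
    final a = v.abs();
    if (a > peak) peak = a;
    sumSquares += v * v;
  }
  return (peak: peak, rms: math.sqrt(sumSquares / s.length));
}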
Full Example: Voice Recording to Transcription
import 'dart:typed_data';
import 'package:xybrid_flutter/xybrid_flutter.dart';

Future<String> transcribeRecording(Uint8List pcmBytes) async {
  // 1. Create AudioEnvelope from recording
  final audio = AudioEnvelope.fromRecording(
    pcmBytes: pcmBytes,
    sampleRate: 44100, // Common mobile recording rate
    channels: 2,       // Stereo
  );

  // 2. Prepare for ASR (auto-converts to 16kHz mono)
  final prepared = audio.prepareForAsr();

  // 3. Convert to pipeline Envelope
  final envelope = prepared.toEnvelope();

  // 4. Run inference
  final xybrid = Xybrid.simple();
  final result = xybrid.runInferenceFromBundle(
    'assets/models/wav2vec2.xyb',
    envelope: envelope,
  );

  if (result.success) {
    return result.text ?? '';
  } else {
    throw XybridException(result.error ?? 'Inference failed');
  }
}
Stream Telemetry Events #
// Subscribe to real-time pipeline events
final stream = subscribeTelemetryEvents();
stream.listen((event) {
  print('${event.eventType}: ${event.stageName ?? "pipeline"}');
  if (event.latencyMs != null) {
    print('  Latency: ${event.latencyMs}ms');
  }
});
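Beyond printing, events can be aggregated. A sketch that tallies cumulative per-stage latency from StageComplete events (our own bookkeeping on top of the documented stream; event type strings are listed under Telemetry Events below):

// Track cumulative latency per stage from the telemetry stream.
final stageLatencies = <String, int>{};

subscribeTelemetryEvents().listen((event) {
  if (event.eventType == 'StageComplete' &&
      event.stageName != null &&
      event.latencyMs != null) {
    stageLatencies.update(
      event.stageName!,
      (total) => total + event.latencyMs!,
      ifAbsent: () => event.latencyMs!,
    );
  }
});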
API Reference #
Pipeline Execution #
/// Run a pipeline from YAML content
String runPipelineFromYaml({
  required String yamlContent,
  Envelope? envelope,
});
/// Run a pipeline from file path
String runPipeline(String configPath);
/// Get model cache status
String getCacheStatus();
Envelope Types #
/// Audio envelope (PCM bytes)
Envelope.audio(
  audioBytes: Uint8List,
  sampleRate: int?, // default: 16000
  channels: int?,   // default: 1
);
/// Text envelope
Envelope.text(String text);
/// Embedding envelope
Envelope.embedding(List<double> values);
AudioEnvelope (v0.0.7) #
/// Create from WAV file bytes
AudioEnvelope.fromWav({required List<int> wavBytes});
/// Create from raw PCM16 recording
AudioEnvelope.fromRecording({
  required List<int> pcmBytes,
  required int sampleRate,
  required int channels,
});

/// Create from float32 samples
AudioEnvelope.fromSamples({
  required List<double> samples,
  required int sampleRate,
  required int channels,
});
/// Properties
int get sampleRate; // Sample rate in Hz
int get channels; // Number of channels
BigInt get sampleCount; // Number of samples
double get durationMs; // Duration in milliseconds
AudioFormat format(); // Format info
/// Transformations
AudioEnvelope toMono(); // Convert to mono
AudioEnvelope resample({required int targetRate}); // Resample
AudioEnvelope prepareForAsr(); // Convert to 16kHz mono
/// Conversion
Envelope toEnvelope(); // Convert to pipeline Envelope
Float32List samples(); // Get raw float32 samples
String toJson(); // Export as JSON (debugging)
AudioFormat #
/// Standard audio format specification
AudioFormat.pcm16({required int sampleRate, required int channels});
AudioFormat.float32({required int sampleRate, required int channels});
AudioFormat.asrDefault(); // 16kHz mono float32
Device Capabilities (v0.0.6) #
/// Get current device hardware capabilities
HardwareCapabilities getDeviceCapabilities();
/// Check if a model can be loaded (memory check)
bool canLoadModel(int modelSizeMb);
The HardwareCapabilities class provides:
class HardwareCapabilities {
  bool hasGpu;             // GPU available
  String gpuType;          // metal, vulkan, directx, none
  bool hasMetal;           // iOS/macOS Metal
  bool hasNnapi;           // Android NNAPI
  bool hasNpu;             // Neural Processing Unit
  String npuType;          // coreml, nnapi, directml, none
  int memoryAvailableMb;   // Available RAM
  int memoryTotalMb;       // Total RAM
  int batteryLevel;        // 0-100
  String thermalState;     // normal, warm, hot, critical
  String platform;         // macos, ios, android, linux, windows
  bool shouldThrottle;     // true if battery low or thermal critical

  // v0.0.7 additions
  double cpuUsagePercent;  // Current CPU usage (0-100)
  int cpuCores;            // Number of CPU cores
  String memoryConfidence; // Detection confidence: high, medium, low
  String gpuConfidence;    // Detection confidence: high, medium, low
  String npuConfidence;    // Detection confidence: high, medium, low
}
Example usage:
final caps = getDeviceCapabilities();
print('Platform: ${caps.platform}');
print('GPU: ${caps.hasGpu ? caps.gpuType : "none"}');
print('NPU: ${caps.hasNpu ? caps.npuType : "none"}');
print('Memory: ${caps.memoryAvailableMb}/${caps.memoryTotalMb} MB');
if (caps.shouldThrottle) {
  // Use cloud inference or smaller models
  print('Device is throttled - battery: ${caps.batteryLevel}%, thermal: ${caps.thermalState}');
}
// v0.0.7: Check detection confidence
print('Memory confidence: ${caps.memoryConfidence}'); // high, medium, or low
print('CPU: ${caps.cpuCores} cores, ${caps.cpuUsagePercent}% usage');
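These fields compose naturally into routing logic. A hedged sketch of a local-vs-cloud decision helper built on shouldThrottle and canLoadModel; the heuristics are illustrative, not SDK defaults:

/// Decide whether to run a model on-device. Thresholds are illustrative.
bool shouldRunLocally(int modelSizeMb) {
  final caps = getDeviceCapabilities();
  // Back off to cloud when battery is low or the device is hot.
  if (caps.shouldThrottle) return false;
  // Skip local inference if the model would not fit in available memory.
  if (!canLoadModel(modelSizeMb)) return false;
  // Prefer local execution when hardware acceleration is available.
  return caps.hasGpu || caps.hasNpu;
}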
Session Metrics (v0.0.6) #
/// Get current session metrics
SessionMetrics getSessionMetrics();
/// Export telemetry as JSON for backend ingestion
String exportTelemetryJson();
/// Reset session and start fresh
void resetSession();
/// Record an inference call (usually automatic)
void recordInference(String modelId, String version, int latencyMs);
/// Record an error (usually automatic)
void recordError(String? modelId, String error);
/// End session and get final export
String endSession();
The SessionMetrics class provides:
class SessionMetrics {
  String sessionId;        // Unique session ID
  String deviceId;         // Device identifier
  int startedAt;           // Session start (ms since epoch)
  int endedAt;             // Session end (0 if active)
  String sdkVersion;       // SDK version
  int totalInferences;     // Total inference calls
  int totalLatencyMs;      // Cumulative latency
  int avgLatencyMs;        // Average latency
  List<String> modelsUsed; // Models used in session
  int errorCount;          // Total errors
  int durationMs;          // Session duration
  bool isActive;           // Is session active
}
Example usage:
// Get current metrics
final metrics = getSessionMetrics();
print('Inferences: ${metrics.totalInferences}');
print('Avg Latency: ${metrics.avgLatencyMs}ms');
print('Models: ${metrics.modelsUsed.join(", ")}');
// Export for analytics
final json = exportTelemetryJson();
await http.post(analyticsEndpoint, body: json);
// On app exit
final finalReport = endSession();
await sendToAnalytics(finalReport);
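To make that final export reliable, one option is to tie endSession() to the app lifecycle. A sketch using WidgetsBindingObserver, reusing the sendToAnalytics helper from above (note that detached is not guaranteed to fire on every platform):

import 'package:flutter/widgets.dart';

// Flush session telemetry when the app is shutting down.
class TelemetryLifecycleObserver with WidgetsBindingObserver {
  @override
  void didChangeAppLifecycleState(AppLifecycleState state) {
    if (state == AppLifecycleState.detached) {
      sendToAnalytics(endSession());
    }
  }
}

// Register once at startup, after RustLib.init():
// WidgetsBinding.instance.addObserver(TelemetryLifecycleObserver());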
Telemetry Events #
class TelemetryEvent {
  String eventType;  // PipelineStart, StageComplete, etc.
  String? stageName; // Stage name if applicable
  String? target;    // local, cloud, or fallback
  int? latencyMs;    // Execution time
  String? error;     // Error message if failed
  String? data;      // Additional JSON data
  int timestampMs;   // Unix timestamp
}
Event types:
- PipelineStart / PipelineComplete
- StageStart / StageComplete
- RoutingDecided
- StageError / ExecutionFailed
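For example, to surface only failures from the stream, filter on the error event types and forward them to the logging bridge described under Logging:

// Listen only for failure events.
subscribeTelemetryEvents().listen((event) {
  if (event.eventType == 'StageError' || event.eventType == 'ExecutionFailed') {
    logError(
      message: 'Pipeline failure in ${event.stageName ?? "pipeline"}: '
          '${event.error ?? "unknown error"}',
    );
  }
});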
Example App #
The example app demonstrates all SDK features:
cd xybrid_flutter/example
flutter run -d macos # or -d ios, -d android
Features:
- Pipeline selection (Speech-to-Text, Hiiipe)
- Audio recording with mic input
- Real-time telemetry viewer
- Device capabilities display (GPU, NPU, memory, thermal)
- Cache status monitoring
Pipeline YAML Format #
name: "Speech-to-Text Demo"
registry:
url: "http://0.0.0.0:8080"
stages:
- "whisper-tiny@1.2"
input:
kind: "AudioRaw"
metrics:
network_rtt: 50
battery: 80
temperature: 22.0
availability:
"whisper-tiny@1.2": true
Architecture #
xybrid_flutter/
├── rust/                           # Rust FFI layer
│   └── src/api/
│       ├── xybrid.rs               # Pipeline execution API
│       ├── envelope.rs             # Input data types
│       ├── logging.rs              # Rust-to-Flutter logging bridge
│       └── telemetry.rs            # Event streaming
├── lib/                            # Generated Dart bindings
│   └── src/rust/                   # flutter_rust_bridge output
└── example/                        # Demo application
    ├── lib/main.dart               # Full-featured demo
    ├── macos/Runner/*.entitlements # macOS sandbox permissions
    └── assets/pipelines/           # Sample YAML configs
Building #
The SDK uses flutter_rust_bridge v2.11.1 for Rust-Dart interop.
# Regenerate bindings (if Rust API changes)
cd xybrid_flutter
flutter_rust_bridge_codegen generate
# Build for development
cd example
flutter build macos --debug
Logging #
The SDK includes a Rust-to-Flutter logging bridge. Initialize it at app startup:
// Initialize after RustLib.init()
initLogging();
// Test logging (optional)
testLogging();
// Log from Dart to Rust logger
logInfo(message: "App started");
logDebug(message: "Debug info");
logWarn(message: "Warning");
logError(message: "Error occurred");
Logs appear in:
- Android: Logcat with tag xybrid
- iOS: Xcode console via oslog
- Desktop: stderr (visible in terminal)
macOS Entitlements #
The example app includes required entitlements for macOS sandbox:
<!-- DebugProfile.entitlements and Release.entitlements -->
<key>com.apple.security.network.client</key>
<true/> <!-- Required for HTTP requests to registry -->
<key>com.apple.security.network.server</key>
<true/> <!-- Required for local server connections -->
<key>com.apple.security.device.audio-input</key>
<true/> <!-- Required for microphone recording -->
Model Support by Platform #
| Model | Type | macOS | iOS | Android | Linux/Windows |
|---|---|---|---|---|---|
| Whisper (Candle) | ASR | ✅ Metal | ✅ Metal | ❌ | ✅ CPU |
| Wav2Vec2 (ONNX) | ASR | ✅ | ❌ | ✅ | ✅ |
| Kokoro-82M (ONNX) | TTS | ✅ | ❌ | ✅ | ✅ |
| KittenTTS (ONNX) | TTS | ✅ | ❌ | ✅ | ✅ |
Notes:
- iOS currently supports Candle models only (no ONNX Runtime prebuilt binaries)
- Android uses bundled ONNX Runtime .so from Microsoft AAR
- macOS supports both ONNX and Candle runtimes
Notes #
- ASR Models: Whisper (via Candle) and Wav2Vec2 (via ONNX) are both supported. Whisper ONNX has decoder issues (see docs/archive/).
- TTS Models: Kokoro-82M and KittenTTS work on all ONNX-supported platforms.
- iOS Limitation: ONNX models require bundling xcframework manually. See PLAN-IOS-ONNX.md for options.
- macOS Intel: Intel Macs (x86_64) are not supported. ONNX Runtime does not provide prebuilt binaries for x86_64-apple-darwin.
- macOS Version Warnings: Linker warnings about macOS version mismatches are expected due to ONNX Runtime build configuration.
- Registry URL: Use http://localhost:8080 for local development. The app needs the network.client entitlement to connect.