Xybrid Flutter SDK

Run LLMs, ASR, and TTS natively in Flutter apps — private, offline, no cloud required.


Installation

flutter pub add xybrid_flutter

Or add to your pubspec.yaml:

dependencies:
  xybrid_flutter: ^0.1.0

Alternative installation (git / local path)

From git (unreleased changes):

dependencies:
  xybrid_flutter:
    git:
      url: https://github.com/xybrid-ai/xybrid.git
      ref: main
      path: bindings/flutter
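
From a local checkout, a path dependency works too — a sketch for local development (the relative path below is illustrative; point it at the bindings/flutter directory of your clone):

```yaml
dependencies:
  xybrid_flutter:
    # Illustrative path — adjust to where you cloned the xybrid repo
    path: ../xybrid/bindings/flutter
```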

Quick Start

import 'package:flutter/widgets.dart';
import 'package:xybrid_flutter/xybrid_flutter.dart';

Future<void> main() async {
  WidgetsFlutterBinding.ensureInitialized();
  await Xybrid.init();

  // Load a TTS model from the registry
  final model = await XybridModelLoader.fromRegistry('kokoro-82m').load();

  // Run text-to-speech
  final result = await model.run(XybridEnvelope.text('Hello from Xybrid!'));
  print('Audio: ${result.audioBytes?.length} bytes');
}

Features

Model Loading

Load models from the Xybrid registry or local bundles:

// From registry (downloads + caches automatically)
final model = await XybridModelLoader.fromRegistry('kokoro-82m').load();

// From local bundle
final model = await XybridModelLoader.fromBundle('path/to/model.xyb').load();

// Check if already cached
if (Xybrid.isModelCached('kokoro-82m')) {
  print('Model ready, no download needed');
}

Download Progress

Track model downloads with progress events:

final loader = XybridModelLoader.fromRegistry('kokoro-82m');

await for (final event in loader.loadWithProgress()) {
  switch (event) {
    case LoadProgress(:final progress):
      print('Downloading: ${(progress * 100).toInt()}%');
    case LoadComplete():
      print('Model ready!');
    case LoadError(:final message):
      print('Error: $message');
  }
}

Input Envelopes

Type-safe inputs for different model types:

// Text (for TTS or LLM)
final textInput = XybridEnvelope.text('Hello world');

// Text with TTS voice selection
final ttsInput = XybridEnvelope.text('Hello', voiceId: 'af_heart', speed: 1.0);

// Audio (for ASR / speech-to-text)
final audioInput = XybridEnvelope.audio(
  bytes: wavBytes,
  sampleRate: 16000,
  channels: 1,
);

// Embedding vector
final embeddingInput = XybridEnvelope.embedding([0.1, 0.2, 0.3]);

Inference Results

final result = await model.run(envelope);

if (result.success) {
  // Text output (ASR transcription or LLM response)
  print(result.text);

  // Audio output (TTS) — get as WAV for playback
  final wav = result.audioAsWav(sampleRate: 24000, channels: 1);

  // Embedding output
  print(result.embedding);

  // Inference timing
  print('Latency: ${result.latencyMs}ms');
}
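
To play TTS output through an audio player plugin, one option is to write the WAV bytes to a file first and hand the player a path. A minimal sketch, assuming the result carries audio — the file path and the null check are illustrative, not part of the SDK:

```dart
import 'dart:io';

import 'package:xybrid_flutter/xybrid_flutter.dart';

/// Writes TTS audio to [path] so any player plugin can stream it from disk.
/// Returns null when the result contains no audio.
Future<File?> saveTtsWav(XybridResult result, String path) async {
  final wav = result.audioAsWav(sampleRate: 24000, channels: 1);
  if (wav == null) return null; // no audio in this result
  return File(path).writeAsBytes(wav);
}
```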

LLM Streaming

Stream tokens in real-time as the LLM generates:

final model = await XybridModelLoader.fromRegistry('qwen-2.5-0.5b').load();

await for (final token in model.runStreaming(XybridEnvelope.text('What is ML?'))) {
  stdout.write(token.token);

  if (token.isFinal) {
    print('\n--- Done (${token.finishReason}) ---');
  }
}

Conversation Memory

Multi-turn LLM conversations with automatic history management:

final model = await XybridModelLoader.fromRegistry('qwen-2.5-0.5b').load();
final context = ConversationContext();
context.setSystem('You are a helpful assistant.');

// Turn 1
context.pushText('What is Rust?', MessageRole.user);
final result = await model.runWithContext(
  XybridEnvelope.text('What is Rust?'),
  context,
);
context.pushText(result.text!, MessageRole.assistant);

// Turn 2 — the model remembers Turn 1
context.pushText('How does it compare to Go?', MessageRole.user);
final result2 = await model.runWithContext(
  XybridEnvelope.text('How does it compare to Go?'),
  context,
);

Streaming with context is also supported:

await for (final token in model.runStreamingWithContext(envelope, context)) {
  stdout.write(token.token);
}

Pipeline Execution

Run multi-stage ML pipelines from YAML:

final pipeline = XybridPipeline.fromYaml('''
name: "Speech-to-Text"
stages:
  - "whisper-tiny@1.0"
''');

print('Pipeline: ${pipeline.name}, ${pipeline.stageCount} stages');

final result = await pipeline.run(XybridEnvelope.audio(
  bytes: audioBytes,
  sampleRate: 16000,
));
print('Transcription: ${result.text}');

Platform Support

| Platform | ONNX Runtime / Candle / LLM (llama.cpp) | Notes |
|----------|-----------------------------------------|-------|
| macOS    | ✅ Metal | Apple Silicon only (M1+) |
| iOS      | ✅ CoreML, ✅ Metal | arm64; downloads ORT from HuggingFace |
| Android  | | arm64-v8a, x86_64; ORT from Maven Central |
| Linux    | ✅ CPU | x86_64 |
| Windows  | ✅ CPU | x86_64 |

Model Support

| Model | Type |
|-------|------|
| Kokoro 82M | TTS |
| KittenTTS Nano | TTS |
| Whisper Tiny (Candle) | ASR |
| Wav2Vec2 (ONNX) | ASR |
| SmolLM2 360M | LLM |
| Qwen 2.5 0.5B | LLM |
| Qwen 3.5 0.8B | LLM |
| Qwen 3.5 2B | LLM |
| Gemma 3 1B | LLM |
| Llama 3.2 1B | LLM |

Platform Requirements

  • macOS: 13.3+, Xcode 15+, Apple Silicon
  • iOS: 13.0+, Xcode 15+
  • Android: minSdk 21, NDK r25+, 64-bit only
  • Linux/Windows: x86_64
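
If your app was generated from an older Flutter template, its minSdk may be below 21. A sketch of the bump in android/app/build.gradle — only the minSdkVersion line matters here; the rest of your android block stays as generated:

```groovy
android {
    defaultConfig {
        // Xybrid requires API 21+ and ships 64-bit ABIs only
        minSdkVersion 21
    }
}
```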

Native Libraries

Native ML runtimes are resolved automatically at build time:

  • Android: ONNX Runtime pulled from Maven Central (com.microsoft.onnxruntime:onnxruntime-android)
  • iOS: ONNX Runtime xcframework downloaded from HuggingFace and cached at ~/.xybrid/cache/ort-ios/
  • macOS/Linux/Windows: ONNX Runtime downloaded by the ort Rust crate at compile time

Precompiled Rust binaries are available for all platforms via cargokit — no Rust toolchain required for most users.

Example App

A full example app with 8 demo screens (TTS, ASR, LLM chat, pipelines, device info) is available:

https://github.com/xybrid-ai/xybrid/tree/main/examples/flutter

API Reference

| Class | Purpose |
|-------|---------|
| Xybrid | SDK initialization, cache checking, factory methods |
| XybridModelLoader | Load models from registry or local bundle |
| XybridModel | Run inference (batch, streaming, with context) |
| XybridEnvelope | Type-safe inputs: audio, text, embedding |
| XybridResult | Inference output: text, audio, embedding, latency |
| StreamToken | Individual LLM token during streaming |
| ConversationContext | Multi-turn conversation history with FIFO pruning |
| XybridPipeline | Multi-stage pipeline execution from YAML |
| MessageRole | Enum: system, user, assistant |
| LoadEvent | Download progress events: LoadProgress, LoadComplete, LoadError |

Full API documentation: pub.dev/documentation/xybrid_flutter

License

Apache 2.0 — see LICENSE
