xybrid_flutter 0.1.0-beta9

Xybrid Flutter SDK — run ML models on-device or in the cloud with intelligent hybrid routing and streaming inference.

Xybrid Flutter SDK #

Run LLMs, ASR, and TTS natively in Flutter apps — private, offline, no cloud required.

pub package License: Apache 2.0

Installation #

flutter pub add xybrid_flutter

Or add to your pubspec.yaml:

dependencies:
  xybrid_flutter: ^0.1.0-beta9

Alternative installation (git / local path)

From git (unreleased changes):

dependencies:
  xybrid_flutter:
    git:
      url: https://github.com/xybrid-ai/xybrid.git
      ref: main
      path: bindings/flutter
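Or from a local checkout via a path dependency (the path below is illustrative; point it at wherever you cloned the repo):

```yaml
dependencies:
  xybrid_flutter:
    # Hypothetical local path -- adjust to your own clone location
    path: ../xybrid/bindings/flutter
```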

Quick Start #

import 'package:flutter/widgets.dart'; // for WidgetsFlutterBinding
import 'package:xybrid_flutter/xybrid_flutter.dart';

Future<void> main() async {
  WidgetsFlutterBinding.ensureInitialized();
  await Xybrid.init();

  // Load a TTS model from the registry
  final model = await XybridModelLoader.fromRegistry('kokoro-82m').load();

  // Run text-to-speech
  final result = await model.run(XybridEnvelope.text('Hello from Xybrid!'));
  print('Audio: ${result.audioBytes?.length} bytes');
}

Features #

Model Loading #

Load models from the Xybrid registry or local bundles:

// From registry (downloads + caches automatically)
final model = await XybridModelLoader.fromRegistry('kokoro-82m').load();

// From local bundle
final model = await XybridModelLoader.fromBundle('path/to/model.xyb').load();

// Check if already cached
if (Xybrid.isModelCached('kokoro-82m')) {
  print('Model ready, no download needed');
}

Download Progress #

Track model downloads with progress events:

final loader = XybridModelLoader.fromRegistry('kokoro-82m');

await for (final event in loader.loadWithProgress()) {
  switch (event) {
    case LoadProgress(:final progress):
      print('Downloading: ${(progress * 100).toInt()}%');
    case LoadComplete():
      print('Model ready!');
    case LoadError(:final message):
      print('Error: $message');
  }
}

Input Envelopes #

Type-safe inputs for different model types:

// Text (for TTS or LLM)
final textInput = XybridEnvelope.text('Hello world');

// Text with TTS voice selection
final ttsInput = XybridEnvelope.text('Hello', voiceId: 'af_heart', speed: 1.0);

// Audio (for ASR / speech-to-text)
final audioInput = XybridEnvelope.audio(
  bytes: wavBytes,
  sampleRate: 16000,
  channels: 1,
);

// Embedding vector
final embeddingInput = XybridEnvelope.embedding([0.1, 0.2, 0.3]);

Inference Results #

final result = await model.run(envelope);

if (result.success) {
  // Text output (ASR transcription or LLM response)
  print(result.text);

  // Audio output (TTS) — get as WAV for playback
  final wav = result.audioAsWav(sampleRate: 24000, channels: 1);

  // Embedding output
  print(result.embedding);

  // Inference timing
  print('Latency: ${result.latencyMs}ms');
}
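The bytes returned by `audioAsWav` are a standard RIFF/PCM16 WAV container, so you can sanity-check them before handing them to an audio player. A minimal pure-Dart sketch (`buildWavHeader` and `wavSampleRate` are hypothetical helpers for illustration, not SDK APIs):

```dart
import 'dart:typed_data';

/// Build a minimal 44-byte PCM16 WAV header (illustrative helper,
/// not part of the SDK). Useful for tests that mimic TTS output.
Uint8List buildWavHeader(
    {required int sampleRate, required int channels, required int dataLength}) {
  final blockAlign = channels * 2; // 16-bit samples
  final header = ByteData(44);
  void ascii(int offset, String s) {
    for (var i = 0; i < s.length; i++) {
      header.setUint8(offset + i, s.codeUnitAt(i));
    }
  }

  ascii(0, 'RIFF');
  header.setUint32(4, 36 + dataLength, Endian.little); // RIFF chunk size
  ascii(8, 'WAVE');
  ascii(12, 'fmt ');
  header.setUint32(16, 16, Endian.little); // fmt chunk size
  header.setUint16(20, 1, Endian.little); // PCM format
  header.setUint16(22, channels, Endian.little);
  header.setUint32(24, sampleRate, Endian.little);
  header.setUint32(28, sampleRate * blockAlign, Endian.little); // byte rate
  header.setUint16(32, blockAlign, Endian.little);
  header.setUint16(34, 16, Endian.little); // bits per sample
  ascii(36, 'data');
  header.setUint32(40, dataLength, Endian.little);
  return header.buffer.asUint8List();
}

/// Read the sample rate field back out of a WAV byte buffer.
int wavSampleRate(Uint8List wav) =>
    ByteData.sublistView(wav).getUint32(24, Endian.little);
```

Checking the sample-rate field this way can catch a mismatch between the rate you pass to `audioAsWav` and what your playback code expects.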

LLM Streaming #

Stream tokens in real-time as the LLM generates:

final model = await XybridModelLoader.fromRegistry('qwen-2.5-0.5b').load();

await for (final token in model.runStreaming(XybridEnvelope.text('What is ML?'))) {
  stdout.write(token.token);

  if (token.isFinal) {
    print('\n--- Done (${token.finishReason}) ---');
  }
}

Conversation Memory #

Multi-turn LLM conversations with automatic history management:

final model = await XybridModelLoader.fromRegistry('qwen-2.5-0.5b').load();
final context = ConversationContext();
context.setSystem('You are a helpful assistant.');

// Turn 1
context.pushText('What is Rust?', MessageRole.user);
final result = await model.runWithContext(
  XybridEnvelope.text('What is Rust?'),
  context,
);
context.pushText(result.text!, MessageRole.assistant);

// Turn 2 — the model remembers Turn 1
context.pushText('How does it compare to Go?', MessageRole.user);
final result2 = await model.runWithContext(
  XybridEnvelope.text('How does it compare to Go?'),
  context,
);

Streaming with context is also supported:

await for (final token in model.runStreamingWithContext(envelope, context)) {
  stdout.write(token.token);
}
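ConversationContext prunes old turns FIFO once the history grows too long. The policy can be sketched in plain Dart (an illustration only, not the SDK's actual implementation; `maxTurns` is a made-up parameter):

```dart
/// FIFO pruning sketch: keep the system prompt (index 0) plus the most
/// recent [maxTurns] user/assistant message pairs, dropping oldest first.
List<String> pruneHistory(List<String> messages, {int maxTurns = 4}) {
  const systemCount = 1; // assume messages[0] is the system prompt
  final budget = systemCount + maxTurns * 2; // system + N pairs
  if (messages.length <= budget) return List.of(messages);
  final overflow = messages.length - budget;
  return [
    messages.first, // the system prompt survives pruning
    ...messages.sublist(systemCount + overflow), // drop the oldest turns
  ];
}
```

The point of pruning oldest-first is to keep the prompt within the model's context window while preserving the system instruction and the most recent exchange.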

Pipeline Execution #

Run multi-stage ML pipelines from YAML:

final pipeline = XybridPipeline.fromYaml('''
name: "Speech-to-Text"
stages:
  - "whisper-tiny@1.0"
''');

print('Pipeline: ${pipeline.name}, ${pipeline.stageCount} stages');

final result = await pipeline.run(XybridEnvelope.audio(
  bytes: audioBytes,
  sampleRate: 16000,
));
print('Transcription: ${result.text}');
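Stages run in sequence, each consuming the previous stage's output. A hypothetical two-stage voice-assistant pipeline (the stage names and versions below are illustrative; check the registry for real identifiers):

```yaml
name: "Voice Assistant"
stages:
  - "whisper-tiny@1.0"   # ASR: audio in, text out
  - "qwen-2.5-0.5b@1.0"  # LLM: text in, response out
```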

Platform Support #

| Platform | ONNX Runtime | Candle | LLM (llama.cpp) | Notes |
|----------|--------------|--------|-----------------|-------|
| macOS    | ✅ Metal     |        |                 | Apple Silicon only (M1+) |
| iOS      | ✅ CoreML    | ✅ Metal |               | arm64, downloads ORT from HuggingFace |
| Android  |              |        |                 | arm64-v8a, x86_64; ORT from Maven Central |
| Linux    | ✅ CPU       |        |                 | x86_64 |
| Windows  | ✅ CPU       |        |                 | x86_64 |

Model Support #

All models below run on all supported platforms:

| Model                 | Type |
|-----------------------|------|
| Kokoro 82M            | TTS  |
| KittenTTS Nano        | TTS  |
| Whisper Tiny (Candle) | ASR  |
| Wav2Vec2 (ONNX)       | ASR  |
| SmolLM2 360M          | LLM  |
| Qwen 2.5 0.5B         | LLM  |
| Qwen 3.5 0.8B         | LLM  |
| Qwen 3.5 2B           | LLM  |
| Gemma 3 1B            | LLM  |
| Llama 3.2 1B          | LLM  |

Platform Requirements #

  • macOS: 13.3+, Xcode 15+, Apple Silicon
  • iOS: 13.0+, Xcode 15+
  • Android: minSdk 21, NDK r25+, 64-bit only
  • Linux/Windows: x86_64
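On Android, make sure your app meets the minSdk and 64-bit requirements. In a standard Flutter project this lives in android/app/build.gradle (a sketch using the values listed above):

```groovy
android {
    defaultConfig {
        minSdk 21  // minimum required by xybrid_flutter
        ndk {
            // 64-bit only: restrict the packaged ABIs
            abiFilters 'arm64-v8a', 'x86_64'
        }
    }
}
```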

Native Libraries #

Native ML runtimes are resolved automatically at build time:

  • Android: ONNX Runtime pulled from Maven Central (com.microsoft.onnxruntime:onnxruntime-android)
  • iOS: ONNX Runtime xcframework downloaded from HuggingFace and cached at ~/.xybrid/cache/ort-ios/
  • macOS/Linux/Windows: ONNX Runtime downloaded by the ort Rust crate at compile time

Precompiled Rust binaries are available for all platforms via cargokit — no Rust toolchain required for most users.

Example App #

A full example app with 8 demo screens (TTS, ASR, LLM chat, pipelines, device info) is available:

https://github.com/xybrid-ai/xybrid/tree/main/examples/flutter

API Reference #

| Class | Purpose |
|-------|---------|
| Xybrid | SDK initialization, cache checking, factory methods |
| XybridModelLoader | Load models from registry or local bundle |
| XybridModel | Run inference (batch, streaming, with context) |
| XybridEnvelope | Type-safe inputs: audio, text, embedding |
| XybridResult | Inference output: text, audio, embedding, latency |
| StreamToken | Individual LLM token during streaming |
| ConversationContext | Multi-turn conversation history with FIFO pruning |
| XybridPipeline | Multi-stage pipeline execution from YAML |
| MessageRole | Enum: system, user, assistant |
| LoadEvent | Download progress events: LoadProgress, LoadComplete, LoadError |

Full API documentation: pub.dev/documentation/xybrid_flutter

License #

Apache 2.0 — see LICENSE
