onde_inference 1.1.2
On-device LLM inference for Flutter & Dart. Run Qwen 2.5 models locally with Metal on iOS and macOS, CPU on Android and desktop. No cloud, no API key.
1.1.2 #
- tvOS support: 0.5B model default, memory-optimized model builder, snapshot cache repair
- New configure_cache_dir FFI function for sandboxed platforms
- New qwen25_0_5b_config() free function
1.1.1 #
- Apple SDK compatibility: Pulls in the Onde Rust core 1.1.1 Apple deployment-target build fixes so downstream macOS consumers do not inherit newer-than-expected minimum OS metadata from the packaged native binaries.
- Release alignment: Aligns the Flutter/Dart package version with the Rust, Swift, Kotlin, and React Native SDKs.
1.1.0 #
- Swift / UniFFI stability: Pulls in the Onde Rust core 1.1.0 runtime fixes for Swift/Kotlin/Apple-hosted SDKs, including Tokio runtime annotations and panic-safe pulse telemetry initialization.
- Telemetry: Pulse telemetry now gracefully disables itself if a Tokio reactor is unavailable and can be explicitly disabled with ONDE_DISABLE_PULSE=1 during local validation.
- Release alignment: Aligns the Flutter/Dart package version with the Rust, Swift, Kotlin, and React Native SDKs.
1.0.2 #
- pub.dev: Added Apple plugin Swift Package Manager manifests at ios/onde_inference/Package.swift and macos/onde_inference/Package.swift so pub.dev recognizes modern iOS and macOS plugin toolchain support.
- pub.dev: Removed redundant analyzer ignore directives from generated Dart bridge files to recover the remaining static-analysis points.
- Tooling: Ignore Swift Package Manager build directories (.build/, .swiftpm/) in both Git and pub publishing.
1.0.1 #
- pub.dev: Fixed stale README and API examples so the published package matches the current API (OndeChatEngine(), named message: arguments, current sampling helpers, and OndeError handling).
- pub.dev: Added explicit package metadata for documentation, topics, and supported platforms to improve pub.dev platform detection and package discoverability.
- pub.dev: Verified lower-bound compatibility again after the FRB-generated toolCalls and loadAssignedModel changes that previously hurt the package score on 1.0.0.
1.0.0 #
This is the first stable release. Onde has already been running in real App Store apps for months, so keeping it on 0.x no longer felt right.
New: assigned model loading #
loadAssignedModel(appId:appSecret:) fetches the model you've assigned to your app in the ondeinference.com dashboard. If nothing has been assigned yet, it falls back to the platform default. This is the path we recommend for production apps. loadDefaultModel is still there if you just want to prototype quickly.
The example app reads credentials from --dart-define=ONDE_APP_ID=... and --dart-define=ONDE_APP_SECRET=.... If you do not pass them, it falls back to the default model like before.
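The credential fallback described above can be sketched in plain Dart. This is only an illustration under stated assumptions: String.fromEnvironment is the standard Dart way to read --dart-define values, while hasOndeCredentials is a hypothetical helper that is not part of onde_inference, and the actual example app may structure this differently.

```dart
// Compile-time values supplied via --dart-define.
// Both default to the empty string when the flag is omitted.
const ondeAppId = String.fromEnvironment('ONDE_APP_ID');
const ondeAppSecret = String.fromEnvironment('ONDE_APP_SECRET');

/// Hypothetical helper (not part of onde_inference): true only when both
/// credentials were provided at build time.
bool hasOndeCredentials(String appId, String appSecret) =>
    appId.isNotEmpty && appSecret.isNotEmpty;

void main() {
  if (hasOndeCredentials(ondeAppId, ondeAppSecret)) {
    // An app would call loadAssignedModel(appId: ..., appSecret: ...) here.
    print('Credentials found: load the assigned model.');
  } else {
    // An app would call loadDefaultModel() here.
    print('No credentials: fall back to the default model.');
  }
}
```

Because String.fromEnvironment is resolved at compile time, the values have to be passed to the build or run command itself; they are not read from the process environment at runtime.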
New models #
- Qwen 3 8B, 14B, and 1.7B (GGUF Q4_K_M)
- Qwen 2.5 Coder 7B (GGUF Q4_K_M)
- DeepSeek Coder 6.7B (GGUF Q4_K_M) with bundled chat template
Type changes #
- GgufModelConfig now has an optional chatTemplate field for models that need a custom chat template, like DeepSeek Coder.
- InferenceResult now carries a toolCalls list (List<ToolCallInfo>). Most responses will still return an empty list, but if the model asks for a tool call, you now get structured data instead of raw markup in text. The InferenceResultToolsX extension also adds a hasToolCalls convenience getter.
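To make the toolCalls change concrete, here is a minimal, self-contained sketch of how a convenience getter like hasToolCalls could be written as a Dart extension. The two classes below are simplified stand-ins, not the package's real definitions: only text and toolCalls come from the notes above, the name field on ToolCallInfo is hypothetical, and the getter body is an assumption about how such a helper would typically work.

```dart
/// Simplified stand-in for the package's ToolCallInfo.
/// The `name` field is hypothetical.
class ToolCallInfo {
  const ToolCallInfo(this.name);
  final String name;
}

/// Simplified stand-in for the package's InferenceResult, limited to the
/// two fields mentioned in the changelog.
class InferenceResult {
  const InferenceResult({required this.text, this.toolCalls = const []});
  final String text;
  final List<ToolCallInfo> toolCalls;
}

/// Assumed shape of the convenience getter: true when the list is non-empty.
extension InferenceResultToolsX on InferenceResult {
  bool get hasToolCalls => toolCalls.isNotEmpty;
}

void main() {
  const plain = InferenceResult(text: 'Hello!');
  const withTool = InferenceResult(
    text: '',
    toolCalls: [ToolCallInfo('get_weather')],
  );

  print(plain.hasToolCalls); // false
  print(withTool.hasToolCalls); // true
}
```

Callers can then branch on hasToolCalls instead of parsing tool-call markup out of text.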
Engine #
- Old model weights are now dropped outside the lock when loading a new model. Before this, the drop happened while the write lock was still held, which meant status queries could stall while memory was being released.
Dependencies #
- Switched from the git-based mistralrs dependency to the published onde-mistralrs 0.8.2 crates on crates.io. Builds are faster, and cargo publish no longer needs [patch.crates-io] gymnastics.
Cross-platform #
- Linux and Windows builds with CPU inference now work out of the box, thanks to a TokenSource fix for non-Darwin platforms.
0.1.7 #
- Fix: Removed example/android/app/src/main/java/io/flutter/plugins/GeneratedPluginRegistrant.java from git tracking and added it to .gitignore. Flutter regenerates this file on every CI run, which kept leaving the working tree dirty and blocked pub publish. Added a CI restore step as an extra safety net.
0.1.6 #
- Fix: Replaced the composite LICENSE file with the standard MIT license text so pub.dev's pana tool correctly recognises the OSI-approved license and gives the package full license credit.
0.1.5 #
- Engine: Added load_assigned_model(). It fetches the model config assigned to your app from the Onde SDK backend using app credentials, with no user JWT required. If no model has been assigned yet, it falls back to the platform default.
- Telemetry: Added the GresIQ pulse telemetry client. The engine now reports usage events to the GresIQ dashboard. Configure it with the GRESIQ_ENVIRONMENT and ONDE_EDGE_ID environment variables before the engine starts.
- Build: GresIQ API credentials (GRESIQ_API_KEY, GRESIQ_API_SECRET, GRESIQ_APP_ID) are now embedded at build time through dotenvy, so CI can inject secrets through environment variables without changing source files.
0.1.4 #
- Added the Qwen 3 4B GGUF model (bartowski/Qwen_Qwen3-4B-GGUF) with full OpenAI-compatible tool calling support.
- Added the GgufModelConfig.qwen3_4b() constructor and registered it in the supported model list.
0.1.3 #
- Platform: Added support for watchOS and visionOS.
0.1.2 #
- Engine: Switched all platform-specific mistralrs and mistralrs-core dependencies to the setoelkahfi/mistral.rs fork (branch fix/all-platform-fixes) to pick up cross-platform stability fixes before they landed upstream.
- License: Moved to dual licensing under MIT OR Apache-2.0, and added LICENSE-APACHE alongside the existing LICENSE-MIT for pub.dev compliance.
- Dependencies: Upgraded freezed_annotation to ^3.1.0 and freezed to ^3.2.5.
- Removed a stale ignore_for_file directive from the generated Flutter Rust Bridge glue code.
0.1.1 #
- CI/CD: release-sdk-dart.yml now publishes onde_inference to pub.dev on tag push.
- Added copyright headers to all hand-written source files (engine.dart, types.dart, dart_test.dart, and the iOS and macOS Swift plugin classes).
- Rewrote the example app README with updated branding, platform notes, and an SDK quick reference.
- Added android/local.properties to .gitignore, so local SDK paths no longer show up in diffs.
0.1.0 #
- Initial MVP release.
- Multi-turn chat inference with Qwen 2.5 1.5B and 3B GGUF Q4_K_M models.
- Streaming token delivery via Dart Stream<StreamChunk>, so you can display tokens as they are generated.
- Metal acceleration on iOS and macOS (Apple silicon and Intel).
- CPU inference on Android, Linux, and Windows.
- Platform-aware default model selection (1.5B on iOS / Android, 3B on macOS / Linux / Windows).
- Conversation history management: history(), clearHistory(), pushHistory().
- One-shot generate() API that does not affect the conversation history.
- Configurable sampling: temperature, top-p, top-k, min-p, max tokens, frequency and presence penalties.
- Built-in sampling presets: SamplingConfig.defaultConfig(), SamplingConfig.deterministic(), SamplingConfig.mobile().
- EngineInfo snapshot: status, loaded model name, approximate memory, and history length.
- OndeInference static helper namespace for library initialisation and model / sampling config factories.
- Compilation stub (frb_generated_stub.dart) so the package compiles before the native Rust bridge is built.
- Powered by flutter_rust_bridge v2 and the Onde Rust engine.
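The platform-aware default listed for 0.1.0 (1.5B on iOS / Android, 3B on macOS / Linux / Windows) can be pictured with a small sketch. It only illustrates the rule as stated in these notes: defaultModelSizeFor is a hypothetical helper rather than a package function, the real selection happens inside the engine, and later releases add other defaults (for example the 0.5B model on tvOS).

```dart
import 'dart:io' show Platform;

/// Hypothetical helper (not part of onde_inference): maps a value of
/// Platform.operatingSystem to the default model size listed for 0.1.0.
String defaultModelSizeFor(String operatingSystem) {
  switch (operatingSystem) {
    case 'ios':
    case 'android':
      return '1.5B'; // Mobile targets get the smaller model.
    case 'macos':
    case 'linux':
    case 'windows':
      return '3B'; // Desktop targets get the larger model.
    default:
      throw UnsupportedError('No 0.1.0 default for $operatingSystem');
  }
}

void main() {
  // Prints the default size for whatever platform runs this script.
  print(defaultModelSizeFor(Platform.operatingSystem));
}
```

Taking the operating system name as a parameter, instead of reading Platform directly inside the helper, keeps the rule easy to check for every platform from a single machine.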