vad 0.0.7+1
VAD is a cross-platform Voice Activity Detection system, allowing Flutter applications to seamlessly handle various VAD events using Silero VAD v4/v5 models.
0.0.7+1
- Apply dart format to all files to meet pub.dev static analysis requirements
0.0.7
- Add Android 16KB page size support
- Android: Bump ONNX Runtime to 1.22.0 which includes native 16KB page size support
- Android: All native libraries (`libonnxruntime.so`, `libonnxruntime4j_jni.so`) are now properly aligned for 16KB page sizes
- Android: Plugin is now fully compatible with Android 15+ devices and Google Play's 16KB page size requirement
- BREAKING CHANGE: Asset management changes - models now loaded from CDN by default
- Core: Remove bundled assets from package to reduce package size
- Core: Introduce companion NPM package `@keyurmaru/vad` to host ONNX model files via jsDelivr CDN
- Core: Update `baseAssetPath` and `onnxWASMBasePath` parameters to default to CDN URLs (can be overridden for offline/self-hosted use)
- Core: Delete `lib/assets` directory and its contents
- Migration: For offline support, download model files and set `baseAssetPath` parameter in `startListening()`
- BREAKING CHANGE: Model identifier renamed from 'legacy' to 'v4'
- API: Rename model parameter value from 'legacy' to 'v4' for clarity
- Migration: Update `model: 'legacy'` to `model: 'v4'` in `startListening()` calls
- Internal architectural refactor to unify cross-platform implementation
- Internal: Remove platform-specific internal classes (`VadHandlerWeb`, `VadHandlerNonWeb`, `VadIteratorWeb`, `VadIteratorNonWeb`)
- Internal: Remove internal abstract base classes (`VadHandlerBase`, `VadIteratorBase`)
- Internal: Introduce `VadInference` abstraction layer as platform split point
- Core: Use `record` package for cross-platform audio capture, replacing custom web audio implementation
- Example: Rename `RecordingModel` enum to `VadModel` in example app
- Internal native implementation overhauled with FFI for better performance
- Internal: Replace `onnxruntime` package dependency with direct FFI-based implementation
- Internal: Add ONNX Runtime C API headers and generate Dart bindings using `ffigen`
- Internal: Create Dart wrappers for ONNX structs (`OrtEnv`, `OrtSession`, `OrtValue`)
- Internal: Implement `OrtIsolateSession` to run native inference in separate isolate
- Internal: Restructure `lib/` into `core` and `platform` directories
- Internal: Introduce abstract `VadModel` class to unify inference logic
- Add desktop platform support (Windows, macOS, Linux)
- Platform: Add build configurations for Windows, macOS, and Linux desktop applications
- Platform: Bundle pre-compiled x64 and arm64 ONNX Runtime binaries for Windows and Linux
- Platform: Add CMakeLists.txt for each desktop platform to handle binary packaging
- Platform: Add macOS podspec with dependency on `onnxruntime-objc` pod
- Platform: Update Dart FFI bindings to dynamically detect OS and CPU architecture at runtime
- Example: Add macOS example app
- Example: Add Linux example app
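The runtime OS and CPU architecture detection mentioned above could look roughly like the following sketch. It relies only on `Platform.operatingSystem` from `dart:io` and `Abi.current()` from `dart:ffi`. The directory layout, the file names, and the helper `libraryPathFor` are hypothetical and are not taken from the package.

```dart
import 'dart:ffi';
import 'dart:io';

/// Picks a relative path to an ONNX Runtime shared library for the given
/// OS name and ABI. The paths are hypothetical; they only illustrate how a
/// bundled x64 or arm64 binary might be selected.
String libraryPathFor({required String os, required Abi abi}) {
  switch (os) {
    case 'windows':
      final arch = abi == Abi.windowsArm64 ? 'arm64' : 'x64';
      return 'windows/$arch/onnxruntime.dll';
    case 'linux':
      final arch = abi == Abi.linuxArm64 ? 'arm64' : 'x64';
      return 'linux/$arch/libonnxruntime.so';
    default:
      // macOS is assumed to be linked through the pod, so no path is needed.
      throw UnsupportedError('No bundled binary path for $os ($abi)');
  }
}

void main() {
  try {
    // On Windows or Linux the result could be passed to DynamicLibrary.open().
    final path = libraryPathFor(
      os: Platform.operatingSystem,
      abi: Abi.current(),
    );
    print('Would load: $path');
  } on UnsupportedError catch (e) {
    print(e.message);
  }
}
```

Keeping the selection in a pure function that takes the OS name and ABI as arguments makes it possible to check every branch on a single development machine.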
- Rewrite web VAD implementation in pure Dart
- Web: Remove pre-compiled JavaScript bridge (vad_web.js, bundle.min.js)
- Web: Implement pure Dart web support using `dart:js_interop` to directly communicate with onnxruntime-web
- Web: Add `MicVAD` class in `lib/src/web/audio_node_vad.dart` to manage audio pipeline using `AudioContext` and dynamically generated `AudioWorkletProcessor`
- Web: Add typed wrappers for `onnxruntime-web` in `lib/src/web/onnx_runtime_web.dart`
- Web: Add ScriptProcessorNode fallback from AudioWorklet for better browser compatibility (older browsers and non-secure contexts)
- BREAKING CHANGE: `onEmitChunk` stream type changed to include `isFinal` flag
- API: Change `onEmitChunk` from `Stream<List<double>>` to `Stream<({List<double> samples, bool isFinal})>`
- API: Add `onEmitChunk` stream for real-time audio chunk emission during active speech
- API: Add `isFinal` flag to mark the last chunk of a speech utterance
- API: Add `numFramesToEmit` parameter to `startListening()` to enable chunking (default: 0, disabled)
- API: Add `endSpeechPadFrames` parameter to `startListening()` to control audio padding at speech end (default: 1 for v4, 3 for v5)
- Core: Update `VadIterator` implementations to manage frame buffers and emit chunk events periodically and at speech end
- Example: Update example app to demonstrate chunk emission feature with playback UI for individual chunks
- Migration: Update listeners from `vadHandler.onEmitChunk.listen((samples) { ... })` to `vadHandler.onEmitChunk.listen((chunk) { final samples = chunk.samples; final isFinal = chunk.isFinal; ... })`
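As an illustration of the new chunk type, the sketch below groups chunks into whole utterances using the `isFinal` flag. A hand-built stream stands in for `vadHandler.onEmitChunk`. The `SpeechChunk` typedef and the `utterances` helper are hypothetical names and are not part of the package.

```dart
/// Record type with the same shape as the new `onEmitChunk` elements.
typedef SpeechChunk = ({List<double> samples, bool isFinal});

/// Buffers samples until a chunk with `isFinal == true` arrives, then
/// yields the buffered utterance and starts a fresh buffer.
Stream<List<double>> utterances(Stream<SpeechChunk> chunks) async* {
  var buffer = <double>[];
  await for (final chunk in chunks) {
    buffer.addAll(chunk.samples);
    if (chunk.isFinal) {
      yield buffer;
      buffer = <double>[];
    }
  }
}

Future<void> main() async {
  // Stand-in for `vadHandler.onEmitChunk`: two utterances, the first one
  // split across two chunks.
  final chunks = Stream<SpeechChunk>.fromIterable([
    (samples: [0.1, 0.2], isFinal: false),
    (samples: [0.3], isFinal: true),
    (samples: [0.4], isFinal: true),
  ]);

  await for (final utterance in utterances(chunks)) {
    print('utterance with ${utterance.length} samples');
  }
}
```

Record fields are accessed by name (`chunk.samples`, `chunk.isFinal`), which is why existing listeners that treated the event as a plain `List<double>` need the migration described above.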
- Network model loading support
- Core: Update native implementation to support loading models from network URLs using `HttpClient`
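A minimal sketch of fetching model bytes with `HttpClient` from `dart:io` follows. The helper name `fetchModelBytes` and the `/model.onnx` path are hypothetical, and the package's actual loading code may differ (for example in caching or error handling). To stay self-contained, the example serves a few dummy bytes from a local loopback server instead of contacting a real CDN.

```dart
import 'dart:io';
import 'dart:typed_data';

/// Downloads the body at [url] into memory and returns it as bytes.
/// Throws an [HttpException] for any status other than 200.
Future<Uint8List> fetchModelBytes(Uri url) async {
  final client = HttpClient();
  try {
    final request = await client.getUrl(url);
    final response = await request.close();
    if (response.statusCode != HttpStatus.ok) {
      throw HttpException('Unexpected status ${response.statusCode}', uri: url);
    }
    final builder = BytesBuilder(copy: false);
    await for (final chunk in response) {
      builder.add(chunk);
    }
    return builder.takeBytes();
  } finally {
    client.close(force: true);
  }
}

Future<void> main() async {
  // Local stand-in for a model host: answers every request with 3 bytes.
  final server = await HttpServer.bind(InternetAddress.loopbackIPv4, 0);
  server.listen((request) async {
    request.response.add([1, 2, 3]);
    await request.response.close();
  });

  final url = Uri.parse('http://127.0.0.1:${server.port}/model.onnx');
  final bytes = await fetchModelBytes(url);
  print('Downloaded ${bytes.length} bytes');

  await server.close(force: true);
}
```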
- Fix Android audio playback interference with VAD detection
- Example: Change audio player configuration to use `AndroidUsageType.media` instead of `voiceCommunication`
- Example: Add `AndroidAudioFocus.gainTransientMayDuck` to prevent recorder from receiving `AUDIOFOCUS_LOSS` when playing back recordings
- Example: Allow continuous speech detection during playback without requiring manual stop/start
- Fix iOS minimum version requirement
- Platform: Correct minimum iOS version from 16.0 to 15.1 in `vad.podspec` to align with `onnxruntime-objc` dependency support
- Example: Update iOS example app configuration to match 15.1 minimum deployment target
- Use forked `record` dependency for macOS echo cancellation fix
- Dependencies: Temporarily override `record` and `record_macos` packages to point to git repository with echo cancellation fix
- Dependencies: Will be removed once fix is merged into official release
- Modernize example app build configurations
- Example: Migrate Android Gradle scripts from Groovy to Kotlin DSL (.kts)
- Example: Upgrade Gradle wrapper from 8.3 to 8.12
- Example: Bump Java compatibility to version 11
- Example: Update Android package name to `com.example.vad_example`
- Example: Clean up iOS Podfile, removing obsolete settings
- Example: Update Xcode project files to match new dependencies
- Update package metadata
- Pubspec: Add `homepage` and `issue_tracker` fields
- Example: Expose `RecordConfig` from `record` package for detailed audio input configuration
- Example: Update dependencies
- Bump `permission_handler` to latest version
- Bump `audioplayers` to latest version
- Add support for custom audio streams
- API: Add optional `Stream<Uint8List>? audioStream` parameter to `startListening()` method
- Core: Allow users to provide their own audio stream instead of using the built-in recorder
- Core: When custom stream is provided, VadHandler bypasses internal AudioRecorder setup
- Core: Custom stream should provide PCM16 audio data at 16kHz sample rate, mono channel
- Example: Add `CustomAudioStreamProvider` demonstration class using the `record` library
- Example: Add "Use Custom Audio Stream" toggle in settings dialog
- Example: Automatically configure `manageAudioSession: false` when custom stream is used
- Use case: Enables advanced scenarios like custom recording configurations, audio from non-microphone sources, or integration with existing audio pipelines
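To make the expected stream format concrete, here is a small sketch of what PCM16 mono data looks like at the byte level: each sample is a signed 16-bit integer stored in two bytes. The helper `pcm16ToFloat`, the little-endian byte order, and the division by 32768 are assumptions for illustration. The changelog does not state how the package decodes the stream internally.

```dart
import 'dart:typed_data';

/// Converts PCM16 mono bytes (assumed little-endian) into floating-point
/// samples in the range [-1.0, 1.0). A trailing odd byte is ignored.
List<double> pcm16ToFloat(Uint8List bytes) {
  final data = ByteData.sublistView(bytes);
  final sampleCount = bytes.length ~/ 2;
  return List<double>.generate(
    sampleCount,
    (i) => data.getInt16(i * 2, Endian.little) / 32768.0,
  );
}

void main() {
  // Three samples: 0x0000, 0x4000 and 0x8000, low byte first.
  final bytes = Uint8List.fromList([0x00, 0x00, 0x00, 0x40, 0x00, 0x80]);
  print(pcm16ToFloat(bytes)); // [0.0, 0.5, -1.0]
}
```

At a 16kHz sample rate, one second of mono PCM16 audio is 16000 samples, or 32000 bytes, which is a quick sanity check for a custom stream.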
- Fix AudioRecorder disposal and recreation issue
- Core: Change `_audioRecorder` from final to nullable field to allow recreation after disposal
- Core: Add logic to recreate AudioRecorder instance in `startListening()` if it was previously disposed
- Core: Prevent "Recorder has already been disposed" error when restarting after stop
- Core: Properly set `_audioRecorder` to null after disposal in both `stopListening()` and `dispose()` methods
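The nullable-field pattern behind this fix can be sketched as follows. `FakeRecorder` and `RecorderOwner` are hypothetical stand-ins written for this example. They are not the package's classes and only mimic a recorder that cannot be reused once disposed.

```dart
/// Stand-in for a recorder that throws if used after `dispose()`.
class FakeRecorder {
  bool _disposed = false;

  void start() {
    if (_disposed) throw StateError('Recorder has already been disposed');
  }

  void dispose() => _disposed = true;
}

class RecorderOwner {
  // Nullable instead of final, so a fresh instance can replace a disposed one.
  FakeRecorder? _audioRecorder;

  /// Counts how many recorder instances were created (for the demo only).
  int createdCount = 0;

  void startListening() {
    // Create a recorder lazily if none exists or the last one was disposed.
    final recorder = _audioRecorder ??= _createRecorder();
    recorder.start();
  }

  void stopListening() {
    _audioRecorder?.dispose();
    _audioRecorder = null; // forget the disposed instance
  }

  FakeRecorder _createRecorder() {
    createdCount++;
    return FakeRecorder();
  }
}

void main() {
  final owner = RecorderOwner();
  owner.startListening();
  owner.stopListening();
  owner.startListening(); // works: a second recorder is created
  print(owner.createdCount); // 2
}
```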
0.0.6
- BREAKING CHANGE: Convert all VAD APIs to async Future-based methods for better async/await support
- API: Convert `startListening()`, `stopListening()`, `pauseListening()`, and `dispose()` methods in `VadHandlerBase` to return `Future<void>`
- Web: Update `VadHandlerWeb` implementation to use async method signatures
- Non-Web: Update `VadHandlerNonWeb` implementation to use async method signatures and properly await internal async operations
- Example: Update example app to use async/await pattern when calling VAD methods
- Introduce `pauseListening` feature
- API: Add `pauseListening()` to `VadHandlerBase`.
- Web: Implement `pauseListeningImpl()` in `vad_web.js` and expose via JS bindings.
- Non-Web: Add `_isPaused` flag in `VadHandlerNonWeb`; ignore incoming frames when paused; if `submitUserSpeechOnPause` is true, call `forceEndSpeech()`.
- Start/Stop: Reset `_isPaused` in `startListening()`; guard `vadInstance` in `stopListeningImpl()` with null-check and log.
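The pause behavior described for non-web platforms can be sketched with a small gate in front of the frame processor. `PausableFrameGate` is a hypothetical class written for this example. The counter `forcedEndCount` merely stands in for a call to `forceEndSpeech()`, whose real behavior is not shown here.

```dart
class PausableFrameGate {
  PausableFrameGate({this.submitUserSpeechOnPause = false});

  final bool submitUserSpeechOnPause;
  bool _isPaused = false;

  /// Frames that made it past the gate (stand-in for VAD processing).
  final List<List<double>> processedFrames = [];

  /// How often speech was force-ended (stand-in for forceEndSpeech()).
  int forcedEndCount = 0;

  /// Starting (or resuming) always clears the paused flag.
  void startListening() => _isPaused = false;

  void pauseListening() {
    _isPaused = true;
    if (submitUserSpeechOnPause) forcedEndCount++;
  }

  void onFrame(List<double> frame) {
    if (_isPaused) return; // frames arriving while paused are dropped
    processedFrames.add(frame);
  }
}

void main() {
  final gate = PausableFrameGate(submitUserSpeechOnPause: true);
  gate.startListening();
  gate.onFrame([0.1]);
  gate.pauseListening();
  gate.onFrame([0.2]); // ignored
  gate.startListening(); // resume
  gate.onFrame([0.3]);
  print(gate.processedFrames.length); // 2
  print(gate.forcedEndCount); // 1
}
```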
- Add pause/resume UI functionality to example app
- Example: Add dynamic pause button that appears only while actively listening
- Example: Transform start button to "Resume" when paused, calling `startListening()` to resume
- Example: Hide pause button when paused state is active
- Example: Add separate stop button (red) available in both listening and paused states
- Example: Implement proper state management for `isListening` and `isPaused` tracking
- Add support for custom `RecordConfig` parameter in `startListening()` for non-web platforms
- API: Add optional `RecordConfig? recordConfig` parameter to `startListening()` in `VadHandlerBase`.
- Non-Web: Use custom `RecordConfig` if provided, otherwise fall back to default configuration with 16kHz sample rate, PCM16 encoding, echo cancellation, auto gain, and noise suppression.
- Web: Accept the parameter for compatibility but ignore it (not applicable for web platform).
- Bump `record` package to version 6.0.0
- Example: Bump `permission_handler` package to version 12.0.0+1
- Example: Bump `audioplayers` package to version 6.5.0
0.0.5
- Add support for Silero VAD v5 model. (Default model is set to v4)
- Automatically upsample audio to 16kHz if the input audio is not 16kHz (fixes model load failures due to lower sample rates).
- Expose `onRealSpeechStart` event to notify when the number of speech positive frames exceeds the minimum speech frames (i.e. not a misfire event).
- Expose `onFrameProcessed` event to track VAD decisions by exposing speech probabilities and frame data for real-time processing.
- Update example app to show the `onRealSpeechStart` callback in action and introduce VAD Settings dialog to change the VAD model and other settings at runtime.
- For web platform, bundle the required files within the package to avoid download failures when fetching from CDNs and to ensure offline support.
- Update example app to log `onFrameProcessed` details for debugging.
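The automatic upsampling to 16kHz mentioned in this release can be illustrated with a simple resampler. The changelog does not say which method the package uses. Linear interpolation is an assumption chosen here for brevity, and `resampleLinear` is a hypothetical helper.

```dart
/// Resamples [input] from [fromRate] to [toRate] by linear interpolation
/// between neighboring samples. The last sample is held at the end.
List<double> resampleLinear(List<double> input, int fromRate, int toRate) {
  if (input.isEmpty || fromRate == toRate) return List<double>.of(input);

  final outLength = (input.length * toRate / fromRate).floor();
  final step = fromRate / toRate; // input samples advanced per output sample

  return List<double>.generate(outLength, (i) {
    final position = i * step;
    final left = position.floor();
    final right = left + 1 < input.length ? left + 1 : left;
    final fraction = position - left;
    return input[left] * (1 - fraction) + input[right] * fraction;
  });
}

void main() {
  // Two samples at 8kHz become four samples at 16kHz.
  print(resampleLinear([0.0, 1.0], 8000, 16000)); // [0.0, 0.5, 1.0, 1.0]
}
```

Linear interpolation is cheap but does no filtering. A production resampler would typically apply a proper interpolation filter, so this sketch should not be read as a description of the package's audio quality.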
0.0.4
- Fixed a bug where default `modelPath` was not picked up, resulting in silent failure if `modelPath` was not provided.
- Export `VadIterator` class for manual control over the VAD process for non-streaming use cases. Only available on iOS/Android.
- Added comments for all public methods and classes.
0.0.3
- Switch to `onnxruntime` package for inference on a separate isolate on iOS and Android to avoid using a full browser in the background, overall reducing the app size and improving performance.
- Example app now shows an audio track slider with controls while a speech segment is being played, and reflects a misfire event in the UI if one occurs.
0.0.2
- Fix broken LICENSE hyperlink in README.md and add topics to pubspec.yaml
0.0.1
- Initial release