sherpa_onnx changelog | Flutter package

1.13.0

Fix Flutter CI (#3560)
Fix building Flutter Android APPs (#3559)
Export nvidia/parakeet-unified-en-0.6b to sherpa-onnx (#3556)
Update nemotron-speech-streaming-en-0.6b (#3555)
Expose log probabilities in OfflineRecognizerResult for Go binding (#3553)

1.12.40

Add more Piper TTS models (#3547)
Add two Piper Chinese TTS models (#3546)
Upload the piper sq_AL model (#3541)
Add Albanian (sq_AL) Piper TTS voice by LanguageWeaver (#3539)
Add Tauri APP demo for VAD+ASR from a microphone (#3540)
Improve Tauri VAD+ASR example: settings UI, bug fixes, and RTF display (#3538)
Fix releasing go packages (#3537)
Fix publishing dart packages for Android (#3522)
Avoid passing invalid utf8 str to JNI (#3527)
Build Tauri desktop APPs for VAD+ASR (#3530)

1.12.39

Add tauri app example for non-streaming ASR + VAD (#3520)
Check for nullptr in c/cxx API (#3515)
Update go.mod and go.sum for the sherpa-onnx-go package. (#3516)
Fix an off-by-one error in pyannote speaker diarization. (#3514)
Set context for thread in Ascend NPU (#3512)
Enable proxy support for downloader via ureq's proxy-from-env feature in rust-binding build.rs (#3507)
Add script to download all dependencies for offline use (#3506)
Update Eigen to v5.0.1 (#3505)
Fix building Python wheels for Windows (#3504)
Expose memory-related session options via config (#3503)

1.12.38

Update onnxruntime to use v1.24.4 (#3501)

1.12.37

Update openfst to v1.8.5 and fix compiler warnings. (#3495)
Use onnxruntime 1.24.3 for Android (#3494)
Update FunASR Nano int8 models (#3493)
Fix incorrect URL in dotnet README (#3491)
Fix package name format for ort url (#3489)

1.12.36

Dotnet Android arm64 targeting (#3485)
Use file path to initialize onnxruntime session. (#3482)
update onnxruntime package name and hash for riscv64-spacemit (#3481)
Fix initializing fp32 Qwen3-ASR models. (#3480)
Fix Qwen3ASR hotwords handling (#3477)
Update Qwen3 ASR models (#3476)
Fix building for vad+asr (#3475)
Add per-stream language hint for Qwen3-ASR (and tighten scaffold cleanup) (#3472)
Add hotwords argument to OfflineQwen3ASRModelConfig (#3468)

1.12.35

Add Go API for Cohere Transcribe (#3466)
Add Rust API for Cohere Transcribe (#3465)
Add Dart API for Cohere Transcribe (#3464)
Add C# API for Cohere Transcribe (#3462)
Add Pascal API for Cohere Transcribe (#3463)
Add Java and Kotlin API for Cohere Transcribe (#3461)
Add JavaScript API for Cohere Transcribe. (#3458)
Add Swift API for Cohere Transcribe (#3460)
Add C API and CXX API for Cohere Transcribe models (#3457)
Add C++ runtime and Python API for Cohere Transcribe (#3456)
Upload models for https://huggingface.co/CohereLabs/cohere-transcribe-03-2026 (#3453)
Add issue template (#3452)
Add hotwords support for Qwen3-ASR (#3434)
Fix links in README.md to use correct casing (#3451)
Add Wake Word and decibri to projects using sherpa-onnx (#3447)
Remove num_threads assertion from OnlineRecognizer model configs (#3448)
Add Speed of Sound to projects using sherpa-onnx (#3438)
Fix Swift tests (#3433)
spacemit provider: add provider with config file, then update to 2.0.2 (#3421)
Add Go API for source separation (#3432)
Fix C API for reading multi-channel wave files (#3430)
Add Swift API for source separation (#3426)
Fix CI tests (#3429)

1.12.34

Add Pascal API for Qwen3 ASR (#3424)
Add Rust API for Qwen3 ASR (#3423)
Add Dart API for Qwen3 ASR (#3422)
Add Java API and Kotlin API for Qwen3 ASR (#3420)
Add JavaScript API (node-addon) for Qwen3 ASR (#3419)
Add JavaScript API (WebAssembly) for Qwen3 ASR (#3416)
Refactor Go API examples to use sherpa_onnx.ReadWave() (#3415)
Fix warnings in Go API (#3414)
Add C# API for Qwen3 ASR (#3413)
Add Go API for Qwen3 ASR (#3412)
Update Swift API example for Qwen3 ASR (#3411)
Update tests for Qwen3 ASR (#3410)
Upload Qwen3 ASR 0.6B int8 models (#3409)
Add Qwen3-ASR support (#3399)
Fix TTS deprecated warnings (#3407)
Add C# API for source separation (#3406)
Add CXX API for source separation (#3405)
Add C API for source separation (#3404)
Fix building Rust doc (#3403)
Update Python subtitle script to support FireRedASR CTC and FunASR Nano (#3400)

1.12.33

Add Rust examples for simulated streaming ASR with VAD (#3398)
Add Rust example for real-time ASR + VAD (#3397)
Fix printing microphone sample rate. (#3396)
Add missing fields of OnlineRecognizerResult to Go API. (#3395)
Add examples for punctuation (#3394)
Fix building MFC examples for TTS (#3388)

1.12.32

Support static link for Rust package. (#3386)
Test Rust API on Windows (#3385)
Add VoxSherpa TTS to Projects using sherpa-onnx (#3384)

1.12.31

Fix building har for OHOS (#3361)
Refactor MatchaTTS to use the new Generate API (#3362)
Refactor Kokoro TTS to use the new Generate API (#3363)
Refactor KittenTTS to use the new Generate API (#3364)
Refactor VITS to use the new Generate API (#3365)
Add Rust API examples for TTS (#3366)
Fix Swift tests (#3367)
Add Rust API for audio tagging (#3368)
Add Rust API for speaker embedding extractor and manager (#3369)
Add Rust API for speaker diarization (#3370)
Refactor Rust API for speech denoiser (#3371)
Add Rust API for KWS, offline punctuation and spoken language identification (#3372)
Add doc for c api and cxx api (#3374)
Add link to C API doc (#3375)
Add doc for Rust API (#3376)
Add doc for Dart API (#3377)
Add more doc for Rust API (#3378)

1.12.30

Fix typos in the project (#3293)
Fix WebAssembly JavaScript API (#3294)
Remove unnecessary SHERPA_ONNX_API from C/C++ APIs (#3295)
Fix bugs in CXX APIs (#3296)
Result goes to stdout (#3274)
Small fix to online recognizer C++ code (#3297)
Small fixes to JNI wrappers (#3298)
Add SetOption/GetOption to OnlineStream and OfflineStream (#3307)
Add SetOption/GetOption C API and export symbols (#3308)
Add SetOption/GetOption CXX wrapper for OnlineStream and OfflineStream (#3309)
Migrate Paraformer is_final to use SetOption mechanism (#3310)
Add SetOption/GetOption Python bindings for OnlineStream and OfflineStream (#3311)
Add SetOption/GetOption Java, Kotlin, and JNI bindings (#3312)
Add SetOption/GetOption Go bindings for OnlineStream and OfflineStream (#3313)
Add SetOption/GetOption C# bindings for OnlineStream and OfflineStream (#3314)
Add SetOption/GetOption WASM/JavaScript bindings (#3315)
Fix padding bug in test-onnx-streaming.py (#3318)
Fix style issues (#3321)
Add DPDFNet speech denoiser support for offline and streaming (#3276)
Upload DPDFNet models (#3322)
Add C API example for online punctuation (#3323)
Add online speech denoiser for GTCRN and examples (#3324)
Add Go API example for online punctuation (#3325)
Release Rust package for offline/online speech denoiser (#3328)
Refactor Dart API to check for nullptr (#3329)
Use onnxruntime v1.23.2 for Android (#3330)
Refactor ZipVoice TTS to support callback (#3332)
Add C and CXX API examples for ZipVoice (#3333)
Add Go API examples for ZipVoice (#3334)
Add Python API examples for ZipVoice TTS (#3335)
Add WebAssembly example for ZipVoice (#3337)
Update WebAssembly download progress text to show MB (#3338)
Add WebAssembly example for PocketTTS (#3340)
Add JavaScript (WebAssembly) example for ZipVoice TTS (#3341)
Add JavaScript (node-addon) example for ZipVoice TTS (#3342)
Add JavaScript playback examples for Pocket and Supertonic TTS (#3343)
Add Kotlin and Java API for ZipVoice models (#3344)
Add C# API examples for ZipVoice models (#3345)
Add Swift API examples for ZipVoice models (#3346)
Add Dart API examples for ZipVoice models (#3347)
Add Rust API example for online punctuation (#3348)
Add fcitx5-vinput to projects using sherpa-onnx (#3350)
Add Pascal API examples for ZipVoice models (#3351)
Add Rust API examples for ZipVoice models (#3352)
Add SetOption/GetOption/HasOption Kotlin bindings (#3354)
Fix building Python wheels for Windows (#3355)
Fix OHOS APIs for TTS and ASR (#3356)
Add HarmonyOS APIs for online punctuation (#3357)
Add HarmonyOS APIs for offline punctuation (#3359)

1.12.29

Add Supertonic TTS support (#3094)
Upload supertonic tts models (#3263)
Add Python API examples for Supertonic TTS (#3264)
Support dynamic decoder layers in canary model runtime (#3268)
Add CXX API for Supertonic TTS (#3280)
Add C# API for Supertonic TTS (#3283)
Add Go API for Supertonic TTS (#3284)
Add Rust API for Supertonic TTS (#3285)
Add Swift API for Supertonic TTS (#3286)
Add JavaScript API for Supertonic TTS (#3287)
Add Dart API for Supertonic TTS (#3288)
Add Java and Kotlin API for Supertonic TTS (#3289)
Add Pascal API and example for Supertonic TTS (#3290)
Publish pdb files for Debug build on Windows (#3252)
Fix memory leak in WebAssembly for TTS (#3259)
Refactor WebAssembly TTS API (#3260)

1.12.28

Add C++ runtime support for Moonshine v2 (#3232)
Export Moonshine v2 models to sherpa-onnx (#3234)
Update Python APIs for Moonshine v2 models (#3235)
Add Kotlin and Java APIs for Moonshine v2 models (#3237)
Add C and C++ API for Moonshine v2 models (#3238)
Add Swift API for Moonshine v2 models (#3240)
Add JavaScript API (WebAssembly) for Moonshine v2 models (#3241)
Add JavaScript API (node-addon) for Moonshine v2 models (#3242)
Add C# API for Moonshine v2 (#3243)
Add Go API for Moonshine v2 (#3244)
Add Dart API for Moonshine v2 (#3245)
Add Rust API for Moonshine v2 (#3247)
Add Pascal API for Moonshine v2 (#3248)
Build huggingface spaces for Moonshine v2 with WebAssembly (#3249)

1.12.27

Add Rust API for VAD (#3213)
Replace deprecated std::istrstream with std::istringstream (#3214)
Replace deprecated std::wstring_convert with manual UTF-8 codec (#3215)
Fix CMake warnings: optional feature message level + policy version minimum (#3217)
Upload FireRedASR2 CTC model (#3220)
Bump hclust-cpp to 2026-02-25 release and modernize FetchContent (#3216)
Support FireRedASR CTC models (#3221)
Update language bindings for FireRedASR CTC models (#3224)

1.12.26

Fix CI (#3192)
Fix heap-buffer-overflow in ReadWaveImpl when data chunk size is odd (#3195)
[PocketTTS] Add seed support and voice embedding caching for consiste… (#3189)
Feat/pocket tts cache config (#3200)
3197: enhanced java binding for voice_embedding_cache_capacity (#3201)
Dart, flutter, go, c-api binding and example (#3202)
Begin to add Rust API (#3203)
Add Rust API for streaming speech recognition (#3204)
Add a real-time speech recognition example with microphone for Rust API. (#3205)
Add Rust API for offline ASR (#3207)
feat: Add PocketTTS cache & seed support to Node.js Addon and WASM APIs (#3206)
Add more examples for offline ASR models with Rust API. (#3209)
Update C#/Swift/Pascal API for PocketTTS' VoiceEmbeddingCacheCapacity. (#3211)

1.12.25

Fix building without tts (#3168)
Fix publishing npm packages for Linux aarch64 and wheels for macOS (#3169)
Export PocketTTS for earlier versions of onnxruntime (#3170)
Fix building wheels for Python 3.14 (#3182)
Update Eigen from 3.4.0 to 3.4.1 (#3178)
fix(flutter): add missing FFI struct fields for OfflineWhisper and FunAsrNano (#3186)
Fix building wheels for Windows (#3187)

1.12.24

Fix UnicodeDecodeError when accessing tokens in FunASR-nano tokenizer (#3058)
Use more jobs for building VAD ASR APKs (#3068)
Add export CGO_ENABLED=1 to all Go examples. (#3069)
Support BPE tokenizer (#3078)
Add C++ runtime and Python support for PocketTTS streaming voice cloning on CPU (#3083)
Refactor addon loading logic and add static import for platform-specific binaries (#3075)
Update C++ binary for PocketTTS (#3087)
Add Python API examples for PocketTTS (#3088)
Limit text length for PocketTTS. (#3089)
Add CI for PocketTTS. (#3090)
Fix Python CI (#3091)
Fix build error (#3096)
Add Java and Kotlin API for PocketTTS (#3095)
Refactor JNI to remove casting. (#3103)
Refactor JNI (#3107)
Support MD and MT MSVC runtime libraries (CRT) for Windows x64 static build (#3111)
Fix MSVC CRT for Windows x64 shared build. (#3114)
Fix MSVC CRT for Windows arm64 (#3117)
Fix MSVC CRT for Windows x86 (#3118)
Refactor CI for Windows x64 (#3119)
Fix CI for Windows x64 (#3123)
Upload WenetSpeech-Wu u2pp ASR models. (#3125)
Add TTS generation with GenerationConfig params C API (#3115)
Refactor TTS C API (#3127)
Add CXX API for PocketTTS (#3128)
Add Swift API for PocketTTS (#3129)
fix(android): Optimize UI updates and remove dead code in MainActivity (#3130)
Change RPATH for sherpa-onnx.node (#3131)
Add async js API for tts generate. (#3133)
fix(android): Initialize models in background coroutine to avoid UI blocking (#3132)
Add hotword support for FunASR-Nano (#3122)
Provide async JS API to create TTS. (#3134)
feat: Add a WebAssembly Text-to-Speech (TTS) demo with UI and worker-based audio generation using sherpa-onnx. (#3120)
feat: add support for Meta Omnilingual ASR v2 models (#3138)
Export omnilingualASR v2 (#3140)
feat: Add ys_log_probs to NeMo transducer greedy search decoder (#3105)
Add modified beam search and hotwords support for NeMo transducer models (#3077)
Fix ORT Value default construction for Android build (#3141)
Whisper timestamps (#2945)
Add node-addon JavaScript API for PocketTTS (#3139)
Update lifecycle-runtime-ktx version to 2.5.1 (#3143)
Enable return value in callback for TTS in Go API. (#3150)
Refactor Go API for TTS (#3151)
Export models for CANN 8.1 (#3152)
Add Go API for PocketTTS (#3153)
Export models for CANN 8.3 and 8.5 (#3156)
Add https://huggingface.co/alphacep/vosk-model-small-streaming-bn (#3158)
Add Pascal API for Pocket TTS (#3157)
Upload Vietnamese ASR models (#3159)
Refactor Pascal API (#3160)
Add C# API for PocketTTS. (#3162)
Add JavaScript (WebAssembly) API for PocketTTS (#3163)
Add Dart API for PocketTTS (#3164)
Add GeneratedAudio ToBuffer() to the Go API (#3136)
fix: resolve high vulnerability python.lang.security.audit.dangerous-system-call-tainted-env-args.dangerous-system-call-tainted-env-args (#3155)
Fix various language bindings (#3166)

1.12.23

Node addon api jsdoc (#3005)
Add JavaScript async api for OfflineRecognizer decodeStream. (#3049)
Support creating OfflineRecognizer asynchronously in JavaScript. (#3050)
Fix uploading files to huggingface (#3054)
Add Dart API for FunASR Nano (#3055)
Fix uploading APK files (#3056)

1.12.22

Update wav files for FunASR Nano (#3038)
cmake: fix sha256 for onnxruntime linux x86_64 gpu package (#3042)
Fix checking funasr nano tokenizer on Windows (#3043)
Support nemotron-speech-streaming-en-0.6b (#3044)
Build APK for nemotron-speech-streaming-en-0.6b (#3045)
Fix building Linux arm wheels (#3047)

1.12.21

Fix publishing NPM packages (#2909)
Refactor ZipVoice C++ code (#2911)
Export more zipformer ctc models to qnn (#2921)
[KWS] Add phone+ppinyin tokenization with lexicon support (for zh-en model) (#2922)
Export Paraformer ASR models to QNN (#2925)
Add Transpose for a 2-D matrix. (#2926)
Optimize computation with Eigen. (#2928)
Add C++ runtime for Paraformer ASR models with Qualcomm NPU using QNN (#2931)
Add Android demo for Paraformer ASR with Qualcomm NPU. (#2932)
Export Google MedASR to sherpa-onnx (#2934)
Add C++ runtime and Python API for Google MedASR models (#2935)
Fix creating a view of an Ort::Value tensor. (#2939)
Add C and CXX API for Google MedASR model (#2946)
[TTS Engine] Fix engine speed (#2895)
Add Swift API for Google MedASR model (#2947)
Add C# API for Google MedASR model (#2949)
Add Pascal API for Google MedASR model (#2950)
Add Go API for Google MedASR model (#2952)
Add Dart API for Google MedASR model (#2953)
Add JavaScript API (WebAssembly) for Google MedASR model (#2954)
Add JavaScript API (node-addon) for Google MedASR model (#2955)
Add Kotlin and Java API for Google MedASR model (#2956)
Add FunASR-Nano with LLM support (#2936)
Fix building for Windows (#2964)
Fix building for HarmonyOS (#2972)
[feature] add FunASRNano config into golang api (#2974)
Update FunASR-Nano CTC model (#2978)
[opt] Optimize the free-pointer function in Go API (#2975)
[feature] use jinja2 to generate sherpa-onnx-go lib (#2976)
Reformat Go API code (#2979)
Fix building for onnxruntime >= 1.11.0 (#2981)
Export Whisper to RK NPU (#2983)
Test Whisper on Ascend NPU using ACL Python API (#2986)
FunASR-nano: switch to unified KV-cache LLM (#2995)
Remove filesystem header (#2998)
Fix(csrc/melotts): Fix V-words pronunciation on MeloTTS_en (#3002)
Upload FunASR Nano ASR models with LLM (#3003)
Fix download test wav files (#3004)
Use onnxruntime 1.23.2 for Windows (#3007)
Add CI to export Whisper models to Ascend NPU (#3008)
Add C++ runtime for Whisper with Ascend NPU (#3009)
Use onnxruntime v1.23.2 for Linux aarch64 (#3016)
Use onnxruntime v1.23.2 for Linux arm (#3017)
Start to switch from onnxruntime 1.17.1 to v1.23.2 (#2993)
Use onnxruntime 1.23.2 for Linux x64 + NVIDIA GPU (#3018)
Update CI test for FunASR Nano C/C++ API (#3021)
[feature] add FunASRNano Swift api (#2994)
swift: add FunASR nano Swift API (#3022)
Add Go API test for FunASR Nano (#3025)
Add JavaScript API for FunASR Nano (node-addon) (#3026)
Add Pascal API for FunASR Nano (#3029)
Add C# API for FunASR Nano (#3031)
Add Kotlin and Java API for FunASR Nano models (#3030)
Fire-Red-ASR: enable ORT I/O binding for encoder/decoder (#3011)
whisper: improve ORT IO binding execution (#3023)
Add JavaScript API for FunASR Nano (WebAssembly) (#3027)
Fix CI test for nodejs (#3033)

1.12.20

Refactor axcl examples. (#2867)
Update README to include Axera NPU (#2870)
Add CI for Axera NPU (#2872)
Refactor sense voice impl (#2873)
Refactor Paraformer Impl (#2874)
Remove unused lock file (#2875)
Load QNN context binary for faster startup (#2877)
Export models to Ascend 910B4 (#2878)
Optimize streaming output results when VAD does not detect human voice for a long time (#2876)
Build APKs for MatchaTTS Chinese+English (#2882)
Publish WASM spaces for MatchaTTS Chinese+English model (#2885)
Add script for testing zipvoice onnx models (#2887)
upload zipvoice onnx models (#2890)
Remove cppinyin from zipvoice (#2892)
Fix building errors (#2893)
Use a shorter name for Zipvoice models. (#2894)
Export GigaAM v3 to sherpa-onnx (#2901)
Fix typos in URL (#2905)
Support Fun-ASR-Nano-2512 (#2906)

1.12.19

Fix building without TTS for C API (#2838)
[ZipVoice] Fix english tokenization error (#2834)
Add simulated streaming ASR Python example for Paraformer (#2839)
Fix building JNI for Windows (#2840)
Avoid NaN in NeMo speaker embedding models. (#2844)
Add spacemit ort ep for spacemit riscv cpus (#2837)
Add token-level confidence scores (ys_probs) for offline transducer models (#2843)
Fix token log probabilities in offline transducer modified beam search decoder (#2846)
Support AXERA ax630, ax650, and axcl backends. (#2849)
Refactor axera npu examples (#2850)
Fix matcha tts zh-en model (#2851)
Fix the English part for Matcha TTS. (#2853)
Refactor text-utils (#2855)
Fix matcha tts (#2856)
Add a space between English words for Matcha zh-en TTS (#2858)
Fix punctuation in matcha zh-en tts (#2859)
Upload matcha tts zh-en model (#2865)
Fix the discrepancy with the Silero VAD isSpeech logic (#2863)

1.12.18

Fix building wheels (#2786)
export omniASR_CTC_1B (#2788)
Add C++ QNN support for SenseVoice (#2793)
Export models for CANN toolkit 7.0 (#2795)
Support hotwords with byte level bpe (#2802)
Add Android demo with QNN (Qualcomm NPU) for SenseVoice ASR (#2803)
Export zipformer ctc models to QNN (#2815)
Add spaces between English words for Homophone replacer. (#2817)
Add C++ QNN support for Zipformer CTC models. (#2809)
Limit symbol visibility in the shared libraries (#2822)
Fix warnings for initializing tts lexicon. (#2823)
Export zipformer ctc models to Ascend NPU (#2824)
Refactor scripts for exporting models to Ascend NPU. (#2825)
Add C++ support for Zipformer CTC on Ascend NPU (#2826)
Fix segfault when non-wav file is passed to ReadWave (#2821)
Avoid calling rknn_dup_context(). (#2828)
Add C++ support for Paraformer with RK NPU (#2829)
Update README to include NPU support (#2830)
Support running whisper large v3 with external data weight (#2807)

1.12.17

Fix releasing

1.12.16

Support exporting SenseVoice and Paraformer to Ascend 310P3 NPU. (#2716)
Demo for non-streaming VAD + ASR with Flutter (#2705)
Fix crashing in Android KWS demo (#2719)
Add C++ API with ACL C API for SenseVoice ASR on Ascend NPU (#2728)
Allow up to 30 seconds ASR for sense-voice on Ascend NPU (#2729)
Fix compilation error for Ascend NPU (#2731)
docs: fix Flutter TTS macOS mirror link targets; fix speech-enhancement link typo (#2723)
Export models for Ascend910B2 (#2740)
Add C++ runtime for Paraformer on Ascend NPU. (#2741)
Expose ys probs to JNI, Kotlin and Java API (#2736)
Add CI for Ascend NPU (#2743)
Export models for CANN 8.2 (#2745)
Fix validating model config for Paraformer. (#2749)
Add cxx API for online punctuation models (#2759)
Export sense voice to qnn (#2760)
Export models to Ascend 910B3 (#2761)
Support MatchaTTS models for Chinese+English. (#2763)
Fix zipvoice. (#2764)
Support passing multiple lexicon files for matcha tts models. (#2765)
Begin to add qnn C API (#2766)
Add QnnConfig. (#2768)
Fix missing includes. (#2769)
Begin to export omnilingual-asr to sherpa-onnx (#2770)
Add C++ and Python API for Omnilingual ASR models. (#2772)
Add C API for Omnilingual ASR CTC models (#2773)
Add CXX API for Omnilingual ASR CTC models (#2774)
Add C# API for Omnilingual ASR CTC models (#2775)
Add Swift API for Omnilingual ASR CTC models (#2776)
Add Go API for Omnilingual ASR CTC models (#2778)
Add JavaScript (node-addon) API for Omnilingual ASR CTC models (#2780)
Add Dart API for Omnilingual ASR CTC models (#2779)
Add JavaScript (WebAssembly) API for Omnilingual ASR CTC models (#2781)
Add Pascal API for Omnilingual ASR CTC models (#2782)
Add Kotlin and Java API for Omnilingual ASR CTC models (#2783)

1.12.15

Exposing online punctuation model support in node-addon-api (#2609)
Fix building wheels (#2619)
Export one more Piper Arabic TTS model (#2623)
fix: hot update language for sensevoice (#2627)
Add C API and Go API for Zipvoice (#2628)
Add CI tests for Zipvoice Go API (#2630)
Remove hardcoded dithering value in NeMo transducer recognizer (#2639)
Reduce verbose output about reading lexicon for TTS (#2648)
Add Parakeet TDT model for generating subtitles (#2649)
Add more Piper TTS models (#2651)
Add CXX API for audio tagging (#2652)
Add C# API for audio tagging (#2653)
Support KWS + RKNN. (#2190)
Support https://github.com/ASLP-lab/WenetSpeech-Chuan (#2656)
Fix building for android (#2657)
fix ios build script (#2645)
Update kaldi-native-fbank (#2659)
Add missing python class definitions for builds without TTS support (#2660)
Remove jieba from kokoro and matcha tts. (#2662)
add flet_sherpa_onnx in readme (#2663)
Remove cppjieba (#2664)
Add phrase matcher to merge words into phrases for TTS. (#2668)
Limit number of tokens per sentence in MatchaTTS. (#2671)
Update README to include a ROS2 project using sherpa-onnx (#2672)
Fix building Flutter APPs (#2673)
Export Paraformer to RKNN (#2689)
Update README.md to add achatbot-go to projects using sherpa-onnx (#2691)
Add CI to export Paraformer to RKNN (#2692)
Support MatchaTTS with English and Chinese (#2695)
Export Paraformer ASR models from FunASR to Ascend NPU 910B (#2697)
Update README to include Ascend NPU (#2698)
Fix WASM (JS) after adding zipvoice. (#2702)
Export SenseVoice ASR models to Ascend NPU 910B (#2707)
Fix building for various language bindings after adding zipvoice (#2709)

1.12.14

Fix setting rknn core mask (#2594)
Add Dart API for spoken language identification (#2596)
Add CI tests for dart spoken language identification example (#2598)
Provide pre-compiled sherpa-onnx libs/binaries for CUDA 12.x + onnxruntime 1.22.0 (#2599)
Provide pre-compiled whls for cuda 12.x on Linux x64 and Windows x64 (#2601)
Fix TDT decoding for NeMo TDT transducers (#2606)
Add a C++ example for simulated streaming ASR (#2607)

1.12.13

Fix initializing symbol table for OnlineRecognizer. (#2590)
Support RK NPU for SenseVoice non-streaming ASR models (#2589)
Upload RKNN models for sense-voice (#2592)

1.12.12

Fix building for risc-v (#2549)
Fix using sherpa-onnx as a cmake sub-project. (#2550)
Update kaldifst and kaldi-decoder (#2551)
Support armv8l in Java API (#2556)
Disable loading libs from jar on Android. (#2557)
Fix cantonese vits tts (#2558)
Avoid appending blanks for Cantonese vits tts. (#2559)
Add hint for loading model files from SD card on Android. (#2564)
Update README to include https://github.com/Mentra-Community/MentraOS (#2565)
Export models from https://github.com/voicekit-team/T-one to sherpa-onnx (#2571)
Add C++ and Python support for T-one streaming Russian ASR models (#2575)
Add various language bindings for streaming T-one Russian ASR models (#2576)
Fix the missing online punctuation in android aar (#2577)
Export KittenTTS mini v0.1 to sherpa-onnx (#2578)
Upload new sense-voice models (#2580)
Export ASLP-lab/WSYue-ASR/tree/main/u2pp_conformer_yue to sherpa-onnx (#2582)
Add various language bindings for Wenet non-streaming CTC models (#2584)

1.12.11

Add two more Piper tts models (#2525)
Generate tts samples for MatchaTTS (English). (#2527)
Fix releasing go packages (#2529)
Add license info about tts models from OpenVoiceOS (#2530)
Support BPE models with byte fallback. (#2531)
Simplify the usage of our non-Android Java API (#2533)
Fix wasm for kws (#2535)
Add one more German tts model from OpenVoiceOS. (#2536)
Fix uploading win32 libs to huggingface (#2537)
Add Zipvoice (#2487)
Fix c api (#2545)
Fix linking (#2546)

1.12.10

Add VOSK streaming Russian ASR models and Kroko streaming German ASR models (#2502)
Refactor CI tests (#2504)
Update APK versions (#2505)
Export whisper distil-large-v3 and distil-large-v3.5 to sherpa-onnx (#2506)
Support specifying pronunciations of phrases in Chinese TTS. (#2507)
fix(flutter): fix unicode problem in windows path (#2508)
feat: add punctuation C++ API (#2510)
Fix Ctrl+C possibly leading to a core dump (#2511)
Add kitten tts nano v0.2 (#2512)
Scripts to generate tts samples (#2513)
Add tdt duration to APIs (#2514)
Support 16KB page size for Android (#2520)
Split sherpa-onnx Python package (#2521)
Fix punctuation handling in kokoro tts (#2522)

1.12.9

Add more piper tts models (#2480)
Fix ASR for UE (#2483)
Push to Maven Central (#2463)
Specify ABIs when building APKs (#2488)
Add more debug info for vits tts (#2491)
Add Swift API for computing speaker embeddings (#2492)
Alex/feat add python example (#2490)
Support TDT transducer decoding (#2495)
Fix java test (#2496)
Refactor Swift API (#2493)
add TtsReader app to README.md (#2498)
Export https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3 to sherpa-onnx (#2500)
Fix building apk (#2499)

1.12.8

Expose JNI to compute probability of chunk in VAD (#2433)
Add https://huggingface.co/Banafo/Kroko-ASR (#2453)
Add APIs for Online NeMo CTC models (#2454)
Export https://github.com/KittenML/KittenTTS to sherpa-onnx (#2456)
Fix punctuation in kokoro tts. (#2458)
Limit number of tokens in fire red asr decoding. (#2459)
Add C++ runtime for kitten-tts (#2460)
Add Kotlin and Java API for KittenTTS (#2461)
Add Android TTS Engine APK for KittenTTS (#2465)
Add Python API for KittenTTS. (#2466)
Add C API for KittenTTS (#2467)
Add CXX API for KittenTTS (#2469)
Add JavaScript API (node-addon) for KittenTTS (#2470)
Add JavaScript API (WebAssembly) for KittenTTS (#2471)
Add Pascal API for KittenTTS (#2474)
Add Dart API for KittenTTS (#2475)
Add Swift API for KittenTTS (#2476)
Add C# API for KittenTTS (#2477)
Add Go API for KittenTTS (#2478)

1.12.7

Support Portuguese and German ASR models from NeMo (#2394)
Support returning the current speech segment for VAD. (#2397)
Add more Piper TTS models for Polish (#2403)
Support VAD+ASR for WearOS (#2404)
Support testing long audio with a streaming model and VAD (#2405)
Fix typo in sherpa-onnx-vad-with-online-asr.cc (#2407)
Add tail padding for sherpa-onnx-vad-with-online-asr (#2408)
Add more French TTS models (#2424)
Add more piper tts models (#2425)
Implement max_symbols_per_frame for GigaAM2 accurate decoding since model uses char tokens instead of BPE. (#2423)
Fix GigaAM transducer encoder output length data type (#2426)
Add friendly log messages for Android and HarmonyOS TTS users. (#2427)
Fix setGraph in OnlineCtcFstDecoderConfig Java API (#2411)

1.12.6

Support silero-vad v4 exported by k2-fsa (#2372)
Add C++ and Python support for ten-vad (#2377)
Fix compile errors for Linux (#2378)
Add C API for ten-vad (#2379)
Add CXX API examples for ten-vad. (#2380)
Add JavaScript (WebAssembly) API for ten-vad (#2382)
Add JavaScript (node-addon) API for ten-vad (#2383)
Add Go API for ten-vad (#2384)
Add C# API for ten-vad (#2385)
Add Dart API for ten-vad (#2386)
Add Swift API for ten-vad (#2387)
Add Pascal API for ten-vad (#2388)
Add Java/Kotlin API and Android support for ten-vad (#2389)

1.12.5

Fix typo CMAKE_EXECUTBLE_LINKER_FLAGS -> CMAKE_EXECUTABLE_LINKER_FLAGS (#2344)
Fix testing dart packages (#2345)
fix(canary): use dynamo export, single input_ids and avoid 0/1 specialization (#2348)
Fix TTS for Unreal Engine (#2349)
Update readme to include https://github.com/mawwalker/stt-server (#2350)
Add meta data to NeMo canary ONNX models (#2351)
Update README to include https://github.com/bbeyondllove/asr_server (#2353)
Add C++ runtime and Python API for NeMo Canary models (#2352)
Add C/CXX/JavaScript API for NeMo Canary models (#2357)
Add Java and Kotlin API for NeMo Canary models (#2359)
Upload fp16 onnx model files for FireRedASR (#2360)
Fix nemo feature normalization in test code (#2361)
Refactor exporting NeMo models (#2362)
Add LODR support to online and offline recognizers (#2026)
Add CXX examples for NeMo TDT ASR. (#2363)
Add Pascal/Go/C#/Dart API for NeMo Canary ASR models (#2367)

1.12.4

Refactor release scripts. (#2323)
Add TTS engine APKs for more models (#2327)
Fix static link without tts (#2328)
Fix VAD+ASR C++ example. (#2335)
Add sherpa-onnx-streaming-zipformer-zh-int8-2025-06-30 to android ASR apk (#2336)
Support non-streaming zipformer CTC ASR models (#2340)
Support linux aarch64 for Dart and Flutter (#2342)

1.12.3

Show CMake debug information. (#2316)
Remove portaudio-go in Go API examples. (#2317)
Support Zipformer CTC ASR with whisper features. (#2319)
Support Zipformer transducer ASR with whisper features. (#2321)

1.12.2

Fix CI for windows (#2279)
Add jar for Java 24. (#2280)
Add Python API for source separation (#2283)
Add link to huggingface space for source separation. (#2284)
Fix isspace on windows in debug build (#2042)
Update wasm/vad-asr/assets/README.md for clarity (#2297)
Update TTS Engine APK to support multi-lang (#2294)
Add scripts for exporting Piper TTS models to sherpa-onnx (#2299)
Update sherpa-onnx-shared.pc.in (#2300)
Fixes #2172 (#2301)
Refactor kokoro export (#2302)
Fix building for Pascal (#2305)
Support extra languages in multi-lang kokoro tts (#2303)
Update readme to include BreezeApp from MediaTek Research. (#2313)
Add API to get version information (#2309)

1.12.1

Use jlong explicitly in jni. (#2229)
Fix building RKNN wheels (#2233)
Fix publishing binaries for RKNN (#2234)
Export spleeter model to onnx for source separation (#2237)
Add C++ runtime for spleeter source separation (#2242)
Add include headers for ANDROID_API,OHOS (#2251)
JAVA-API: Manual Library Loading Support for Restricted Environments (#2253)
Build APK with replace.fst (#2254)
repair rknn wheels (#2257)
Update kaldi-native-fbank. (#2259)
Fix building sherpa-onnx (#2262)
Fix building MFC examples (#2263)
Add UVR models for source separation. (#2266)
move portaudio common record code to microphone (#2264)
fixed mfc build error (#2267)
Add C++ support for UVR models (#2269)
Export nvidia/canary-180m-flash to sherpa-onnx (#2272)
Update utils.dart (#2275)
Fix rknn for multi-threads (#2274)
Fix 32-bit arm CI (#2276)

1.12.0

Fix building wheels for macOS (#2192)
Show verbose logs in homophone replacer (#2194)
Fix displaying streaming speech recognition results for Python. (#2196)
Add real-time speech recognition example for SenseVoice. (#2197)
docs: add Open-XiaoAI KWS project (#2198)
Add C++ example for streaming ASR with SenseVoice. (#2199)
Add C++ example for real-time ASR with nvidia/parakeet-tdt-0.6b-v2. (#2201)
Add a link to YouTube video including sherpa-onnx. (#2202)
Support sending is_eof for online websocket server. (#2204)
Add alsa-based streaming ASR example for sense voice. (#2207)
Support homophone replacer in Android asr demo. (#2210)
Add Go implementation of the TTS generation callback (#2213)
Add Android demo for real-time ASR with non-streaming ASR models. (#2214)
Expose dither for JNI (#2215)
Add nodejs example for parakeet-tdt-0.6b-v2. (#2219)
Add script to build APK for simulated-streaming-asr. (#2220)

1.11.5 #

export parakeet-tdt-0.6b-v2 to sherpa-onnx (#2180)
Add C++ runtime for parakeet-tdt-0.6b-v2. (#2181)
Avoid NaN in feature normalization. (#2186)

1.11.4 #

Disable strict hotword matching mode for offline transducer (#1837)
Comment refinement: Add note about vocoder file for matcha TTS config (#2106)
Fix a typo in the JNI for Android. (#2108)
Generate subtitles with FireRedAsr models (#2112)
Use manylinux_2_28_x86_64 to build linux gpu for sherpa-onnx (#2123)
Support running sherpa-onnx with RK NPU on Android (#2124)
Fix building for HarmonyOS (#2125)
cmake build, configurable from env (#2115)
Expose dither in python API (#2127)
Add support for GigaAM-CTC-v2 (#2135)
Support Giga AM transducer V2 (#2136)
Export kokoro 1.0 int8 models (#2137)
Upload more onnx ASR models (#2141)
Fix building for open harmonyOS (#2142)
online-transducer: reset the encoder together with 2 previous output symbols (non-blank) (#2129)
Fix punctuation for kokoro tts 1.1-zh. (#2146)
Fix setting OnlineModelConfig in Java API (#2147)
Support decoding multiple streams in Java API. (#2149)
Support replacing homophonic phrases (#2153)
Add C and CXX API for homophone replacer (#2156)
Add JavaScript API (WASM) for homophone replacer (#2157)
Add JavaScript API (node-addon) for homophone replacer (#2158)
Fix building without TTS (#2159)
Add homophone replacer example for Python API. (#2161)
More fix for building without tts (#2162)
Add Swift API for homophone replacer. (#2164)
Add C# API for homophone replacer (#2165)
Add Kotlin and Java API for homophone replacer (#2166)
Add Dart API for homophone replacer (#2167)
Add Go API for homophone replacer (#2168)

1.11.3 #

fix vits dict dir config (#2036)
fix case (#2037)
Fix building wheels for RKNN (#2041)
Change scale factor to 32767 (#2056)
Fix length scale for kokoro tts (#2060)
Allow building repository as CMake subdirectory (#2059)
Export silero_vad v4 to RKNN (#2067)
fix dml with preinstall ort (#2066)
Fix building aar to include speech denoiser (#2069)
Add CXX API for VAD (#2077)
Add C++ runtime for silero_vad with RKNN (#2078)
Refactor rknn code (#2079)
Fix building for android (#2081)
Add C++ and Python API for Dolphin CTC models (#2085)
Add Kotlin and Java API for Dolphin CTC models (#2086)
Add C and CXX API for Dolphin CTC models (#2088)
Preserve more context after endpointing in transducer (#2061)
Add C# API for Dolphin CTC models (#2089)
Add Go API for Dolphin CTC models (#2090)
Add Swift API for Dolphin CTC models (#2091)
Add JavaScript (WebAssembly) API for Dolphin CTC models (#2093)
Add JavaScript (node-addon) API for Dolphin CTC models (#2094)
Add Dart API for Dolphin CTC models (#2095)
Add Pascal API for Dolphin CTC models (#2096)

1.11.2 #

Fix CI (#2016)
Publish jar for more java versions (#2017)
add alsa example for vad+offline asr (#2020)
Support cuda12 and cudnn8 for Linux aarch64. (#2021)
Update README to include more projects using sherpa-onnx (#2022)
Fix a bug in vad.reset() (#2023)
Fix Matcha + vocos for Android (#2024)
Fix crash in Android tts engine demo. (#2029)
Fix build script: add 'cd build' after 'mkdir build' to ensure the correct working directory for CMake (#2033)
fix static linking (#2032)

1.11.1 #

Export vocos to sherpa-onnx (#2012)
Add C++ runtime for vocos (#2014)

1.11.0 #

Fix building wheels for Python 3.7 (#1933)
Add Kotlin and Java API for online punctuation models (#1936)
Add Kokoro v1.1-zh (#1942)
Support RKNN for Zipformer CTC models. (#1948)
Add transducer modified_beam_search for RKNN. (#1949)
Update README to include projects that are using sherpa-onnx (#1956)
Limit number of tokens per second for whisper. (#1958)
Ebranchformer (#1951)
Test using sherpa-onnx as a cmake subproject (#1961)
Add C++ demo for VAD+non-streaming ASR (#1964)
Export gtcrn models to sherpa-onnx (#1975)
c-api add wave write to buffer. (#1962)
add SherpaOnnxOfflineRecognizerSetConfig binding for go, and optimize the new/free for C.struct_SherpaOnnxOfflineRecognizerConfig ptr (#1976)
Add C++ runtime for speech enhancement GTCRN models (#1977)
Add Python API for speech enhancement GTCRN models (#1978)
Add C API for speech enhancement GTCRN models (#1984)
Add CXX API for speech enhancement GTCRN models (#1986)
Add Swift API for speech enhancement GTCRN models (#1989)
Add C# API for speech enhancement GTCRN models (#1990)
Add Go API for speech enhancement GTCRN models (#1991)
Add Pascal API for speech enhancement GTCRN models (#1992)
Add Dart API for speech enhancement GTCRN models (#1993)
Add JavaScript (node-addon) API for speech enhancement GTCRN models (#1996)
Add WebAssembly (WASM) for speech enhancement GTCRN models (#2002)
Add JavaScript API (wasm) for speech enhancement GTCRN models (#2007)
Add Kotlin API for speech enhancement GTCRN models (#2008)
Add Java API for speech enhancement GTCRN models (#2009)

1.10.46 #

Fix kokoro lexicon. (#1886)
Fix lack of sense_voice support in speaker-identification-with-vad-non-streaming-asr.py (#1884)
Fix generating Chinese lexicon for Kokoro TTS 1.0 (#1888)
Reduce vad-whisper-c-api example code. (#1891)
JNI Exception Handling (#1452)
Fix #1901: UnicodeEncodeError running export_bpe_vocab.py (#1902)
Fix publishing pre-built windows libraries (#1905)
Fixing Whisper Model Token Normalization (#1904)
feat: add mic example for better compatibility (#1909)
Add onnxruntime 1.18.1 for Linux aarch64 GPU (#1914)
Add C++ API for streaming zipformer ASR on RK NPU (#1908)
Change [1<<28] to [1<<10] to fix build issues on GOARCH=386, where [1<<28] is too large (#1916)
Flutter Config toJson/fromJson (#1893)
Fix publishing linux pre-built artifacts (#1919)
go.mod set to use go 1.17, and use unsafe.Slice to optimize the code (#1920)
fix: AddPunct panic for Go (#1921)
Fix publishing macos pre-built artifacts (#1922)
Minor fixes for rknn (#1925)
Build wheels for rknn linux aarch64 (#1928)

1.10.45 #

Fix bug: creating the Go instance succeeded even though creating the C struct failed (#1860)
fixed typo in RTF calculations (#1861)
Export FireRedASR to sherpa-onnx. (#1865)
Add C++ and Python API for FireRedASR AED models (#1867)
Add Kotlin and Java API for FireRedAsr AED model (#1870)
Add C API for FireRedAsr AED model. (#1871)
Add CXX API for FireRedAsr (#1872)
Add JavaScript API (node-addon) for FireRedAsr (#1873)
Add JavaScript API (WebAssembly) for FireRedAsr model. (#1874)
Add C# API for FireRedAsr Model (#1875)
Add Swift API for FireRedAsr AED Model (#1876)
Add Dart API for FireRedAsr AED Model (#1877)
Add Go API for FireRedAsr AED Model (#1879)
Add Pascal API for FireRedAsr AED Model (#1880)

1.10.44 #

Export MatchaTTS fa-en model to sherpa-onnx (#1832)
Add C++ support for MatchaTTS models not from icefall. (#1834)
OfflineRecognizer supports create stream with hotwords (#1833)
Add PengChengStarling models to sherpa-onnx (#1835)
Support specifying voice in espeak-ng for kokoro tts models. (#1836)
Fix: make sherpa_onnx_loge print when in debug mode (#1838)
Add Go API for audio tagging (#1840)
Fix CI (#1841)
Update readme to contain links for pre-built Apps (#1853)
Modify the model used (#1855)
Flutter OnlinePunctuation (#1854)
Fix splitting text by languages for kokoro tts. (#1849)

1.10.43 #

Add MFC example for Kokoro TTS 1.0 (#1815)
Update sherpa-onnx-tts.js: VitsModelConfig.model can be none (#1817)
Fix passing gb2312 encoded strings to tts on Windows (#1819)
Support scaling the duration of a pause in TTS. (#1820)
Fix building wheels for linux aarch64. (#1821)
Fix CI for Linux aarch64. (#1822)

1.10.42 #

Fix publishing wheels (#1746)
Update README to include https://github.com/xinhecuican/QSmartAssistant (#1755)
Add Kokoro TTS to MFC examples (#1760)
Refactor node-addon C++ code. (#1768)
Add keyword spotter C API for HarmonyOS (#1769)
Add ArkTS API for Keyword spotting. (#1775)
Add Flutter example for Kokoro TTS (#1776)
Initialize the audio session for iOS ASR example (#1786)
Fix: Prepend 0 to tokenization to prevent word skipping for Kokoro. (#1787)
Export Kokoro 1.0 to sherpa-onnx (#1788)
Add C++ and Python API for Kokoro 1.0 multilingual TTS model (#1795)
Add Java and Kotlin API for Kokoro TTS 1.0 (#1798)
Add Android demo for Kokoro TTS 1.0 (#1799)
Add C API for Kokoro TTS 1.0 (#1801)
Add CXX API for Kokoro TTS 1.0 (#1802)
Add Swift API for Kokoro TTS 1.0 (#1803)
Add Go API for Kokoro TTS 1.0 (#1804)
Add C# API for Kokoro TTS 1.0 (#1805)
Add Dart API for Kokoro TTS 1.0 (#1806)
Add Pascal API for Kokoro TTS 1.0 (#1807)
Add JavaScript API (node-addon) for Kokoro TTS 1.0 (#1808)
Add JavaScript API (WebAssembly) for Kokoro TTS 1.0 (#1809)
Add Flutter example for Kokoro TTS 1.0 (#1810)
Add iOS demo for Kokoro TTS 1.0 (#1812)
Add HarmonyOS demo for Kokoro TTS 1.0 (#1813)

1.10.41 #

Fix UI for Android TTS Engine. (#1735)
Add iOS TTS example for MatchaTTS (#1736)
Add iOS example for Kokoro TTS (#1737)
Fix dither binding in Pybind11 to ensure independence from high_freq in FeatureExtractorConfig (#1739)
Fix keyword spotting. (#1689)
Update readme to include https://github.com/hfyydd/sherpa-onnx-server (#1741)
Reduce vad-moonshine-c-api example code. (#1742)
Support Kokoro TTS for HarmonyOS. (#1743)

1.10.40 #

Fix building wheels (#1703)
Export kokoro to sherpa-onnx (#1713)
Add C++ and Python API for Kokoro TTS models. (#1715)
Add C API for Kokoro TTS models (#1717)
Fix style issues (#1718)
Add C# API for Kokoro TTS models (#1720)
Add Swift API for Kokoro TTS models (#1721)
Add Go API for Kokoro TTS models (#1722)
Add Dart API for Kokoro TTS models (#1723)
Add Pascal API for Kokoro TTS models (#1724)
Add JavaScript API (node-addon) for Kokoro TTS models (#1725)
Add JavaScript (WebAssembly) API for Kokoro TTS models. (#1726)
Add Kotlin and Java API for Kokoro TTS models (#1728)
Update README.md for KWS to not use git lfs. (#1729)

1.10.39 #

Fix building without TTS (#1691)
Add README for android libs. (#1693)
Fix: export-onnx.py (expected all tensors to be on the same device) (#1699)
Fix passing strings from C# to C. (#1701)

1.10.38 #

Fix initializing TTS in Python. (#1664)
Remove spaces after punctuation for TTS (#1666)
Add constructor fromPtr() for all Flutter classes with a factory constructor. (#1667)
Add Kotlin API for Matcha-TTS models. (#1668)
Support Matcha-TTS models using espeak-ng (#1672)
Add Java API for Matcha-TTS models. (#1673)
Avoid adding tail padding for VAD in generate-subtitles.py (#1674)
Add C API for MatchaTTS models (#1675)
Add CXX API for MatchaTTS models (#1676)
Add JavaScript API (node-addon-api) for MatchaTTS models. (#1677)
Add HarmonyOS examples for MatchaTTS. (#1678)
Upgraded to .NET 8 and made code style a little more internally consistent. (#1680)
Update workflows to use .NET 8.0 also. (#1681)
Add C# and JavaScript (wasm) API for MatchaTTS models (#1682)
Add Android demo for MatchaTTS models. (#1683)
Add Swift API for MatchaTTS models. (#1684)
Add Go API for MatchaTTS models (#1685)
Add Pascal API for MatchaTTS models. (#1686)
Add Dart API for MatchaTTS models (#1687)

1.10.37 #

Add new tts models for Latvian and Persian+English (#1644)
Add a byte-level BPE Chinese+English non-streaming zipformer model (#1645)
Support removing invalid utf-8 sequences. (#1648)
Add TeleSpeech CTC to non_streaming_server.py (#1649)
Fix building macOS libs (#1656)
Add Go API for Keyword spotting (#1662)
Add Swift online punctuation (#1661)
Add C++ runtime for Matcha-TTS (#1627)

1.10.36 #

Update AAR version in Android Java demo (#1618)
Support linking onnxruntime statically for Android (#1619)
Update readme to include Open-LLM-VTuber (#1622)
Rename maxNumStences to maxNumSentences (#1625)
Support using onnxruntime 1.16.0 with CUDA 11.4 on Jetson Orin NX (Linux arm64 GPU). (#1630)
Update readme to include jetson orin nx and nano b01 (#1631)
feat: add checksum action (#1632)
Support decoding with byte-level BPE (bbpe) models. (#1633)
feat: enable c api for android ci (#1635)
Update README.md (#1640)
SherpaOnnxVadAsr: Offload runSecondPass to background thread for improved real-time audio processing (#1638)
Fix GitHub actions. (#1642)

1.10.35 #

Add missing changes about speaker identification demo for HarmonyOS (#1612)
Provide sherpa-onnx.aar for Android (#1615)
Use aar in Android Java demo. (#1616)

1.10.34 #

Fix building node-addon package (#1598)
Update doc links for HarmonyOS (#1601)
Add on-device real-time ASR demo for HarmonyOS (#1606)
Add speaker identification APIs for HarmonyOS (#1607)
Add speaker identification demo for HarmonyOS (#1608)
Add speaker diarization API for HarmonyOS. (#1609)
Add speaker diarization demo for HarmonyOS (#1610)

1.10.33 #

Add non-streaming ASR support for HarmonyOS. (#1564)
Add streaming ASR support for HarmonyOS. (#1565)
Fix building for Android (#1568)
Publish sherpa_onnx.har for HarmonyOS (#1572)
Add VAD+ASR demo for HarmonyOS (#1573)
Fix publishing har packages for HarmonyOS (#1576)
Add CI to build HAPs for HarmonyOS (#1578)
Add microphone demo about VAD+ASR for HarmonyOS (#1581)
Fix getting microphone permission for HarmonyOS VAD+ASR example (#1582)
Add HarmonyOS support for text-to-speech. (#1584)
Fix: support both old and new websockets request headers format (#1588)
Add on-device text-to-speech (TTS) demo for HarmonyOS (#1590)

1.10.32 #

Support cross-compiling for HarmonyOS (#1553)
HarmonyOS support for VAD. (#1561)
Fix publishing flutter iOS app to appstore (#1563)

1.10.31 #

Publish pre-built wheels for Python 3.13 (#1485)
Publish pre-built macos xcframework (#1490)
Fix reading tokens.txt on Windows. (#1497)
Add two-pass ASR Android APKs for Moonshine models. (#1499)
Support building GPU-capable sherpa-onnx on Linux aarch64. (#1500)
Publish pre-built wheels with CUDA support for Linux aarch64. (#1507)
Export the English TTS model from MeloTTS (#1509)
Add Lazarus example for Moonshine models. (#1532)
Add isolate_tts demo (#1529)
Add WebAssembly example for VAD + Moonshine models. (#1535)
Add Android APK for streaming Paraformer ASR (#1538)
Support static build for windows arm64. (#1539)
Use xcframework for Flutter iOS plugin to support iOS simulators.

1.10.30 #

Fix building node-addon for Windows x86. (#1469)
Begin to support https://github.com/usefulsensors/moonshine (#1470)
Publish pre-built JNI libs for Linux aarch64 (#1472)
Add C++ runtime and Python APIs for Moonshine models (#1473)
Add Kotlin and Java API for Moonshine models (#1474)
Add C and C++ API for Moonshine models (#1476)
Add Swift API for Moonshine models. (#1477)
Add Go API examples for adding punctuation to text. (#1478)
Add Go API for Moonshine models (#1479)
Add JavaScript API for Moonshine models (#1480)
Add Dart API for Moonshine models. (#1481)
Add Pascal API for Moonshine models (#1482)
Add C# API for Moonshine models. (#1483)

1.10.29 #

Add Go API for offline punctuation models (#1434)
Support https://huggingface.co/Revai/reverb-diarization-v1 (#1437)
Add more models for speaker diarization (#1440)
Add Java API example for hotwords. (#1442)
Add java android demo (#1454)
Add C++ API for streaming ASR. (#1455)
Add C++ API for non-streaming ASR (#1456)
Handle NaN embeddings in speaker diarization. (#1461)
Add speaker identification with VAD and non-streaming ASR using ALSA (#1463)
Support GigaAM CTC models for Russian ASR (#1464)
Add GigaAM NeMo transducer model for Russian ASR (#1467)

1.10.28 #

Fix swift example for generating subtitles. (#1362)
Allow more online models to load tokens file from the memory (#1352)
Fix CI errors introduced by supporting loading keywords from buffers (#1366)
Fix running MeloTTS models on GPU. (#1379)
Support Parakeet models from NeMo (#1381)
Export Pyannote speaker segmentation models to onnx (#1382)
Support Agglomerative clustering. (#1384)
Add Python API for clustering (#1385)
support whisper turbo (#1390)
Fix context_state not being set correctly when previous context is passed after reset (#1393)
Speaker diarization example with onnxruntime Python API (#1395)
C++ API for speaker diarization (#1396)
Python API for speaker diarization. (#1400)
C API for speaker diarization (#1402)
docs(nodejs-addon-examples): add guide for pnpm user (#1401)
Go API for speaker diarization (#1403)
Swift API for speaker diarization (#1404)
Update readme to include more external projects using sherpa-onnx (#1405)
C# API for speaker diarization (#1407)
JavaScript API (node-addon) for speaker diarization (#1408)
WebAssembly example for speaker diarization (#1411)
Handle audio files less than 10s long for speaker diarization. (#1412)
JavaScript API with WebAssembly for speaker diarization (#1414)
Kotlin API for speaker diarization (#1415)
Java API for speaker diarization (#1416)
Dart API for speaker diarization (#1418)
Pascal API for speaker diarization (#1420)
Android JNI support for speaker diarization (#1421)
Android demo for speaker diarization (#1423)

1.10.27 #

Add non-streaming ONNX models for Russian ASR (#1358)
Fix building Flutter TTS examples for Linux (#1356)
Support passing utf-8 strings from JavaScript to C++. (#1355)
Fix sherpa_onnx.go to support returning empty recognition results (#1353)

1.10.26 #

Add links to projects using sherpa-onnx. (#1345)
Support lang/emotion/event results from SenseVoice in Swift API. (#1346)
Support specifying max speech duration for VAD. (#1348)
Add APIs about max speech duration in VAD for various programming languages (#1349)

1.10.25 #

Allow tokens and hotwords to be loaded from buffered string directly (#1339)
Fix computing features for CED audio tagging models. (#1341)
Preserve previous result as context for next segment (#1335)
Add Python binding for online punctuation models (#1312)
Fix vad.Flush(). (#1329)
Fix wasm app for streaming paraformer (#1328)
Build websocket related binaries for embedded systems. (#1327)
Fixed the C api calls and created the TTS project file (#1324)
Re-implement LM rescore for online transducer (#1231)

1.10.24 #

Add VAD and keyword spotting for the Node package with WebAssembly (#1286)
Fix releasing npm package and fix building Android VAD+ASR example (#1288)
add Tokens []string, Timestamps []float32, Lang string, Emotion string, Event string (#1277)
add vad+sense voice example for C API (#1291)
Add VAD+ASR example for Dart with CircularBuffer. (#1293)
Fix VAD+ASR example for Dart API. (#1294)
Avoid SherpaOnnxSpeakerEmbeddingManagerFreeBestMatches freeing null. (#1296)
Fix releasing wasm app for vad+asr (#1300)
remove extra files from linux/macos/windows jni libs (#1301)
two-pass Android APK for SenseVoice (#1302)
Downgrade flutter sdk versions. (#1305)
Reduce onnxruntime log output. (#1306)
Provide prebuilt .jar files for different java versions. (#1307)

1.10.23 #

flutter: add lang, emotion, event to OfflineRecognizerResult (#1268)
Use a separate thread to initialize models for lazarus examples. (#1270)
Object pascal examples for recording and playing audio with portaudio. (#1271)
Text to speech API for Object Pascal. (#1273)
Update Kotlin API to better release native objects and add user-friendly APIs. (#1275)
Update wave-reader.cc to support 8/16/32-bit waves (#1278)
Add WebAssembly for VAD (#1281)
WebAssembly example for VAD + Non-streaming ASR (#1284)

1.10.22 #

Add Pascal API for reading wave files (#1243)
Pascal API for streaming ASR (#1246)
Pascal API for non-streaming ASR (#1247)
Pascal API for VAD (#1249)
Add more C API examples (#1255)
Add emotion, event of SenseVoice. (#1257)
Support reading multi-channel wave files with 8/16/32-bit encoded samples (#1258)
Enable IPO only for Release build. (#1261)
Add Lazarus example for generating subtitles using Silero VAD with non-streaming ASR (#1251)
Fix looking up OOVs in lexicon.txt for MeloTTS models. (#1266)

1.10.21 #

Fix ffmpeg c api example (#1185)
Fix splitting sentences for MeloTTS (#1186)
Non-streaming WebSocket client for Java. (#1190)
Fix copying asset files for flutter examples. (#1191)
Add Chinese+English tts example for flutter (#1192)
Add speaker identification and verification example for Dart API (#1194)
Fix reading non-standard wav files. (#1199)
Add ReazonSpeech Japanese pre-trained model (#1203)
Describe how to add new words for MeloTTS models (#1209)
Remove libonnxruntime_providers_cuda.so as a dependency. (#1210)
Fix setting SenseVoice language. (#1214)
Support passing TTS callback in Swift API (#1218)
Add MeloTTS example for ios (#1223)
Add online punctuation and casing prediction model for English language (#1224)
Fix python two pass ASR examples (#1230)
Add blank penalty for various language bindings

1.10.20 #

Add Dart API for audio tagging
Add Dart API for adding punctuation to text

1.10.19 #

Prefix all C API functions with SherpaOnnx

1.10.18 #

Fix the case when recognition results contain the symbol ", which caused issues when converting results to a JSON string.

1.10.17 #

Support SenseVoice CTC models.
Add Dart API for keyword spotter.

1.10.16 #

Support zh-en TTS model from MeloTTS.

1.10.15 #

Downgrade onnxruntime from v1.18.1 to v1.17.1

1.10.14 #

Support whisper large v3
Update onnxruntime from v1.18.0 to v1.18.1
Fix invalid utf8 sequence from Whisper for Dart API.

1.10.13 #

Update onnxruntime from 1.17.1 to 1.18.0
Add C# API for Keyword spotting

1.10.12 #

Add Flush to VAD so that the last speech segment can be detected. See also https://github.com/k2-fsa/sherpa-onnx/discussions/1077#discussioncomment-9979740

1.10.11 #

Support the iOS platform for Flutter.

1.10.10 #

Build sherpa-onnx into a single shared library.

1.10.9 #

Fix released packages. piper-phonemize was not included in v1.10.8.

1.10.8 #

Fix released packages. There should be a lib directory.

1.10.7 #

Support Android for Flutter.

1.10.2 #

Fix passing C# string to C++

1.10.1 #

Support stopping TTS generation

1.10.0 #

Add inverse text normalization

1.9.30 #

Add TTS

1.9.29 #

Publish with CI

0.0.3 #

Fix path separator on Windows.

0.0.2 #

Support specifying lib path.

0.0.1 #

Initial release.
