sherpa_onnx 1.10.23 | Flutter package

Supported functions #

Speech recognition	Speech synthesis	Speaker verification	Speaker identification
✔️	✔️	✔️	✔️

Spoken Language identification	Audio tagging	Voice activity detection
✔️	✔️	✔️

Keyword spotting	Add punctuation
✔️	✔️

Supported platforms #

Architecture	Android	iOS	Windows	macOS	Linux
x64	✔️		✔️	✔️	✔️
x86	✔️		✔️
arm64	✔️	✔️	✔️	✔️	✔️
arm32	✔️				✔️
riscv64					✔️

Supported programming languages #

1. C++	2. C	3. Python	4. JavaScript
✔️	✔️	✔️	✔️

5. Java	6. C#	7. Kotlin	8. Swift
✔️	✔️	✔️	✔️

9. Go	10. Dart	11. Rust	12. Pascal
✔️	✔️	✔️	✔️

For Rust support, please see sherpa-rs.

It also supports WebAssembly.

Introduction #

This repository supports running the following functions locally:

Speech-to-text (i.e., ASR); both streaming and non-streaming are supported
Text-to-speech (i.e., TTS)
Speaker identification
Speaker verification
Spoken language identification
Audio tagging
VAD (e.g., silero-vad)
Keyword spotting

on the following platforms and operating systems:

x86, x86_64, 32-bit ARM, 64-bit ARM (arm64, aarch64), RISC-V (riscv64)
Linux, macOS, Windows, openKylin
Android, WearOS
iOS
NodeJS
WebAssembly
Raspberry Pi
RV1126
LicheePi4A
VisionFive 2
Horizon Sunrise X3 Pi (旭日X3派)
AXera-Pi (爱芯派)
etc.

with the following APIs:

C++, C, Python, Go, C#
Java, Kotlin, JavaScript
Swift, Rust
Dart, Object Pascal

Links for Huggingface Spaces #

You can visit the following Huggingface spaces to try sherpa-onnx without installing anything. All you need is a browser.

Description	URL
Speech recognition	Click me
Speech recognition with Whisper	Click me
Speech synthesis	Click me
Generate subtitles	Click me
Audio tagging	Click me
Spoken language identification with Whisper	Click me

We also have spaces built using WebAssembly. They are listed below:

Description	Huggingface space	ModelScope space
Voice activity detection with silero-vad	Click me	Address
Real-time speech recognition (Chinese + English) with Zipformer	Click me	Address
Real-time speech recognition (Chinese + English) with Paraformer	Click me	Address
Real-time speech recognition (Chinese + English + Cantonese) with Paraformer-large	Click me	Address
Real-time speech recognition (English)	Click me	Address
VAD + speech recognition (Chinese + English + Korean + Japanese + Cantonese) with SenseVoice	Click me	Address
VAD + speech recognition (English) with Whisper tiny.en	Click me	Address
VAD + speech recognition (English) with Zipformer trained with GigaSpeech	Click me	Address
VAD + speech recognition (Chinese) with Zipformer trained with WenetSpeech	Click me	Address
VAD + speech recognition (Japanese) with Zipformer trained with ReazonSpeech	Click me	Address
VAD + speech recognition (Thai) with Zipformer trained with GigaSpeech2	Click me	Address
VAD + speech recognition (Chinese, multiple dialects) with a TeleSpeech-ASR CTC model	Click me	Address
VAD + speech recognition (English + Chinese, and multiple Chinese dialects) with Paraformer-large	Click me	Address
VAD + speech recognition (English + Chinese, and multiple Chinese dialects) with Paraformer-small	Click me	Address
Speech synthesis (English)	Click me	Address
Speech synthesis (German)	Click me	Address

Links for pre-built Android APKs #

Description	URL	For users in China
Streaming speech recognition	Address	Click here
Text-to-speech	Address	Click here
Voice activity detection (VAD)	Address	Click here
VAD + non-streaming speech recognition	Address	Click here
Two-pass speech recognition	Address	Click here
Audio tagging	Address	Click here
Audio tagging (WearOS)	Address	Click here
Speaker identification	Address	Click here
Spoken language identification	Address	Click here
Keyword spotting	Address	Click here

Links for pre-built Flutter APPs #

Real-time speech recognition

Description	URL	For users in China
Streaming speech recognition	Address	Click here

Text-to-speech

Description	URL	For users in China
Android (arm64-v8a, armeabi-v7a, x86_64)	Address	Click here
Linux (x64)	Address	Click here
macOS (x64)	Address	Click here
macOS (arm64)	Address	Click here
Windows (x64)	Address	Click here

Note: You need to build from source for iOS.

Links for pre-built Lazarus APPs #

Generating subtitles

Description	URL	For users in China
Generate subtitles	Address	Click here

Links for pre-trained models #

Description	URL
Speech recognition (speech to text, ASR)	Address
Text-to-speech (TTS)	Address
VAD	Address
Keyword spotting	Address
Audio tagging	Address
Speaker identification (Speaker ID)	Address
Spoken language identification (Language ID)	See the multilingual Whisper ASR models under Speech recognition
Punctuation	Address

Useful links #

Documentation: https://k2-fsa.github.io/sherpa/onnx/
Bilibili demo videos: https://search.bilibili.com/all?keyword=%E6%96%B0%E4%B8%80%E4%BB%A3Kaldi

How to reach us #

Please see https://k2-fsa.github.io/sherpa/social-groups.html for the Next-gen Kaldi (新一代 Kaldi) WeChat and QQ discussion groups.
