sherpa_onnx 1.10.22

Speech recognition, speech synthesis, and speaker recognition using next-gen Kaldi with onnxruntime, running locally without an Internet connection.
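
To use the package in a Flutter project, the dependency is declared in `pubspec.yaml`. The fragment below is a minimal sketch that assumes a standard Flutter app layout; only the `sherpa_onnx: ^1.10.22` line comes from this page, and the surrounding keys are the usual Flutter defaults.

```yaml
# pubspec.yaml (fragment); assumes an otherwise standard Flutter project
dependencies:
  flutter:
    sdk: flutter
  # Caret constraint: accepts any version >=1.10.22 and <2.0.0
  sherpa_onnx: ^1.10.22
```

After editing the file, running `flutter pub get` fetches the package along with its platform-specific dependencies.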

Supported functions #

| Function | Supported |
|---|---|
| Speech recognition | ✔️ |
| Speech synthesis | ✔️ |
| Speaker verification | ✔️ |
| Speaker identification | ✔️ |
| Spoken language identification | ✔️ |
| Audio tagging | ✔️ |
| Voice activity detection | ✔️ |
| Keyword spotting | ✔️ |
| Add punctuation | ✔️ |

Supported platforms #

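| Architecture | Android | iOS | Windows | macOS | Linux |
|---|---|---|---|---|---|
| x64 | ✔️ | | ✔️ | ✔️ | ✔️ |
| x86 | ✔️ | | ✔️ | | |
| arm64 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| arm32 | ✔️ | | | | ✔️ |
| riscv64 | | | | | ✔️ |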

Supported programming languages #

| 1. C++ | 2. C | 3. Python | 4. JavaScript |
|---|---|---|---|
| ✔️ | ✔️ | ✔️ | ✔️ |
| 5. Java | 6. C# | 7. Kotlin | 8. Swift |
| ✔️ | ✔️ | ✔️ | ✔️ |
| 9. Go | 10. Dart | 11. Rust | 12. Pascal |
| ✔️ | ✔️ | ✔️ | ✔️ |

For Rust support, please see https://github.com/thewh1teagle/sherpa-rs

It also supports WebAssembly.

Introduction #

This repository supports running the following functions locally:

  • Speech-to-text (i.e., ASR); both streaming and non-streaming are supported
  • Text-to-speech (i.e., TTS)
  • Speaker identification
  • Speaker verification
  • Spoken language identification
  • Audio tagging
  • VAD (e.g., silero-vad)
  • Keyword spotting

on the platforms and operating systems listed under "Supported platforms" above, with the following APIs:

  • C++, C, Python, Go, C#
  • Java, Kotlin, JavaScript
  • Swift, Rust
  • Dart, Object Pascal
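
| Description | URL | For users in China |
|---|---|---|
| Streaming speech recognition | Address | Click here |
| Text-to-speech | Address | Click here |
| Voice activity detection (VAD) | Address | Click here |
| VAD + non-streaming speech recognition | Address | Click here |
| Two-pass speech recognition | Address | Click here |
| Audio tagging | Address | Click here |
| Audio tagging (WearOS) | Address | Click here |
| Speaker identification | Address | Click here |
| Spoken language identification | Address | Click here |
| Keyword spotting | Address | Click here |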

Real-time speech recognition

| Description | URL | For users in China |
|---|---|---|
| Streaming speech recognition | Address | Click here |

Text-to-speech

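| Description | URL | For users in China |
|---|---|---|
| Android (arm64-v8a, armeabi-v7a, x86_64) | Address | Click here |
| Linux (x64) | Address | Click here |
| macOS (x64) | Address | Click here |
| macOS (arm64) | Address | Click here |
| Windows (x64) | Address | Click here |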

Note: You need to build from source for iOS.

Generating subtitles

| Description | URL | For users in China |
|---|---|---|
| Generate subtitles | Address | Click here |

| Description | URL |
|---|---|
| Speech recognition (speech-to-text, ASR) | Address |
| Text-to-speech (TTS) | Address |
| VAD | Address |
| Keyword spotting | Address |
| Audio tagging | Address |
| Speaker identification (Speaker ID) | Address |
| Spoken language identification (Language ID) | See the multilingual Whisper ASR models under Speech recognition |
| Punctuation | Address |

How to reach us #

Please see https://k2-fsa.github.io/sherpa/social-groups.html for the next-gen Kaldi WeChat and QQ discussion groups.

Publisher

unverified uploader

Topics

#speech-recognition #speech-synthesis #speaker-identification #audio-tagging #voice-activity-detection

License

unknown (license)

Dependencies

ffi, flutter, sherpa_onnx_android, sherpa_onnx_ios, sherpa_onnx_linux, sherpa_onnx_macos, sherpa_onnx_windows
