# flutter_whisper_kit
A Flutter plugin that provides on-device speech recognition capabilities using WhisperKit. Achieve high-quality speech-to-text transcription while maintaining privacy.
## Features
- 🔒 Complete On-Device Processing - No data is sent to external servers
- 🎯 High-Accuracy Speech Recognition - High-quality transcription with Whisper models
- 📱 Multiple Model Sizes - Choose from tiny to large based on your needs
- 🎙️ Real-Time Transcription - Transcribe microphone audio as it is captured
- 📁 File-Based Transcription - Support for audio file transcription
- 📊 Progress Tracking - Monitor model download progress
- 🌍 Multi-Language Support - Supports 100+ languages
- ⚡ Type-Safe Error Handling - Handle failures safely with a `Result` type
## Platform Support
Platform | Minimum Version | Status |
---|---|---|
iOS | 16.0+ | ✅ Fully Supported |
macOS | 13.0+ | ✅ Fully Supported |
Android | - | 🚧 Planned for Future Release |
## Installation
Add to your `pubspec.yaml`:

```yaml
dependencies:
  flutter_whisper_kit: ^0.2.0
```
### iOS Configuration

Add these permissions to your iOS app's `ios/Runner/Info.plist`:

```xml
<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access to record audio for speech transcription</string>
<key>NSDownloadsFolderUsageDescription</key>
<string>This app needs to access your Downloads folder to store WhisperKit models</string>
<key>NSDocumentsFolderUsageDescription</key>
<string>This app needs to access your Documents folder to store WhisperKit models</string>
```
### macOS Configuration

Add these permissions to your macOS app's `macos/Runner/Info.plist`:

```xml
<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access to record audio for speech transcription</string>
<key>NSLocalNetworkUsageDescription</key>
<string>This app needs to access your local network to download WhisperKit models</string>
<key>NSDownloadsFolderUsageDescription</key>
<string>This app needs to access your Downloads folder to store WhisperKit models</string>
<key>NSDocumentsFolderUsageDescription</key>
<string>This app needs to access your Documents folder to store WhisperKit models</string>
```

Also, ensure your macOS deployment target is set to 13.0 or higher in `macos/Runner.xcodeproj/project.pbxproj`:

```
MACOSX_DEPLOYMENT_TARGET = 13.0;
```
## Usage

### Basic Usage

```dart
import 'package:flutter_whisper_kit/flutter_whisper_kit.dart';

// Create an instance of the plugin
final whisperKit = FlutterWhisperKit();

// Load a model
final result = await whisperKit.loadModel(
  'tiny', // Model size: tiny, base, small, medium, large-v2, large-v3
  modelRepo: 'argmaxinc/whisperkit-coreml',
);
print('Model loaded: $result');

// Transcribe from an audio file
final transcription = await whisperKit.transcribeFromFile(
  '/path/to/audio/file.mp3',
  options: DecodingOptions(
    task: DecodingTask.transcribe,
    language: 'en', // Specify a language code (null for auto-detection)
  ),
);
print('Transcription: ${transcription?.text}');
```
### Real-Time Transcription

```dart
// Listen to the transcription stream
whisperKit.transcriptionStream.listen((transcription) {
  print('Real-time transcription: ${transcription.text}');
});

// Start recording
await whisperKit.startRecording(
  options: DecodingOptions(
    task: DecodingTask.transcribe,
    language: 'en',
  ),
);

// Stop recording
final finalTranscription = await whisperKit.stopRecording();
print('Final transcription: ${finalTranscription?.text}');
```
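In a full app you will typically keep a reference to the stream subscription and cancel it when the screen goes away. A minimal Flutter sketch built only on the `transcriptionStream` API shown above (the widget and field names are illustrative, not part of the plugin):

```dart
import 'dart:async';

import 'package:flutter/material.dart';
import 'package:flutter_whisper_kit/flutter_whisper_kit.dart';

/// Illustrative widget: subscribes to the transcription stream and
/// cancels the subscription when the widget is disposed.
class LiveTranscriptionView extends StatefulWidget {
  const LiveTranscriptionView({super.key});

  @override
  State<LiveTranscriptionView> createState() => _LiveTranscriptionViewState();
}

class _LiveTranscriptionViewState extends State<LiveTranscriptionView> {
  final _whisperKit = FlutterWhisperKit();
  StreamSubscription? _subscription;
  String _text = '';

  @override
  void initState() {
    super.initState();
    // Update the UI whenever a new transcription arrives.
    _subscription = _whisperKit.transcriptionStream.listen((transcription) {
      setState(() => _text = transcription.text);
    });
  }

  @override
  void dispose() {
    // Cancel the subscription so the listener does not outlive the widget.
    _subscription?.cancel();
    super.dispose();
  }

  @override
  Widget build(BuildContext context) => Text(_text);
}
```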
### Error Handling (Result Type)

Since v0.2.0, Result type APIs have been added for safer error handling:

```dart
// Load a model with the Result type
final loadResult = await whisperKit.loadModelWithResult(
  'tiny',
  modelRepo: 'argmaxinc/whisperkit-coreml',
);

loadResult.when(
  success: (modelPath) {
    print('Model loaded successfully: $modelPath');
  },
  failure: (error) {
    print('Model loading failed: ${error.message}');
    // Handle errors by error code
    switch (error.code) {
      case WhisperKitErrorCode.modelNotFound:
        // Handle model not found
        break;
      case WhisperKitErrorCode.networkError:
        // Handle network error
        break;
      default:
        // Handle other errors
        break;
    }
  },
);

// Transcribe with the Result type
final transcribeResult = await whisperKit.transcribeFileWithResult(
  audioPath,
  options: DecodingOptions(language: 'en'),
);

// Handle success/failure with the fold method
final text = transcribeResult.fold(
  onSuccess: (result) => result?.text ?? 'No result',
  onFailure: (error) => 'Error: ${error.message}',
);
```
### Model Management

```dart
// Fetch available models
final models = await whisperKit.fetchAvailableModels(
  modelRepo: 'argmaxinc/whisperkit-coreml',
);

// Get recommended models
final recommended = await whisperKit.recommendedModels();
print('Recommended model: ${recommended?.defaultModel}');

// Download a model with progress reporting
await whisperKit.download(
  variant: 'base',
  repo: 'argmaxinc/whisperkit-coreml',
  onProgress: (progress) {
    print('Download progress: ${(progress.fractionCompleted * 100).toStringAsFixed(1)}%');
  },
);

// Monitor the model progress stream
whisperKit.modelProgressStream.listen((progress) {
  print('Model progress: ${progress.fractionCompleted * 100}%');
});
```
### Language Detection

```dart
// Detect the language of an audio file
final detection = await whisperKit.detectLanguage(audioPath);
if (detection != null) {
  print('Detected language: ${detection.language}');
  print('Confidence: ${detection.probabilities[detection.language]}');
}
```
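A common follow-up is to feed the detected language back into the decoding options so the transcription pass does not have to guess it again. A minimal sketch using only the calls shown above (whether pinning the language skips WhisperKit's internal detection depends on the underlying implementation):

```dart
// Detect the language first, then transcribe with it pinned.
final detection = await whisperKit.detectLanguage(audioPath);

final result = await whisperKit.transcribeFromFile(
  audioPath,
  options: DecodingOptions(
    task: DecodingTask.transcribe,
    language: detection?.language, // null falls back to auto-detection
  ),
);
print('${detection?.language}: ${result?.text}');
```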
### Advanced Configuration

```dart
// Custom decoding options
final options = DecodingOptions(
  verbose: true,                          // Enable verbose logging
  task: DecodingTask.transcribe,          // transcribe or translate
  language: 'en',                         // Language code (null for auto-detection)
  temperature: 0.0,                       // Sampling temperature (0.0-1.0)
  temperatureFallbackCount: 5,            // Temperature fallback count
  wordTimestamps: true,                   // Enable word timestamps
  chunkingStrategy: ChunkingStrategy.vad, // Chunking strategy
);

// Detailed transcription results
final result = await whisperKit.transcribeFromFile(audioPath, options: options);
if (result != null) {
  print('Text: ${result.text}');
  print('Language: ${result.language}');

  // Segment information
  for (final segment in result.segments) {
    print('Segment ${segment.id}: ${segment.text}');
    print('  Start: ${segment.startTime}s, End: ${segment.endTime}s');

    // Word timing information (if wordTimestamps: true)
    for (final word in segment.words) {
      print('  Word: ${word.word} (${word.start}s - ${word.end}s)');
    }
  }
}
```
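Because each segment carries start and end times, the result maps naturally onto subtitle formats. A minimal sketch that turns the `result` above into SRT text; the segment field names (`segments`, `text`, `startTime`, `endTime`) are taken from the example and may need adjusting to the plugin's actual API:

```dart
// Format seconds as an SRT timestamp, e.g. 00:01:02,345.
String srtTimestamp(num seconds) {
  final d = Duration(milliseconds: (seconds * 1000).round());
  String two(int n) => n.toString().padLeft(2, '0');
  final ms = (d.inMilliseconds % 1000).toString().padLeft(3, '0');
  return '${two(d.inHours)}:${two(d.inMinutes % 60)}:${two(d.inSeconds % 60)},$ms';
}

// Build SRT subtitle text from the segments of the `result` above.
final segments = result?.segments ?? [];
final srt = StringBuffer();
for (var i = 0; i < segments.length; i++) {
  final segment = segments[i];
  srt
    ..writeln(i + 1)
    ..writeln('${srtTimestamp(segment.startTime)} --> ${srtTimestamp(segment.endTime)}')
    ..writeln(segment.text.trim())
    ..writeln();
}
print(srt.toString());
```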
## Model Size Selection
Choose the appropriate model size based on your use case:
Model | Size | Speed | Accuracy | Use Case |
---|---|---|---|---|
tiny | ~39MB | Very Fast | Low | Real-time processing, battery-conscious |
tiny-en | ~39MB | Very Fast | Low (English only) | English-only real-time processing |
base | ~145MB | Fast | Medium | Balanced performance |
small | ~466MB | Medium | High | When higher accuracy is needed |
medium | ~1.5GB | Slow | Higher | When even higher accuracy is needed |
large-v2 | ~2.9GB | Very Slow | Very High | When maximum accuracy is needed |
large-v3 | ~2.9GB | Very Slow | Highest | Latest and highest accuracy |
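If you want to choose a variant programmatically, for example smaller models on battery-constrained devices, a simple mapping over the table above is enough. The enum and helper below are illustrative, not part of the plugin API:

```dart
// Hypothetical helper: map a desired trade-off to a variant name
// from the table above, then load it with the plugin.
enum TranscriptionQuality { realtime, balanced, accurate }

String variantFor(TranscriptionQuality quality) {
  switch (quality) {
    case TranscriptionQuality.realtime:
      return 'tiny';
    case TranscriptionQuality.balanced:
      return 'base';
    case TranscriptionQuality.accurate:
      return 'large-v3';
  }
}

// Usage:
await whisperKit.loadModel(
  variantFor(TranscriptionQuality.balanced),
  modelRepo: 'argmaxinc/whisperkit-coreml',
);
```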
## Example App

The `example` folder contains a sample app that demonstrates all features:

```bash
cd packages/flutter_whisper_kit/example
flutter run
```
## Troubleshooting

### Build errors on iOS/macOS

- Check the minimum deployment target (iOS 16.0+, macOS 13.0+)
- Update to the latest Xcode
- Run `pod install`
### Model download fails

- Check your network connection
- Ensure sufficient storage space
- Try the `redownload: true` option (see the sketch below)
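A minimal sketch of forcing a fresh download; it assumes `loadModel` accepts the `redownload` flag mentioned above, so check the API reference for the exact signature:

```dart
// Force a fresh download of the model files.
// Assumes loadModel accepts a `redownload` flag as suggested by the
// troubleshooting tip above; confirm against the API docs.
final result = await whisperKit.loadModel(
  'tiny',
  modelRepo: 'argmaxinc/whisperkit-coreml',
  redownload: true,
);
print('Model reloaded: $result');
```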
### Low transcription accuracy

- Try a larger model size
- Explicitly specify the language with the `language` parameter
- Adjust the `temperature` parameter (0.0 for more deterministic output, 1.0 for more varied output)
## License
This project is licensed under the MIT License. See the LICENSE file for details.
## Contributing
Pull requests are welcome! For major changes, please open an issue first to discuss what you would like to change.
## Acknowledgments
This plugin is based on WhisperKit. Thanks to the Argmax Inc. team for providing this excellent library.