Whisper GGML Plus

On-device Whisper.cpp transcription for Flutter on Android, iOS, Linux, macOS, and Windows.

whisper_ggml_plus is for Flutter apps that want local Whisper inference without sending audio to a remote service. It is built around file-based transcription, so the package works best when your app can provide a finished audio file path and let Whisper process it in batch.

Why use `whisper_ggml_plus`

The goal of this package is to give Flutter developers a practical Whisper.cpp integration that feels native to a Flutter app, while still exposing the important controls that affect quality, speed, and platform behavior.

Cross-platform Flutter FFI plugin for on-device transcription
Whisper.cpp v1.8.3 engine with Large-v3-Turbo support
File-based batch transcription from local audio files
Optional audio conversion via the whisper_ggml_plus_ffmpeg companion package
Configurable VAD through WhisperVadMode
Optional Metal and CoreML acceleration on iOS and macOS
Windows support in the current package release

Supported platforms

The package currently declares support for all of the platforms below. Apple platforms can optionally use extra acceleration paths, while the other platforms use the standard FFI flow.

Platform	Support	Notes
Android	✅	File-based transcription
iOS	✅	Metal, optional CoreML
Linux	✅	FFI plugin
macOS	✅	Metal, optional CoreML
Windows	✅	FFI plugin, example support

What this package does

At a high level, the package takes an audio file path, prepares the matching GGML model path, and runs Whisper.cpp locally through FFI. If your app already works with saved audio files, this package is designed for that workflow.

Transcribes completed audio files from audioPath
Works with predefined WhisperModel enums
Can download official GGML models with WhisperController.downloadModel()
Supports timestamps, splitOnWord, VAD, abort(), and dispose()

What this package does not do yet

The following features are outside the current scope of the package:

Stream partial transcription tokens while recording
Provide a built-in microphone capture UI
Bundle CoreML .mlmodelc directories through Flutter assets

Quick start

The simplest way to think about setup is this:

add the core package
make sure you have a GGML model file available
pass a local audio file path into transcribe(...)

If your app already produces 16 kHz mono WAV files, the core package is often enough on its own.

Install the core package

flutter pub add whisper_ggml_plus

If your audio is already 16 kHz mono WAV

Use the core package directly and call transcribe(...) with the WAV file path.

import 'package:whisper_ggml_plus/whisper_ggml_plus.dart';

final controller = WhisperController();

final result = await controller.transcribe(
  model: WhisperModel.base,
  audioPath: wavPath,
  lang: 'auto',
);
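Before skipping the converter, it can help to confirm that a file really is 16 kHz mono WAV. The sketch below is not part of whisper_ggml_plus; looksLike16kMonoWav is a hypothetical helper that assumes the canonical 44-byte PCM header layout, where the "fmt " chunk directly follows the RIFF header. Files with extra chunks before "fmt " would be reported as not matching.

```dart
import 'dart:typed_data';

/// Reads a four-character chunk tag starting at [offset].
String _tag(Uint8List bytes, int offset) =>
    String.fromCharCodes(bytes, offset, offset + 4);

/// Returns true when [bytes] starts with a canonical PCM WAV header that
/// describes 16 kHz mono audio. Assumes "fmt " sits at byte offset 12.
bool looksLike16kMonoWav(Uint8List bytes) {
  if (bytes.length < 44) return false;
  if (_tag(bytes, 0) != 'RIFF') return false;
  if (_tag(bytes, 8) != 'WAVE') return false;
  if (_tag(bytes, 12) != 'fmt ') return false;

  final data = ByteData.sublistView(bytes);
  final channels = data.getUint16(22, Endian.little); // channel count
  final sampleRate = data.getUint32(24, Endian.little); // samples per second
  return channels == 1 && sampleRate == 16000;
}
```

In an app, the first 44 bytes of the file could be read with dart:io and passed to this helper; if it returns false, routing the file through a registered converter is the safer choice.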

If your audio is MP3, M4A, MP4, or another format

The core package no longer bundles FFmpeg. That keeps the base package smaller and avoids forcing one conversion strategy on every app.

If your app needs automatic conversion, register a converter from the companion package once at app startup.

flutter pub add whisper_ggml_plus_ffmpeg

import 'package:whisper_ggml_plus/whisper_ggml_plus.dart';
import 'package:whisper_ggml_plus_ffmpeg/whisper_ggml_plus_ffmpeg.dart';

void main() {
  WhisperFFmpegConverter.register();
  runApp(const MyApp());
}

After that, you can keep calling transcribe(...) with common audio formats and let the registered converter handle preprocessing before Whisper runs.

Model setup

The package expects a GGML .bin model file in app-writable storage. In practice, most apps do one of two things:

copy a bundled .bin asset into an app-writable directory on first run
download the model on demand with WhisperController.downloadModel()

The package supports both approaches.

final controller = WhisperController();

await controller.downloadModel(WhisperModel.base);
final modelPath = await controller.getPath(WhisperModel.base);
print(modelPath);

Notes:

downloadModel() uses the official ggerganov/whisper.cpp GGML model URLs defined by WhisperModel.
getPath() is useful when you want to copy a bundled .bin asset into the app's model directory before transcribing.
CoreML .mlmodelc is optional and separate from the .bin model file.
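The "copy on first run" approach can be sketched as a small helper that writes the model only when it is missing. This is an illustration, not package API: writeModelIfMissing and its loadBytes callback are hypothetical names. In a Flutter app the callback would typically read the bundled asset (for example through the asset bundle), and the target path would come from getPath() as shown above.

```dart
import 'dart:io';

/// Writes the model bytes to [targetPath] only when no file exists there.
/// Returns true when a file was written, false when it was already present.
/// [loadBytes] is a hypothetical app-supplied callback; it is only invoked
/// when the file is missing, so large assets are not loaded needlessly.
Future<bool> writeModelIfMissing(
  String targetPath,
  Future<List<int>> Function() loadBytes,
) async {
  final target = File(targetPath);
  if (await target.exists()) return false; // prepared on an earlier run

  await target.parent.create(recursive: true);
  await target.writeAsBytes(await loadBytes(), flush: true);
  return true;
}
```

Passing a callback instead of the bytes themselves keeps the common path cheap: after the first launch the helper returns without touching the asset.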

Basic usage

Once the model file exists, the basic transcription flow is intentionally small. Most apps can start from this and then add VAD, timestamps, or conversion only when needed.

import 'package:whisper_ggml_plus/whisper_ggml_plus.dart';

final controller = WhisperController();

final result = await controller.transcribe(
  model: WhisperModel.largeV3Turbo,
  audioPath: audioPath,
  lang: 'auto',
  withTimestamps: true,
  threads: 6,
  vadMode: WhisperVadMode.auto,
);

if (result != null) {
  print(result.transcription.text);
}

Voice activity detection (VAD)

whisper_ggml_plus exposes VAD policy through WhisperVadMode.

This is useful when you want a more explicit tradeoff between silence trimming and timestamp behavior, instead of relying on one fixed package default.

final result = await controller.transcribe(
  model: WhisperModel.base,
  audioPath: audioPath,
  vadMode: WhisperVadMode.auto,
);

WhisperVadMode.auto: automatically uses the bundled Silero VAD model when available.
WhisperVadMode.disabled: always turns VAD off.
WhisperVadMode.enabled: forces VAD on and uses the bundled model by default unless you override it with vadModelPath.

For lower-level control, pass vadMode and vadModelPath directly through TranscribeRequest.

final controller = WhisperController();
await controller.downloadModel(WhisperModel.base);

final response = await Whisper(model: WhisperModel.base).transcribe(
  transcribeRequest: TranscribeRequest(
    audio: audioPath,
    vadMode: WhisperVadMode.enabled,
    vadModelPath: '/absolute/path/to/ggml-silero-v6.2.0.bin',
  ),
  modelPath: await controller.getPath(WhisperModel.base),
);

Word-level timestamps and `splitOnWord`

splitOnWord is a timestamp-sensitive mode. This package disables VAD automatically when splitOnWord is enabled so that word-level timestamps stay more stable and predictable.

That behavior is worth calling out because users often expect VAD and word-level timestamps to compose cleanly, but in practice VAD can make token-level timing harder to reason about.

final controller = WhisperController();
await controller.downloadModel(WhisperModel.base);

final response = await Whisper(model: WhisperModel.base).transcribe(
  transcribeRequest: TranscribeRequest(
    audio: audioPath,
    splitOnWord: true,
    vadMode: WhisperVadMode.auto,
  ),
  modelPath: await controller.getPath(WhisperModel.base),
);

Notes:

splitOnWord: true uses token-level timestamps.
VAD is forced off for this mode even if vadMode is auto or enabled.
If you want stronger silence trimming, keep splitOnWord off and use segment-level timestamps instead.
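The interaction between vadMode and splitOnWord described above can be summarized as a small decision function. This is only a model of the documented rules, not the package's implementation: VadMode is a hypothetical stand-in for WhisperVadMode, and bundledModelAvailable is an assumed input representing whether a VAD model can be found.

```dart
/// Hypothetical mirror of the documented modes; not the package's own enum.
enum VadMode { auto, disabled, enabled }

/// Returns whether VAD would be active under the rules described above.
/// splitOnWord always wins; otherwise the mode decides, and `auto` only
/// turns VAD on when a model is available.
bool effectiveVad({
  required VadMode mode,
  required bool splitOnWord,
  required bool bundledModelAvailable,
}) {
  if (splitOnWord) return false; // word-level timestamps force VAD off
  if (mode == VadMode.disabled) return false;
  if (mode == VadMode.enabled) return true;
  return bundledModelAvailable; // VadMode.auto
}
```

Reading the rules this way makes the precedence explicit: a request with splitOnWord set never runs VAD, whatever vadMode says.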

Batch transcription, abort, and dispose

whisper_ggml_plus currently supports file-based batch transcription from audioPath.

That means the package fits very well when your app already saves audio to disk first, but it is not yet a streaming speech-to-text API.

transcribe(...) processes completed audio files.
Partial streaming transcription while recording is not exposed in the current API.
abort() can stop an in-flight batch transcription.
dispose() releases native resources for the active model context.

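final whisper = Whisper(model: WhisperModel.base);

// Stop an in-flight batch transcription, for example when the user cancels.
await whisper.abort();

// Release native resources once this instance is no longer needed.
await whisper.dispose();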

Example app

The example app in /example shows one complete Flutter flow on top of the package. It is intentionally more detailed than the main README and is the best place to look if you want to see how recording, model setup, and sample-file transcription fit together in one app.

record WAV audio in the app with record
transcribe the bundled jfk.wav sample file
copy a GGML model asset or fall back to downloadModel()
run on Android, iOS, macOS, and Windows

See example/README.md for the demo app instructions.

Performance tips

Performance depends heavily on model size, quantization, platform, and whether you are testing in debug or release mode. The tips below are the most important defaults for practical Flutter usage.

Test performance in --release mode.
Prefer quantized models such as q5_0 or q3_k to reduce memory use.
Use WhisperModel.base or WhisperModel.small for more practical mobile defaults.
Use WhisperModel.largeV3Turbo when you want higher accuracy and can afford the memory and runtime cost.

Optional CoreML acceleration on iOS and macOS

This section is only for Apple-platform acceleration. It is not required for Android, Linux, Windows, or standard CPU/GPU usage on Apple platforms.

If you are just trying to get the package running for the first time, you can skip this section and come back later. CoreML is an optimization path, not a requirement for basic package usage.

What is `.mlmodelc`?

.mlmodelc is a compiled CoreML model directory, not a single file. A typical directory contains:

model.mil
coremldata.bin
metadata.json

Important:

.mlmodelc must remain a directory.
Flutter assets cannot bundle it correctly.
It must live next to the GGML .bin model file with a matching base name.

1. Generate a CoreML encoder

git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp

python3.11 -m venv venv
source venv/bin/activate

pip install torch==2.5.0 "numpy<2.0" coremltools==8.1 openai-whisper ane_transformers

./models/generate-coreml-model.sh large-v3-turbo

Example output:

models/ggml-large-v3-turbo-encoder.mlmodelc/

2. Deploy the CoreML model

Option A: download at runtime

import 'dart:io';

import 'package:path_provider/path_provider.dart';

Future<String> prepareModelDir() async {
  final appSupport = await getApplicationSupportDirectory();
  final modelDir = Directory('${appSupport.path}/models');
  await modelDir.create(recursive: true);

  final coremlDir =
      Directory('${modelDir.path}/ggml-large-v3-turbo-encoder.mlmodelc');

  if (!await coremlDir.exists()) {
    await coremlDir.create(recursive: true);
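    // Download each file in the .mlmodelc directory from your own storage.
    // If a download fails partway, delete coremlDir so the next run retries;
    // otherwise the exists() check above will skip the incomplete directory.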
  }

  return modelDir.path;
}

Option B: add it as an Xcode folder reference

Open ios/Runner.xcworkspace or the macOS runner in Xcode.
Drag the .mlmodelc folder into the project.
Choose Create folder references, not Create groups.
Add it to the correct target.

3. Keep the naming and placement consistent

/app/support/models/
├── ggml-large-v3-turbo-q3_k.bin
└── ggml-large-v3-turbo-encoder.mlmodelc/
    ├── model.mil
    ├── coremldata.bin
    └── metadata.json

Naming convention:

GGML model: ggml-{model-name}-{quantization}.bin
CoreML model: ggml-{model-name}-encoder.mlmodelc/

Examples:

ggml-large-v3-turbo-q3_k.bin + ggml-large-v3-turbo-encoder.mlmodelc/
ggml-base-q5_0.bin + ggml-base-encoder.mlmodelc/
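To double-check a name before deploying, the convention above can be expressed as a small function. This is a sketch, not package API: coreMlDirNameFor is a hypothetical helper, and the pattern for quantization suffixes assumes they look like the examples in this README (such as -q5_0 or -q3_k).

```dart
/// Derives the CoreML directory name expected next to a GGML model file,
/// following the naming convention described above.
String coreMlDirNameFor(String ggmlFileName) {
  var base = ggmlFileName;
  if (base.endsWith('.bin')) {
    base = base.substring(0, base.length - 4); // drop the ".bin" extension
  }
  // Assumption: quantization suffixes have the form "-qN_X", e.g. "-q5_0".
  base = base.replaceFirst(RegExp(r'-q\d_[0-9a-z]$'), '');
  return '$base-encoder.mlmodelc';
}
```

A model without a quantization suffix, such as ggml-base.bin, maps to the same directory name as its quantized variants.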

4. Use the normal transcription API

final result = await WhisperController().transcribe(
  model: WhisperModel.largeV3Turbo,
  audioPath: audioPath,
);

Whisper.cpp automatically looks for the matching -encoder.mlmodelc directory next to the .bin model and uses it when available.

Troubleshooting CoreML

Common causes of CoreML not loading:

The .mlmodelc path is wrong.
The .mlmodelc item is a file instead of a directory.
The directory is not next to the .bin model.
The base names do not match.
The model was bundled through Flutter assets instead of runtime storage or Xcode folder references.
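The first three causes can be checked programmatically before starting a transcription. The helper below is a hypothetical diagnostic, not part of the package; it only inspects the file system with dart:io and does not verify that the base names match or that the directory contents are valid.

```dart
import 'dart:io';

/// Returns human-readable problems with a CoreML directory placement,
/// or an empty list when the basic checks pass.
List<String> diagnoseCoreMlPlacement(String ggmlPath, String mlmodelcPath) {
  final problems = <String>[];

  final type = FileSystemEntity.typeSync(mlmodelcPath);
  if (type == FileSystemEntityType.notFound) {
    problems.add('The .mlmodelc path does not exist.');
  } else if (type != FileSystemEntityType.directory) {
    problems.add('The .mlmodelc item is not a directory.');
  }

  // Both entries should share the same parent directory.
  if (File(ggmlPath).parent.path != Directory(mlmodelcPath).parent.path) {
    problems.add('The directory is not next to the .bin model.');
  }
  return problems;
}
```

Logging the returned list at startup in debug builds is one way to surface a misplaced directory early, since a missing CoreML model otherwise just means the acceleration path is not used.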

License

MIT License - Based on the original work by sk3llo/whisper_ggml.
