text_sight 0.1.0
Live, on-device text recognition for Flutter — Apple Vision on iOS, ML Kit on Android. Like
mobile_scanner, but for text instead of barcodes.
- Why text_sight?
- A quick taste
- Platform support
- Install
- The recognition model
- Performance
- Going deeper
Why text_sight?
Most cross-platform OCR plugins run Google ML Kit on both platforms. That quietly pulls
GoogleMLKit into your iOS build — and with it the arm64 and Swift Package Manager warnings
that have been nagging Flutter iOS builds for a while.
text_sight takes the other road. On iOS it uses Apple Vision, a system framework, so your app
links zero third-party ML libraries there — no GoogleMLKit, no warnings. Android keeps ML Kit,
declared only in its own Gradle file. Nothing recognition-related ever reaches your pubspec.yaml,
so the two platforms can't bleed into each other. Clean, native text scanning on both. That's the
whole idea.
A quick taste
Point the camera at some text:
final controller = TextSightController();
TextSightView(
  controller: controller,
  onResult: (capture) => capture.lines.forEach((line) => print(line.text)),
  overlayBuilder: (context, capture, constraints) => /* paint line.boundingBox */,
);
await controller.start(); // after the camera permission is granted
Or read a single still — no camera, no permission:
final capture = await TextSight.recognizeImage(bytes); // or .recognizePath('/photo.jpg')
Either way, boxes come back normalized [0, 1] from the top-left, identical on both platforms, so
your overlay never has to know which engine drew them.
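Because the boxes are normalized, an overlay has to scale them to the preview's size before painting. Here is a minimal sketch of that step in plain Dart. The record types and the `toPixels` helper are hypothetical (they are not part of text_sight), and the sketch assumes the preview fills the widget without cropping or letterboxing.

```dart
// Stand-in for a normalized bounding box: every value is a fraction in [0, 1],
// measured from the top-left corner. Hypothetical type, not from text_sight.
typedef NormalizedBox = ({double left, double top, double width, double height});

// The same box expressed in logical pixels inside the preview widget.
typedef PixelBox = ({double left, double top, double width, double height});

/// Scales [box] to a preview that is [viewWidth] x [viewHeight] logical pixels.
/// Assumes the preview fills the widget with no cropping or letterboxing.
PixelBox toPixels(NormalizedBox box, double viewWidth, double viewHeight) {
  return (
    left: box.left * viewWidth,
    top: box.top * viewHeight,
    width: box.width * viewWidth,
    height: box.height * viewHeight,
  );
}
```

Inside an overlay builder, the width and height would typically come from the layout constraints the builder receives.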
Want a scan box? Pass the controller a region of interest, for example
TextSightController(options: TextSightOptions(roi: Rect.fromLTWH(0.1, 0.4, 0.8, 0.2))). You can also
change the region, the recognition level, or the torch while the session runs. The region applies to
the live preview and the one-shot alike.
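If you would rather describe the scan box by its size and let the code center it, something like the following could work. It is a sketch in plain Dart: `centeredRoi` and the `NormalizedRect` record are hypothetical helpers, not text_sight API, and they assume the region uses the same normalized top-left coordinates as the result boxes.

```dart
// Hypothetical record for a normalized rectangle (fractions of the frame).
typedef NormalizedRect = ({double left, double top, double width, double height});

/// Builds a centered scan box covering the given fractions of the frame.
/// Throws an ArgumentError if a fraction is NaN or outside (0, 1].
NormalizedRect centeredRoi(double widthFraction, double heightFraction) {
  for (final f in [widthFraction, heightFraction]) {
    if (f.isNaN || f <= 0 || f > 1) {
      throw ArgumentError.value(f, 'fraction', 'must be in (0, 1]');
    }
  }
  return (
    left: (1 - widthFraction) / 2, // equal margin on the left and right
    top: (1 - heightFraction) / 2, // equal margin above and below
    width: widthFraction,
    height: heightFraction,
  );
}
```

`centeredRoi(0.8, 0.2)` yields a left of about 0.1 and a top of about 0.4, the same box as the `Rect.fromLTWH(0.1, 0.4, 0.8, 0.2)` example; the four fields can be passed straight to `Rect.fromLTWH`.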
One Android thing worth knowing up front: the model downloads on first use, so give it a head start when the user opens your scanner — otherwise that first scan comes back empty.
The example/ app is where to look next — a live overlay, torch, region-of-interest,
permission handling, and the one-shot screen, all wired up and ready to crib from.
Screenshots: Android · ML Kit | iOS · Apple Vision
Platform support
| Platform | Minimum | Engine |
|---|---|---|
| iOS | 18.0 | Apple Vision — RecognizeTextRequest |
| Android | API 24 | ML Kit Text Recognition v2 (Latin) |
A few things worth knowing before you start: iOS needs 18.0+ (older versions are on the roadmap), Android recognizes Latin script only for now, and live scanning needs a real device — the iOS Simulator has no camera. The one-shot runs anywhere.
Install
flutter pub add text_sight
On iOS, add a camera-usage string to ios/Runner/Info.plist:
<key>NSCameraUsageDescription</key>
<string>Used to recognize text from the camera.</string>
text_sight won't request camera permission for you. Ask for it yourself (e.g. with
permission_handler), then call controller.start(). Android's manifest already has what it needs.
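The "ask first, then start" order can be wrapped in one small helper. This is a sketch, not part of text_sight: `startWhenPermitted` is a hypothetical name, and both steps are injected as callbacks so the snippet stays independent of any plugin. With permission_handler, the request callback could plausibly be `() => Permission.camera.request().isGranted`, and the start callback `() => controller.start()`.

```dart
/// Requests camera access via [requestCamera] and calls [start] only when
/// access is granted. Returns true if scanning was started.
///
/// Both callbacks are assumptions supplied by the caller; this helper does
/// not talk to the platform itself.
Future<bool> startWhenPermitted({
  required Future<bool> Function() requestCamera,
  required Future<void> Function() start,
}) async {
  final granted = await requestCamera();
  if (!granted) {
    // Leave the preview idle so the caller can explain why scanning is off.
    return false;
  }
  await start();
  return true;
}
```

Returning a bool instead of throwing keeps the denied case an ordinary branch in the UI code, for example to show a "camera access needed" message.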
The recognition model
On iOS there's nothing to see here — recognition is Apple Vision, a system framework that's always on hand. No download, no waiting.
Android is the interesting one. The ML Kit model ships unbundled by default: that adds only about 260 KB to your app, and the model itself is fetched through Google Play Services the first time you actually use it. Skipping the install-time download is deliberate: most apps don't need OCR the second they launch, so there's no point making everyone pay for it up front. The one catch: a scan you kick off before the model has landed comes back empty.
So give it a nudge when the user wanders into your scanner:
final state = await TextSightModel.ensureReady();
if (state is ModelUnavailable) {
  // No Play Services, or the download didn't make it. Tell the user, maybe offer a retry.
}
Call it as often as you like — it returns right away once the model's around (which is always, on iOS). Want a progress bar in front of the user while it downloads? Listen to the readiness stream and switch over it. It's a sealed type, so the compiler makes sure you've handled every case:
TextSightModel.readiness.listen((state) {
  final label = switch (state) {
    ModelReady() => 'Ready to scan',
    ModelDownloading(:final progress) => 'Downloading… ${((progress ?? 0) * 100).round()}%',
    ModelUnavailable(:final reason) => 'Model unavailable ($reason)',
  };
  // ...show `label`, or feed `progress` straight into a progress indicator
});
The example/ live scanner does exactly this — ensureReady() to gate, the stream for a
real download bar.
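For the progress bar itself, one option is to map each readiness state to the nullable value that Flutter's progress indicators accept, where null means "indeterminate". The sketch below is plain Dart with stand-in classes: they only mirror the state names used above so the snippet runs on its own, and the real text_sight types may differ in shape. `indicatorValue` is a hypothetical helper.

```dart
// Stand-in state types, assumed for illustration only. In an app you would
// use the classes exported by text_sight instead.
sealed class ModelState {}

class ModelReady extends ModelState {}

class ModelDownloading extends ModelState {
  ModelDownloading(this.progress);
  final double? progress; // 0.0 to 1.0, or null while the size is unknown
}

class ModelUnavailable extends ModelState {
  ModelUnavailable(this.reason);
  final String reason;
}

/// Returns a value for a determinate progress bar, or null when the bar
/// should be indeterminate because no progress figure is available yet.
double? indicatorValue(ModelState state) => switch (state) {
      ModelReady() => 1.0,
      ModelDownloading(:final progress) =>
        progress == null ? null : progress.clamp(0.0, 1.0).toDouble(),
      ModelUnavailable() => 0.0, // nothing is downloading; show an empty bar
    };
```

The result can be handed to `LinearProgressIndicator(value: ...)`, which treats a null value as an indeterminate animation.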
Or just bundle it
Don't fancy any of that? Ship the model inside your APK — instant, offline, Play Services out of the
picture. One line in your app's android/gradle.properties:
com.lahaluhem.text_sight.useBundled=true
Now ensureReady() returns immediately and ModelUnavailable never shows up. You're trading size
for it, mind:
| Mode | App size | First use | Offline | Needs Play Services |
|---|---|---|---|---|
| Unbundled (default) | ~260 KB | downloads on demand | after first download | yes |
| Bundled | ~4 MB/script/arch | instant | yes | no |
Performance
Recognition results cross from native to Dart as a small per-frame map over an EventChannel.
Decoding it on the UI isolate costs microseconds — even a dense ~127-line frame is ~55 µs, well
under 1% of a 60 fps budget. The native engine's inference, not the transport, sets the pace.

These benchmarks measure the pure-Dart codec only, not native inference or end-to-end latency, which dominate.
Leaner transports (list, Pigeon, packed-binary) win big in percent but stay tiny in absolute µs,
so the self-describing Map stays. Full methodology and numbers: benchmark/.
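To put the ~55 µs decode time in context, the frame-budget arithmetic can be spelled out. This is only a back-of-the-envelope sketch; `frameBudgetFraction` is a hypothetical helper, not something shipped in benchmark/.

```dart
/// Fraction of a single frame's time budget used by a task that takes
/// [taskMicros] microseconds, at [fps] frames per second.
double frameBudgetFraction(double taskMicros, {double fps = 60}) {
  // One frame at 60 fps lasts 1,000,000 / 60, roughly 16,667 µs.
  final budgetMicros = Duration.microsecondsPerSecond / fps;
  return taskMicros / budgetMicros;
}
```

For the dense frame quoted above, `frameBudgetFraction(55)` comes out at roughly 0.0033, or about a third of one percent of a 60 fps frame.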
Going deeper
How it all fits together — coordinate handling, the per-line confidence contract, how region-of-interest differs across platforms, and what's next — lives in APPENDIX.md.


