flutter_ocr_kit 1.0.0 copy "flutter_ocr_kit: ^1.0.0" to clipboard
flutter_ocr_kit: ^1.0.0 copied to clipboard

A Flutter FFI plugin for OCR with Edge AI. Uses Apple Vision (iOS) / ML Kit (Android) for text recognition and ONNX Runtime for layout detection.

Flutter OCR Kit #

A Flutter FFI plugin for OCR (Optical Character Recognition) with Edge AI support. Runs AI inference directly on mobile devices using ONNX Runtime and native OCR engines.

Screenshots #

Real-time Text Search Invoice Scanner Quotation Scanner

Demo Video #

Demo Video

The demo includes 4 examples:

  1. Real-time text search - Find specific text strings in camera view
  2. Real-time KIE - Extract specific types (dates, phone numbers, amounts)
  3. Invoice scanner - Scan Taiwan e-invoices and extract invoice number, date, amount
  4. Quotation scanner - Scan custom delivery notes with Layout Detection + OCR to extract items, prices, totals

Features #

  • Native OCR Engine: Uses Apple Vision (iOS) and Google ML Kit (Android) for text recognition
  • Layout Detection: ONNX-based document layout analysis (PP-Layout model) to identify tables, text blocks, titles, and figures
  • Edge AI: All processing runs locally on device - no internet required
  • Cross-platform: Supports both iOS and Android

Supported Platforms #

Platform OCR Engine Layout Detection Native Library
iOS Apple Vision ONNX Runtime + OpenCV Static (.a)
Android Google ML Kit ONNX Runtime + OpenCV Dynamic (.so)

Installation #

1. Add Dependency #

dependencies:
  flutter_ocr_kit:
    git:
      url: https://github.com/robert008/flutter_ocr_kit.git

2. Download AI Model #

Download the ONNX model from GitHub Releases:

Model Size Description
pp_doclayout_l.onnx 123 MB Layout detection model

Steps:

  1. Create assets/ folder in your project root
  2. Download the model file and place it in assets/
  3. Register in pubspec.yaml:
flutter:
  assets:
    - assets/pp_doclayout_l.onnx

3. Platform Setup #

iOS: Run pod install in your iOS directory. The native libraries will be downloaded automatically.

Android: The native libraries are bundled with the package.

Quick Start #

Basic OCR #

import 'package:flutter_ocr_kit/flutter_ocr_kit.dart';

// Recognize text from image file
final result = await OcrKit.recognizeNative('/path/to/image.jpg');

print('Full text: ${result.fullText}');
for (final line in result.textLines) {
  print('${line.text} (confidence: ${line.score})');
}

Layout Detection #

// Initialize layout model
OcrKit.init('/path/to/pp_doclayout_l.onnx');

// Detect document layout
final layout = OcrKit.detectLayout('/path/to/document.jpg');

for (final region in layout.detections) {
  print('${region.className}: (${region.x1}, ${region.y1}) - (${region.x2}, ${region.y2})');
}

// Release model when done to free memory
OcrKit.releaseLayout();

Note (iOS): Layout detection uses Core ML Execution Provider for faster inference, but consumes more memory (~1.5GB). Always call OcrKit.releaseLayout() when you no longer need layout detection to free memory. To disable Core ML and use CPU only (lower memory, slower speed), modify src/detect/doc_detector.cpp and rebuild the static library.

Supported layout classes: Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference, Equation

Combined: Layout + OCR #

// Step 1: Detect layout to find table regions
final layout = OcrKit.detectLayout(imagePath);
final tableRegions = layout.detections.where((d) => d.className == 'Table');

// Step 2: Run OCR on the entire image
final ocrResult = await OcrKit.recognizeNative(imagePath);

// Step 3: Filter OCR results within table regions
for (final table in tableRegions) {
  final tableTexts = ocrResult.textLines.where((line) {
    // Check if text is within table bounding box
    return line.rect.overlaps(Rect.fromLTRB(table.x1, table.y1, table.x2, table.y2));
  });
  print('Table content: ${tableTexts.map((t) => t.text).join(' ')}');
}

Example App #

The example app includes 4 tabs demonstrating different use cases:

Tab 1: OCR #

Basic OCR demonstration:

  • Pick image from gallery or capture with camera
  • Display recognized text with bounding boxes
  • Show confidence scores for each text line

Tab 2: KIE (Key Information Extraction) #

Simple regex-based entity extraction:

  • Extract dates, amounts, phone numbers from OCR results
  • Demonstrates how to post-process OCR output

Tab 3: Invoice Scanner #

Taiwan e-invoice scanner demo:

  • Real-time camera scanning
  • Extracts invoice number (XX-12345678 format)
  • Extracts amount and period
  • Auto-deduplication by invoice number

Important: This is a specialized demo for Taiwan e-invoice format. It demonstrates how to combine real-time OCR with custom extraction logic. You will need to modify the extraction rules (invoice_extractor.dart) for your own document format.

Tab 4: Quotation Scanner #

Quotation/delivery note scanner demo:

  • Uses Layout Detection to find Table regions
  • Runs OCR within detected regions
  • Extracts quotation number, date, customer, items, and totals
  • Supports both photo mode and real-time camera mode

Important: This is a specialized demo for a specific quotation format. It demonstrates how to combine Layout Detection + OCR for structured document extraction. The extraction logic (quotation_extractor.dart) is tailored for the demo documents and will need customization for your own document format.

Demo Files #

Editable demo files are provided for testing and customization:

example/assets/demo/
  invoices/              # Sample Taiwan e-invoice images
    invoice_1.jpg
    invoice_2.jpg
    invoice_3.jpg
  quotations/            # Sample quotation PDFs (editable)
    宏達科技_出貨單_HD-2024120001.pdf
    宏達科技_出貨單_HD-2024120015.pdf

You can modify the PDF files to test with your own data, then convert to images for scanning.

How to Build Your Own Document Scanner #

The Invoice and Quotation demos show the pattern for building custom document scanners:

  1. Define your extraction rules - Create an extractor class (see invoice_extractor.dart or quotation_extractor.dart)

  2. Use regex patterns - Define patterns for the fields you want to extract:

// Example: Extract order number like "ORD-2024-001234"
final orderPattern = RegExp(r'ORD-\d{4}-\d{6}');
final match = orderPattern.firstMatch(ocrResult.fullText);
  1. Use Layout Detection (optional) - For structured documents with tables:
// Find table regions first
final tables = layout.detections.where((d) => d.className == 'Table');
// Then extract data from table area only
  1. Handle confidence scores - Filter low-confidence results:
final reliableText = ocrResult.textLines.where((line) => line.score > 0.8);

Project Structure #

lib/
  flutter_ocr_kit.dart              # Main API (OcrKit class)
  src/
    models.dart                     # Data models (TextLine, OcrResult, LayoutResult)
    ocr_service.dart                # Async OCR service with isolate support
    invoice_extractor.dart          # Taiwan e-invoice extraction (demo)
    quotation_extractor.dart        # Quotation extraction (demo)

src/                                # Native C++ code (FFI)
  native_lib.cpp                    # FFI exported functions
  detect/
    doc_detector.cpp                # Layout detection with ONNX
  ocr/
    ocr_engine.cpp                  # OCR engine (backup, not used by default)

ios/
  Classes/
    OcrKitPlugin.m                  # iOS plugin entry
    VisionOcr.m                     # Apple Vision OCR implementation
  static_libs/                      # Pre-built static libraries (.a)
  Frameworks/                       # ONNX Runtime & OpenCV frameworks

android/
  src/main/
    kotlin/.../OcrKitPlugin.kt      # Android plugin (ML Kit OCR)
    jniLibs/                        # Pre-built dynamic libraries (.so)
    cpp/include/                    # ONNX Runtime & OpenCV headers

API Reference #

OcrKit #

Method Description
init(modelPath) Initialize ONNX layout model
detectLayout(imagePath) Detect document layout regions
recognizeNative(imagePath) OCR using native engine (Vision/ML Kit)
recognizeFromFile(imagePath) OCR using ONNX model (backup)

OcrResult #

Property Type Description
fullText String Concatenated text from all lines
textLines List<TextLine> Individual text lines with positions
imageWidth int Source image width
imageHeight int Source image height

TextLine #

Property Type Description
text String Recognized text content
score double Confidence score (0.0 - 1.0)
rect Rect Bounding box position
wordBoxes List<Rect> Word-level bounding boxes

LayoutResult #

Property Type Description
detections List<LayoutDetection> Detected regions
count int Number of detected regions

LayoutDetection #

Property Type Description
className String Region type (Table, Text, Title, Figure, etc.)
x1, y1, x2, y2 double Bounding box coordinates
score double Detection confidence

Building from Source #

Prerequisites #

  • Flutter SDK 3.7+
  • Xcode 14+ (for iOS)
  • Android NDK (for Android)

Build Commands #

# Run example app
cd example && flutter run

# Analyze code
flutter analyze

# Build Android native library (.so)
./scripts/build_android_so.sh

# Build iOS static library (.a)
./scripts/build_ios_static.sh

License #

MIT License

1
likes
140
points
102
downloads

Publisher

unverified uploader

Weekly Downloads

A Flutter FFI plugin for OCR with Edge AI. Uses Apple Vision (iOS) / ML Kit (Android) for text recognition and ONNX Runtime for layout detection.

Repository (GitHub)
View/report issues

Documentation

API reference

License

unknown (license)

Dependencies

ffi, flutter

More

Packages that depend on flutter_ocr_kit

Packages that implement flutter_ocr_kit