Tesseract OCR for Flutter
A Flutter plugin that provides Optical Character Recognition (OCR) capabilities using Tesseract (v4.x) and Apple Vision (iOS).
This plugin utilizes:
- Android: Tesseract4Android
- iOS: SwiftyTesseract (with support for v4.0.1 via custom CocoaPods) and Apple's Vision framework.
Features
- Perform OCR on images to extract text.
- Support for multiple OCR engines: Tesseract (iOS/Android), Apple Vision (iOS).
- Configurable OCR options (language, engine mode, page segmentation mode, etc.) using
OCRConfig
. - Supports the latest Dart and Android SDKs.
- Includes a custom CocoaPods setup to enable using SwiftyTesseract 4.0.1 on iOS.
Getting Started
1. Add Dependency
Add tesseract_ocr
to your pubspec.yaml
:
dependencies:
tesseract_ocr: ^<latest_version> # Replace <latest_version> with the current version
Then run flutter pub get
.
2. Add Trained Data (Required for Tesseract Engine)
For the Tesseract engine to work, you need to include language trained data files (.traineddata
) and a configuration file (tessdata_config.json
) in your Flutter app's assets.
-
Create
assets
andassets/tessdata
folders in the root of your Flutter project. -
Download Trained Data: Get the necessary
.traineddata
files for the languages you need from the Tesseracttessdata
,tessdata_best
, ortessdata_fast
repositories. Place them inside yourassets/tessdata
folder. For English, you'll typically needeng.traineddata
. -
Create
tessdata_config.json
: Create a file namedtessdata_config.json
directly inside yourassets
folder (not inassets/tessdata
). This file should list all the.traineddata
files present in yourassets/tessdata
folder.Example
assets/tessdata_config.json
:{ "files": [ "eng.traineddata", "fas.traineddata", "urd.traineddata" // Add all other .traineddata filenames here ] }
-
Declare Assets in
pubspec.yaml
: Add yourassets
andassets/tessdata
directories to theassets
section of yourpubspec.yaml
:flutter: assets: - assets/tessdata_config.json - assets/tessdata/
Run
flutter pub get
again.
The plugin will automatically copy these trained data files to the application's documents directory on the first run if they are not already present.
Note on Asset Loading Issues: If you encounter asset loading errors or "Data path must not be null!" errors, ensure your setup follows these exact requirements:
-
Directory Structure:
your_project/ ├── assets/ │ ├── tessdata/ │ │ ├── eng.traineddata │ │ └── [other language files] │ └── tessdata_config.json └── pubspec.yaml
-
pubspec.yaml assets section:
flutter: assets: - assets/ - assets/tessdata/
-
tessdata_config.json content:
{ "files": [ "eng.traineddata" ] }
Troubleshooting:
- Ensure
.traineddata
files are inassets/tessdata/
directory - Verify all language files listed in
tessdata_config.json
actually exist - Try adding individual file entries in pubspec.yaml if assets fail to load:
assets: - assets/tessdata_config.json - assets/tessdata/eng.traineddata
Common Issues and Solutions
"Data path must not be null!" Error
This error typically occurs when:
- The tessdata files are not properly loaded from assets
- The
tessdata_config.json
file is missing or incorrectly formatted - Asset paths in
pubspec.yaml
are not correctly configured
Solution: Follow the exact directory structure and configuration shown above.
"Unable to load asset" Error
This happens when Flutter cannot find the specified asset files.
Solution:
- Verify file names match exactly between
tessdata_config.json
and actual files - Ensure proper asset declarations in
pubspec.yaml
- Try running
flutter clean
andflutter pub get
Path Duplication Issues (M1 Macs)
Some users on M1 Macs may see paths like assets/tessdata/assets/tessdata/file.traineddata
.
Solution: The plugin automatically handles this, but ensure you're using the latest version.
iOS Specific Setup (SwiftyTesseract 4.0.1 via Custom CocoaPods)
SwiftyTesseract 4.0.x uses Swift Package Manager and removed CocoaPods support. To allow this plugin to use SwiftyTesseract 4.0.1 via CocoaPods, a custom setup is provided.
Important: This requires manual steps in your iOS project.
-
Download
libtesseract.xcframework
: SwiftyTesseract 4.0.1 depends onlibtesseract
version 0.2.0, which is also not available on CocoaPods. You need to download the pre-built binary framework. Getlibtesseract.xcframework.zip
for version0.2.0
from the libtesseract GitHub Releases page. -
Extract and Place
libtesseract.xcframework
: Unzip the downloaded file. Place the resultinglibtesseract.xcframework
folder directly into your Flutter app'sios
directory (the same directory where your mainPodfile
is located).Your app's
ios
directory structure should look like this:your_flutter_app/ios/ ├── Runner.xcodeproj ├── Runner.xcworkspace ├── Podfile <-- Your main Podfile ├── libtesseract.xcframework <-- Place the extracted folder here └── ... (other iOS files)
-
Configure Your App's Podfile: In your Flutter app's
ios/Podfile
, you need to reference the custom podspecs provided within thetesseract_ocr
plugin using the:path
option.Locate your
target 'Runner'
block and add the following lines:# In your main Flutter app's Podfile (e.g., your_flutter_app/ios/Podfile) # ... other Podfile content ... target 'Runner' do use_frameworks! use_modular_headers! # This line should already be here if you added the plugin via pubspec.yaml pod 'tesseract_ocr', :path => '../.symlinks/plugins/tesseract_ocr/' # Add the custom SwiftyTesseract401 and libtesseract pods # These paths are relative from YOUR app's ios directory to the plugin's ios directory # Adjust the '../../.symlinks/plugins/tesseract_ocr/ios' path if necessary pod 'SwiftyTesseract401', :path => '../../.symlinks/plugins/tesseract_ocr/ios' pod 'libtesseract', :path => '../../.symlinks/plugins/tesseract_ocr/ios' # flutter_install_all_ios_pods File.dirname(File.realpath(__FILE__)) # Keep this line end # ... other Podfile content (e.g., post_install hook) ...
Note: The
:path
value../../.symlinks/plugins/tesseract_ocr/ios
is the standard path from your app'sios
directory to a plugin'sios
directory when using Flutter. If your project structure is different, you may need to adjust this path. -
Install Pods: Navigate to your
ios
directory in the terminal and runpod install
:cd your_flutter_app/ios pod install
Android SDK Support
The plugin is configured to compile against the latest Android SDK (API 35) and has a minimum SDK version of 16, providing broad compatibility.
Usage
Import the package:
import 'package:tesseract_ocr/tesseract_ocr.dart';
import 'package:tesseract_ocr/ocr_engine_config.dart';
To perform OCR, use the TesseractOcr.extractText
method. You can pass an optional OCRConfig
object to specify the engine and other options.
Future<void> _performOcr(String imagePath) async {
try {
// Default usage (uses OCREngine.defaultEngine, language 'eng')
// String extractedText = await TesseractOcr.extractText(imagePath);
// Example: Using Tesseract engine with a specific language
final tesseractConfig = OCRConfig(
language: 'eng', // Must match a .traineddata file in assets/tessdata
engine: OCREngine.tesseract,
// Optional Tesseract options:
// options: {
// TesseractConfig.preserveInterwordSpaces: '1',
// TesseractConfig.pageSegMode: PageSegmentationMode.autoOsd,
// TesseractConfig.debugFile: '/path/to/debug.log', // Example option
// },
);
String extractedTextTesseract = await TesseractOcr.extractText(
imagePath,
config: tesseractConfig,
);
print('Extracted Text (Tesseract): $extractedTextTesseract');
// Example: Using Apple Vision engine (iOS only)
final visionConfig = OCRConfig(
engine: OCREngine.vision,
language: 'eng', // Vision engine may also use language hint
);
// Check if running on iOS before using Vision
if (Platform.isIOS) {
String extractedTextVision = await TesseractOcr.extractText(
imagePath,
config: visionConfig,
);
print('Extracted Text (Vision): $extractedTextVision');
}
} catch (e) {
print('Error performing OCR: $e');
}
}
OCRConfig
The OCRConfig
class allows detailed configuration of the OCR process:
language
(String): The language code (e.g., 'eng', 'fas'). Required for the Tesseract engine, usually corresponds to the.traineddata
file prefix. Can be used as a hint for the Vision engine. Defaults to 'eng'.engine
(OCREngine
): The OCR engine to use (OCREngine.vision
,OCREngine.tesseract
, orOCREngine.defaultEngine
). Defaults toOCREngine.defaultEngine
(Vision on iOS, Tesseract on Android).tessDataPath
(String?): Internal Use. The plugin automatically handles loading tessdata from assets; you typically don't need to set this.options
(Map<String, dynamic>?): A map of additional configuration options.- For the Tesseract engine, these are passed directly to the Tesseract API. Use keys from
TesseractConfig
or any valid Tesseract configuration variable name (e.g.,'preserve_interword_spaces'
,'tessedit_pageseg_mode'
). Values should be strings. - For the Vision engine (iOS), limited options might be supported depending on the native implementation (currently, primarily language hint via
language
).
- For the Tesseract engine, these are passed directly to the Tesseract API. Use keys from
See ocr_engine_config.dart
for the OCREngine
, TesseractConfig
, and PageSegmentationMode
enums and classes.
Example
Check the example
directory for a complete Flutter app demonstrating how to use the plugin, select images, choose the OCR engine, and display results.
Libraries
- ocr_engine_config
- OCR Engine configuration for tesseract_ocr plugin.
- tesseract_ocr
- tesseract_ocr_method_channel
- tesseract_ocr_platform_interface