Google Vision Images REST API Client

License: MIT

Native Dart package that integrates Google Vision features, including image labeling, face, logo, and landmark detection, optical character recognition (OCR), and detection of explicit content, into your applications.

If you want to integrate the Google Vision API into a Flutter application, take a look at my related package google_vision_flutter, which provides a widget that wraps the functionality of this Dart-focused package.

Please feel free to submit PRs for any additional helper methods, or report an issue for a missing helper method and I'll add it if I have time available.

Recent Changes

New for v1.4.0

  • Breaking change: the GoogleVision class now follows the Singleton design pattern. The object is now instantiated as follows:

// Old method from v1.3.x and earlier
// final googleVision = await GoogleVision.withJwtFile('service_credentials.json');

// New
final googleVision = await GoogleVision().withJwtFile('service_credentials.json');
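
For readers unfamiliar with the pattern, here is a minimal sketch of how a singleton with this call shape can be written in Dart. VisionClient and its apiKey field are hypothetical names used only for illustration; this is not the package's actual GoogleVision implementation.

```dart
// Hypothetical class, used only to illustrate the Singleton pattern in Dart.
// It is not the package's actual GoogleVision implementation.
class VisionClient {
  // The single shared instance, created once on first access.
  static final VisionClient _instance = VisionClient._internal();

  // A private named constructor prevents direct instantiation elsewhere.
  VisionClient._internal();

  // The factory constructor always returns the shared instance.
  factory VisionClient() => _instance;

  String? apiKey;

  // Mirrors the "configure, then return the instance" call shape.
  Future<VisionClient> withApiKey(String key) async {
    apiKey = key;
    return this;
  }
}

Future<void> main() async {
  final a = await VisionClient().withApiKey('demo-key');
  final b = VisionClient();

  print(identical(a, b)); // true
  print(b.apiKey); // demo-key
}
```

Because every call to the constructor yields the same object, configuration applied once is visible wherever the instance is obtained later, which is why the configuring method moves from a static call to an instance call.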

New for v1.3.0

  • This version of the package supports both the image and file annotation APIs for Google Vision. The previous versions of the package supported only the image API.
  • A number of methods and classes have been deprecated in this version. All the provided examples still work without any changes, so the changes in this package should not cause any issues for existing code.
  • The file functionality added in this release allows for the annotation of file formats that have pages or frames, specifically PDF, TIFF and GIF. Google Vision allows annotation of up to 5 pages/frames.

New for v1.2.0

  • Helper methods that simplify any single detection, so a simple face detection can be performed with the faceDetection(JsonImage jsonImage) method; see the table below.

Getting Started

pubspec.yaml

To use this package, add the dependency to your pubspec.yaml file:

dependencies:
  ...
  google_vision: ^1.4.0+1

Obtaining Authentication/Authorization Credentials

Authenticating to the Cloud Vision API can be done with one of two methods:

  • The first method requires a JSON file with the JWT token information, which you can obtain by creating a service account in the API console.
  • The second method requires an API key to be created.

Both of the authorization/authentication methods listed above assume that you already have a Google account, you have created a Google Cloud project and you have enabled the Cloud Vision API in the Google API library.

Usage of the Cloud Vision API

import 'dart:io';

import 'package:google_vision/google_vision.dart';

void main() async {
  // Read the API key from the environment, falling back to a placeholder.
  final googleVision = await GoogleVision().withApiKey(
    Platform.environment['GOOGLE_VISION_API_KEY'] ?? '[YOUR API KEY]',
    // additionalHeaders: {'X-Ios-Bundle-Identifier': 'com.xxx.xxx'},
  );

  print('checking...');

  // Run face detection on an image stored in Google Cloud Storage.
  final faceAnnotationResponses = await googleVision.image.faceDetection(
      JsonImage.fromGsUri(
          'gs://gvision-demo/young-man-smiling-and-thumbs-up.jpg'));

  for (var faceAnnotation in faceAnnotationResponses) {
    print('Face - ${faceAnnotation.detectionConfidence}');

    print('Joy - ${faceAnnotation.enumJoyLikelihood}');
  }

  // Output:
  // Face - 0.9609375
  // Joy - Likelihood.UNLIKELY

  print('done.');
}

New Helper Methods

Method Signature
Description

Future<AnnotateImageResponse> detection(
  JsonImage jsonImage,
  AnnotationType annotationType,
  {ImageContext? imageContext,
  int maxResults = 10,}
)
Lower-level method for a single detection type, as specified by annotationType.

Future<CropHintsAnnotation?> cropHints(
  JsonImage jsonImage,
  {ImageContext? imageContext,
  int maxResults = 10,}
)
Crop Hints suggests vertices for a crop region on an image.

Future<FullTextAnnotation?> documentTextDetection(
  JsonImage jsonImage,
  {ImageContext? imageContext,
  int maxResults = 10,}
)
Extracts text from an image (or file); the response is optimized for dense text and documents. The JSON includes page, block, paragraph, word, and break information. A specific use of documentTextDetection is to detect handwriting in an image.

Future<List<FaceAnnotation>> faceDetection(
  JsonImage jsonImage,
  {ImageContext? imageContext,
  int maxResults = 10,}
)
Face Detection detects multiple faces within an image along with the associated key facial attributes such as emotional state or wearing headwear.

Future<ImagePropertiesAnnotation?> imageProperties(
  JsonImage jsonImage,
  {ImageContext? imageContext,
  int maxResults = 10,}
)
The Image Properties feature detects general attributes of the image, such as dominant color.

Future<List<EntityAnnotation>> labelDetection(
  JsonImage jsonImage,
  {ImageContext? imageContext,
  int maxResults = 10,}
)
Labels can identify general objects, locations, activities, animal species, products, and more. Labels are returned in English only.

Future<List<EntityAnnotation>> landmarkDetection(
  JsonImage jsonImage,
  {ImageContext? imageContext,
  int maxResults = 10,}
)
Landmark Detection detects popular natural and human-made structures within an image.

Future<List<EntityAnnotation>> logoDetection(
  JsonImage jsonImage,
  {ImageContext? imageContext,
  int maxResults = 10,}
)
Logo Detection detects popular product logos within an image.

Future<List<LocalizedObjectAnnotation>> objectLocalization(
  JsonImage jsonImage,
  {ImageContext? imageContext,
  int maxResults = 10,}
)
The Vision API can detect and extract multiple objects in an image with Object Localization. Object localization identifies multiple objects in an image and provides a LocalizedObjectAnnotation for each object in the image. Each LocalizedObjectAnnotation identifies information about the object, the position of the object, and rectangular bounds for the region of the image that contains the object. Object localization identifies both significant and less-prominent objects in an image.

Future<SafeSearchAnnotation?> safeSearchDetection(
  JsonImage jsonImage,
  {ImageContext? imageContext,
  int maxResults = 10,}
)
SafeSearch Detection detects explicit content such as adult content or violent content within an image. This feature uses five categories (adult, spoof, medical, violence, and racy) and returns the likelihood that each is present in a given image. See the SafeSearchAnnotation page for details on these fields.

Future<List<EntityAnnotation>> textDetection(
  JsonImage jsonImage,
  {ImageContext? imageContext,
  int maxResults = 10,}
)
Detects and extracts text from any image. For example, a photograph might contain a street sign or traffic sign. The JSON includes the entire extracted string, as well as individual words, and their bounding boxes.

Future<WebDetection?> webDetection(
  JsonImage jsonImage,
  {ImageContext? imageContext,
  int maxResults = 10,}
)
Web Detection detects web references to an image.

Usage with Flutter

For a quick intro to using Google Vision in a Flutter app, take a look at the google_vision_flutter package and the example folder of the project's GitHub repository.

If the Flutter-specific Google Vision widget doesn't meet your requirements, then to work with Flutter it's usually necessary to convert an object that is presented as an Asset or a Stream into a File for use by this google_vision package. This StackOverflow post gives an idea of how this can be accomplished. A similar process can be used for any Stream of data that represents an image supported by google_vision. Essentially, the image data has to be converted into its Base64 representation before it is submitted to the Google server through the Google Vision REST API, and having the byte data available in the code makes this easier.
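
The conversion described above can be sketched in plain Dart, as a rough illustration only. The function names collectBytes, bytesFromByteData and toBase64 are hypothetical helpers, not part of google_vision, and the step that hands the bytes to the package is deliberately left out because it depends on the constructors your version of the package provides.

```dart
import 'dart:async';
import 'dart:convert';
import 'dart:typed_data';

/// Collects every chunk of [source] into one contiguous byte buffer.
/// (Hypothetical helper, not part of google_vision.)
Future<Uint8List> collectBytes(Stream<List<int>> source) async {
  final builder = BytesBuilder();
  await for (final chunk in source) {
    builder.add(chunk);
  }
  return builder.takeBytes();
}

/// Views the bytes behind a [ByteData], the type Flutter returns when an
/// asset is loaded, without copying them.
Uint8List bytesFromByteData(ByteData data) =>
    data.buffer.asUint8List(data.offsetInBytes, data.lengthInBytes);

/// Base64-encodes raw image bytes, the form a REST request body carries.
String toBase64(Uint8List bytes) => base64Encode(bytes);

Future<void> main() async {
  // Stand-in for an image stream: the first four bytes of a PNG header.
  final stream = Stream<List<int>>.fromIterable([
    [0x89, 0x50],
    [0x4E, 0x47],
  ]);

  final bytes = await collectBytes(stream);
  print(bytes.length); // 4
  print(toBase64(bytes)); // iVBORw==
}
```

Keeping the bytes in memory like this avoids writing a temporary file when the image never existed as a file in the first place.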

Vision CLI (google_vision at the command prompt)

This package includes a CLI utility that can be used to return data for any API call currently supported by the package. If you want to get started quickly with the CLI utility, run these commands in a terminal session:

Install using dart pub:

dart pub global activate google_vision

Run the following command to see help:

vision --help

Result:

A command line interface for making API requests to the Google Vision.

Usage: vision <command> [arguments]

Global options:
-h, --help                                       Print this usage information.
    --credential-file=<credentials file path>    (defaults to "/Users/chris/.vision/credentials.json")

Available commands:
  crop_hints  Set of crop hints that are used to generate new crops when serving images.
  detect      Run image detection and annotation for an images.
  highlight   Draw a box to highlight any objects detected.
  safe_search SafeSearch Detection detects explicit content such as adult content or violent content within an image.
  score       For OBJECT_LOCALIZATION, get the score(s) for the object specified with "look-for".

Please see the CLI documentation README.md for more detailed usage information.

Reference

Cloud Vision API - Documentation Reference

Contributing

Any help from the open-source community is always welcome and needed:

  • Found an issue?
    • Please file a bug report with details.
  • Need a feature?
    • Open a feature request with use cases.
  • Are you using and liking the project?
    • Promote the project: create an article or post about it.
    • Make a donation.
  • Do you have a project that uses this package?
    • Let's cross-promote; let me know and I'll add a link to your project.
  • Are you a developer?
    • Fix a bug and send a pull request.
    • Implement a new feature.
    • Improve the Unit Tests.
  • Have you already helped in any way?
    • Many thanks from me, the contributors and everybody that uses this project!

If you donate 1 hour of your time, you can contribute a lot, because others will do the same. Just be part of it and start with your 1 hour.
