Google Vision Images REST API Client

Native Dart package that integrates Google Vision features, including image labeling, face, logo, and landmark detection, optical character recognition (OCR), and detection of explicit content, into applications.

Project Status


Please feel free to submit PRs for any additional helper methods, or report an issue for a missing helper method and I'll add it if I have time available.

Recent Changes

New for v1.2.0

  • Helper methods that simplify any single detection. For example, a face detection can be performed with the faceDetection(JsonImage jsonImage) method; see the table under New Helper Methods.

New for v1.0.8

New for v1.0.7

JLuisRojas has provided code for:

  • detect text in images
  • detect handwriting in images

In addition, support for the following has also been added:

  • detect crop hints
  • detect image properties
  • detect landmarks
  • detect logos

Getting Started


To use this package, add the dependency to your pubspec.yaml file:

  google_vision: ^1.2.1+2

Obtaining Authorization Credentials

Authenticating to the Cloud Vision API requires a JSON file with the JWT token information, which you can obtain by creating a service account in the API console.
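
Before handing the credentials file to the package, a quick sanity check can catch a wrong path or an incomplete download. The following is a minimal sketch that uses only dart:io and dart:convert. The function name missingCredentialFields is hypothetical and not part of google_vision, and the fields checked (type, client_email, private_key) are an assumption based on the usual layout of a Google service-account key file.

```dart
import 'dart:convert';
import 'dart:io';

/// Hypothetical helper (not part of google_vision): returns the names of
/// the expected service-account fields that are absent or empty in [json].
List<String> missingCredentialFields(Map<String, dynamic> json) {
  // Assumed set of fields; a real key file contains more than these.
  const expected = ['type', 'client_email', 'private_key'];
  return expected.where((key) {
    final value = json[key];
    return value is! String || value.isEmpty;
  }).toList();
}

void main() {
  final file = File('service_credentials.json');
  if (!file.existsSync()) {
    print('Credentials file not found: ${file.path}');
    return;
  }

  final decoded = jsonDecode(file.readAsStringSync());
  if (decoded is! Map<String, dynamic>) {
    print('Credentials file does not contain a JSON object.');
    return;
  }

  final missing = missingCredentialFields(decoded);
  print(missing.isEmpty
      ? 'Credentials file looks complete.'
      : 'Missing fields: ${missing.join(', ')}');
}
```

This only checks the shape of the file; whether the key is actually accepted is decided when the first request is made.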

Usage of the Cloud Vision API

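import 'package:google_vision/google_vision.dart';

// Authenticate with the service account's JSON credentials file.
final googleVision =
    await GoogleVision.withJwtFile('service_credentials.json');

// jsonImage is a JsonImage wrapping the image to analyze.
final faceAnnotationResponses = await googleVision.faceDetection(jsonImage);

for (var faceAnnotation in faceAnnotationResponses) {
  print('Face - ${faceAnnotation.detectionConfidence}');

  print('Joy - ${faceAnnotation.enumJoyLikelihood}');
}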

New Helper Methods

Method Signature / Description

Future<AnnotateImageResponse> detection(
  JsonImage jsonImage,
  AnnotationType annotationType,
  {int maxResults = 10,}
)
Lower-level method for a single detection type, as specified by annotationType.

Future<CropHintsAnnotation?> cropHints(
  JsonImage jsonImage,
  {int maxResults = 10,}
)
Crop Hints suggests vertices for a crop region on an image.

Future<FullTextAnnotation?> documentTextDetection(
  JsonImage jsonImage,
  {int maxResults = 10,}
)
Extracts text from an image (or file); the response is optimized for dense text and documents. The response includes page, block, paragraph, word, and break information. A specific use of documentTextDetection is to detect handwriting in an image.

Future<List<FaceAnnotation>> faceDetection(
  JsonImage jsonImage,
  {int maxResults = 10,}
)
Face Detection detects multiple faces within an image, along with associated key facial attributes such as emotional state or wearing headwear.

Future<ImagePropertiesAnnotation?> imageProperties(
  JsonImage jsonImage,
  {int maxResults = 10,}
)
The Image Properties feature detects general attributes of the image, such as dominant color.

Future<List<EntityAnnotation>> labelDetection(
  JsonImage jsonImage,
  {int maxResults = 10,}
)
Labels can identify general objects, locations, activities, animal species, products, and more. Labels are returned in English only.

Future<List<EntityAnnotation>> landmarkDetection(
  JsonImage jsonImage,
  {int maxResults = 10,}
)
Landmark Detection detects popular natural and human-made structures within an image.

Future<List<EntityAnnotation>> logoDetection(
  JsonImage jsonImage,
  {int maxResults = 10,}
)
Logo Detection detects popular product logos within an image.

Future<List<LocalizedObjectAnnotation>> objectLocalization(
  JsonImage jsonImage,
  {int maxResults = 10,}
)
Object Localization detects and extracts multiple objects in an image, both significant and less prominent, and provides a LocalizedObjectAnnotation for each one. Each LocalizedObjectAnnotation identifies information about the object, the position of the object, and rectangular bounds for the region of the image that contains the object.

Future<SafeSearchAnnotation?> safeSearchDetection(
  JsonImage jsonImage,
  {int maxResults = 10,}
)
SafeSearch Detection detects explicit content such as adult content or violent content within an image. This feature uses five categories (adult, spoof, medical, violence, and racy) and returns the likelihood that each is present in a given image. See the SafeSearchAnnotation page for details on these fields.

Future<List<EntityAnnotation>> textDetection(
  JsonImage jsonImage,
  {int maxResults = 10,}
)
Detects and extracts text from any image. For example, a photograph might contain a street sign or traffic sign. The JSON includes the entire extracted string, as well as individual words and their bounding boxes.

Future<WebDetection?> webDetection(
  JsonImage jsonImage,
  {int maxResults = 10,}
)
Web Detection detects web references to an image.

Usage with Flutter

For a quick introduction to using Google Vision in a Flutter app, take a look at the google_vision_flutter package and the example folder of the project's GitHub repository.

If the Flutter-specific Google Vision widget doesn't suit your requirements, then to work with Flutter it's usually necessary to convert an object that is presented as an Asset or a Stream into a File for use by this google_vision package. This StackOverflow post gives an idea of how this can be accomplished. A similar process can be used for any Stream of data that represents an image supported by google_vision. Essentially, the image data must be converted into its Base64 representation before it is submitted to the Google server, and having the byte data available in the code makes this easier.
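
The Base64 step itself can be sketched in plain Dart. The example below assumes the image is already in memory as a ByteData (the type that Flutter's asset loading returns) and uses only dart:convert and dart:typed_data. The function name byteDataToBase64 is hypothetical and not part of google_vision, and the sample bytes are just the 8-byte PNG file signature standing in for a real image.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Hypothetical helper (not part of google_vision): Base64-encodes the
/// bytes that [data] views, without copying them first.
String byteDataToBase64(ByteData data) {
  final bytes = data.buffer.asUint8List(data.offsetInBytes, data.lengthInBytes);
  return base64Encode(bytes);
}

void main() {
  // Stand-in data: the 8-byte signature that starts every PNG file.
  final pngSignature = Uint8List.fromList(
      [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]);
  final data = ByteData.sublistView(pngSignature);

  print(byteDataToBase64(data)); // prints iVBORw0KGgo=
}
```

Passing offsetInBytes and lengthInBytes matters when the ByteData is a view onto part of a larger buffer; without them, the whole underlying buffer would be encoded.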

Vision cli (google_vision at the command prompt)

This package includes a CLI utility that can be used to return data for any API call currently supported by the package. To get started quickly with the CLI utility, run these commands in a terminal session:

Install using dart pub:

dart pub global activate google_vision

Run the following command to see help:

vision --help


A command line interface for making API requests to the Google Vision.

Usage: vision <command> [arguments]

Global options:
-h, --help                                       Print this usage information.
    --credential-file=<credentials file path>    (defaults to "/Users/chris/.vision/credentials.json")

Available commands:
  crop_hints  Set of crop hints that are used to generate new crops when serving images.
  detect      Run image detection and annotation for an images.
  highlight   Draw a box to highlight any objects detected.
  safe_search SafeSearch Detection detects explicit content such as adult content or violent content within an image.
  score       For OBJECT_LOCALIZATION, get the score(s) for the object specified with "look-for".

Please see the cli documentation for more detailed usage information.


Cloud Vision API - Documentation Reference



Any help from the open-source community is always welcome and needed:

  • Found an issue?
    • Please file a bug report with details.
  • Need a feature?
    • Open a feature request with use cases.
  • Are you using and liking the project?
    • Promote the project: create an article or post about it
    • Make a donation
  • Do you have a project that uses this package?
    • Let's cross-promote: let me know and I'll add a link to your project.
  • Are you a developer?
    • Fix a bug and send a pull request.
    • Implement a new feature.
    • Improve the Unit Tests.
  • Have you already helped in any way?
    • Many thanks from me, the contributors and everybody that uses this project!

If you donate 1 hour of your time, you can contribute a lot, because others will do the same. Just take part and start with your 1 hour.

