google_vision_cli 2.1.0 copy "google_vision_cli: ^2.1.0" to clipboard
google_vision_cli: ^2.1.0 copied to clipboard

A CLI tool for making API requests to Google Vision API using the google_vision package.

Google Vision CLI #

pub package

Google Vision CLI banner

A command-line interface for the Google Vision API. Wraps the google_vision Dart package to provide image detection and annotation from the terminal.

Table of Contents #

Quick Start #

# Install via Homebrew
brew tap cdavis-code/google-vision
brew install vision

# Run landmark detection
vision detect --image-file photo.jpg --features LANDMARK_DETECTION

Installation #

Homebrew #

brew tap cdavis-code/google-vision
brew install vision

# Run landmark detection
vision detect --image-file photo.jpg --features LANDMARK_DETECTION

Dart pub #

# Activate globally
dart pub global activate google_vision_cli

# Or run directly from the workspace
vision <command> [arguments]

Prerequisites #

  • A Google Cloud service account with the Vision API enabled
  • A JSON credentials file for the service account

Authentication #

The CLI authenticates via JWT only using a service account key file (API keys are not supported).

Getting a credentials file #

  1. Go to the Google Cloud Console, create or select a project
  2. Enable the Vision API
  3. Go to IAM & Admin → Service Accounts, create a new service account
  4. Select the service account, go to Keys → Add Key → Create New Key, choose JSON
  5. Download the key file and place it at the default path:
mkdir -p ~/.vision
mv ~/Downloads/project-key.json ~/.vision/credentials.json
# Place credentials at the default path
~/.vision/credentials.json

# Or specify a custom path
vision --credential-file /path/to/credentials.json <command>

Usage #

vision [global-options] <command> [command-options]

Global Options #

Option Default Description
--credential-file ~/.vision/credentials.json Path to JWT credentials JSON file
--log-level off Log level: all, debug, info, warning, error, off

Commands #

version #

Display the package name and version.

vision version

detect #

Run image detection and annotation for a batch of images or files.

Option Required Description
--image-file yes Path to the image file to process
--features yes Comma-separated list of detection types
--pages no Comma-separated list of page numbers for PDF/TIFF/GIF (max 5)
--max-results no Maximum results per feature (default: 10, max: 50)
# Label detection on an image
vision detect --image-file photo.jpg --features LABEL_DETECTION

# Multiple features on a single image
vision detect --image-file photo.jpg \
  --features LABEL_DETECTION,FACE_DETECTION,TEXT_DETECTION

# Process specific pages of a PDF
vision detect --image-file document.pdf \
  --features DOCUMENT_TEXT_DETECTION --pages 1,2,3

crop_hints #

Get crop suggestion vertices for an image.

Option Required Description
--image-file yes Path to the image file to process
--aspect-ratios no Comma-separated list of aspect ratios as floats (e.g. 1.33333 for 4:3)
--pages no Comma-separated list of page numbers for PDF/TIFF/GIF (max 5)
# Get default crop hints
vision crop_hints --image-file photo.jpg

# Specify desired aspect ratios
vision crop_hints --image-file photo.jpg \
  --aspect-ratios 1.33333,1.77778

Detect explicit content (adult, violence, medical, racy) in an image.

Option Required Description
--image-file yes Path to the image file to process
vision safe_search --image-file photo.jpg

highlight #

Draw bounding boxes around detected objects and save the result.

Option Required Description
--image-file yes Path to the image file to process
--output-file yes Path to save the annotated image
--features no Comma-separated list of detection types
--line-color no Box color: red, green, blue, white, black (default: red)
--max-results no Maximum results per feature (default: 10, max: 50)
# Highlight detected faces
vision highlight --image-file photo.jpg \
  --output-file annotated.jpg --features FACE_DETECTION

# Highlight objects and landmarks in blue
vision highlight --image-file photo.jpg \
  --output-file annotated.jpg \
  --features OBJECT_LOCALIZATION,LANDMARK_DETECTION \
  --line-color blue

markdown #

Convert the result of DOCUMENT_TEXT_DETECTION to a well-formatted markdown document. Walks the Page → Block → Paragraph → Word → Symbol hierarchy and emits headers, paragraphs, lists, tables, checkboxes, and image placeholders.

Option Required Description
--image-file yes Path to the image or PDF file to process
--output-file, -o no Path to write the markdown output (defaults to stdout)
--pages, -p no Comma-separated list of page numbers for PDF/TIFF/GIF (max 5)
--[no-]page-headers no Emit # Page N at the top of each page (default: on)
--[no-]image-placeholders no Emit placeholder links for BlockType.PICTURE blocks (default: on)
--[no-]detect-headers no Detect headers via relative symbol height (default: on)
--[no-]detect-lists no Detect bullet and ordered lists (default: on)
--[no-]detect-checkboxes no Detect checkbox glyphs and [ ] / [x] patterns (default: on)
# Convert an image to markdown on stdout
vision markdown --image-file form.jpg

# Convert pages 1 and 2 of a PDF to a file
vision markdown --image-file scan.pdf --pages 1,2 -o out.md

# Disable header detection and image placeholders
vision markdown --image-file scan.pdf --pages 1 \
  --no-detect-headers --no-image-placeholders -o out.md

score #

Get confidence scores for detected objects as a JSON array.

Option Required Description
--image-file yes Path to the image file to process
--features no Comma-separated list of detection types
--max-results no Maximum results per feature (default: 10, max: 50)
# Get face detection confidence scores
vision score --image-file photo.jpg \
  --features FACE_DETECTION

# Get label detection scores
vision score --image-file photo.jpg \
  --features LABEL_DETECTION

# Output: [0.98, 0.95, 0.87, ...]

Feature Types #

Feature Description
FACE_DETECTION Detect faces with facial attributes
LANDMARK_DETECTION Detect popular landmarks
LOGO_DETECTION Detect product logos
LABEL_DETECTION Label image content (objects, activities, etc.)
TEXT_DETECTION OCR — detect and extract text
DOCUMENT_TEXT_DETECTION Dense text/document OCR with page/block/paragraph structure
SAFE_SEARCH_DETECTION Adult/violent/explicit content likelihood
IMAGE_PROPERTIES Dominant colors and image attributes
CROP_HINTS Suggested crop region vertices
WEB_DETECTION Web references and similar images
PRODUCT_SEARCH Product search against a product set
OBJECT_LOCALIZATION Detect and locate multiple objects

See the Google Vision API documentation for details.

License #

This project is licensed under the terms included in the repository.

1
likes
160
points
257
downloads

Documentation

API reference

Publisher

verified publishercdavis.ca

Weekly Downloads

A CLI tool for making API requests to Google Vision API using the google_vision package.

Homepage
Repository (GitHub)
View/report issues

Topics

#api #image #cli #vision

Funding

Consider supporting this project:

www.buymeacoffee.com

License

MIT (license)

Dependencies

args, collection, dio, google_vision, image, loggy

More

Packages that depend on google_vision_cli