vision_ai 0.4.0
On-device hand gesture recognition and facial emotion detection for Flutter. Runs at 25+ FPS with zero cloud dependencies.
0.4.0 #
Features — body pose engine #
- On-device body pose detection. New `PoseConfig` enables the MediaPipe Pose Landmarker (LIVE_STREAM) on the live feed; each detected pose carries a 33-point skeleton via the new `PoseResult`: normalized `landmarks` (image coords) plus `worldLandmarks` (metres, origin at the hip midpoint).
- `NormalizedLandmark` gains optional `visibility` / `presence` (populated for pose, null for hand; additive, no breaking change).
- `VisionResult` gains `poses`, `hasPoses`, and `primaryPose`; `ModelPaths` gains `poseModel`.
- New `VisionAi.pose` factory and `updatePoseConfig(PoseConfig)` to hot-swap `numPoses` / confidence thresholds while running (the model itself is start-time-only).
- Native: new `PoseDetectionProcessor` (Android + iOS) wrapping the MediaPipe Pose Landmarker with GPU→CPU fallback. The pose model path arrives via the `poseModelPath` channel key (`enablePose` + `numPoses` / `minPose*Confidence`).
- The ergonomic API, bundled pose model, pure-Dart analytics (joint angle, rep counter, posture, fall), and skeleton overlay live in the companion `vision_ai_pose` package.
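As an illustration of the kind of joint-angle analytic mentioned above, here is a minimal pure-Dart sketch. It is not the `vision_ai_pose` implementation; `Point3` and `jointAngleDegrees` are hypothetical names, and the sketch simply assumes three landmarks (for example hip, knee, ankle) given in world coordinates.

```dart
import 'dart:math' as math;

/// Hypothetical 3-D point type for illustration; not the package's landmark class.
class Point3 {
  final double x, y, z;
  const Point3(this.x, this.y, this.z);
}

/// Returns the angle at joint [b], in degrees, between segments b->a and b->c.
double jointAngleDegrees(Point3 a, Point3 b, Point3 c) {
  final ux = a.x - b.x, uy = a.y - b.y, uz = a.z - b.z;
  final vx = c.x - b.x, vy = c.y - b.y, vz = c.z - b.z;
  final dot = ux * vx + uy * vy + uz * vz;
  final norms = math.sqrt(ux * ux + uy * uy + uz * uz) *
      math.sqrt(vx * vx + vy * vy + vz * vz);
  if (norms == 0) return double.nan; // degenerate: two of the points coincide
  // Clamp guards against rounding pushing the cosine slightly outside [-1, 1].
  final cosine = (dot / norms).clamp(-1.0, 1.0);
  return math.acos(cosine) * 180 / math.pi;
}
```

Using world coordinates rather than normalized image coordinates keeps the angle independent of image aspect ratio, which is one reason a metric skeleton is convenient for this kind of measurement.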
Notes #
- Pose landmarks cross the channel packed as a stride-5 `Float64List` (`[x, y, z, visibility, presence]`) plus a stride-3 world `Float64List`. Model loading uses the same de-bundled path as the other engines: `setModelAssetBuffer` (Android) / `modelAssetPath` (iOS).
- `PoseLandmarker` ships in the already-bundled `tasks-vision` / `MediaPipeTasksVision`, so there is no new native dependency.
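The stride-5 packing described above can be decoded roughly as follows. This is a sketch under assumptions, not the plugin's actual decoder: `DecodedLandmark` and `unpackPoseLandmarks` are hypothetical names, and only the field order `[x, y, z, visibility, presence]` is taken from the note.

```dart
import 'dart:typed_data';

/// Hypothetical holder for one decoded pose landmark (illustration only).
class DecodedLandmark {
  final double x, y, z, visibility, presence;
  const DecodedLandmark(
      this.x, this.y, this.z, this.visibility, this.presence);
}

/// Decodes a flat [x, y, z, visibility, presence, ...] buffer into landmarks.
List<DecodedLandmark> unpackPoseLandmarks(Float64List packed) {
  const stride = 5;
  if (packed.length % stride != 0) {
    throw ArgumentError(
        'Length ${packed.length} is not a multiple of $stride');
  }
  return [
    for (var i = 0; i < packed.length; i += stride)
      DecodedLandmark(packed[i], packed[i + 1], packed[i + 2],
          packed[i + 3], packed[i + 4]),
  ];
}
```

With a 33-point skeleton, one pose would occupy 33 × 5 = 165 doubles in such a buffer. A flat typed list avoids allocating a nested list or map per landmark on every frame.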
0.3.0 #
Breaking #
- Core no longer bundles any ML models. The hand-gesture and face-emotion models are loaded from file paths supplied at runtime via a new required `models` parameter (`ModelPaths`) on `VisionAi` / `VisionAi.hand` / `VisionAi.face`. Obtain the paths from the new `vision_ai_models` package (`await VisionAiModels.ensureLoaded()`), or build a `ModelPaths` yourself. This keeps an app's footprint to only the models it uses.
- The `startCamera` platform-channel call now carries `handModelPath` / `emotionModelPath`; native fails with a `MODEL_PATH_MISSING` error if a required path is absent.
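To make the `MODEL_PATH_MISSING` behavior concrete, the following Dart sketch mirrors the kind of check the native side might perform on the `startCamera` arguments. It is purely illustrative: `missingModelPaths` is a hypothetical helper, and the mapping of the hand engine to `handModelPath` and the face engine to `emotionModelPath` is an assumption based on the key names.

```dart
/// Hypothetical mirror of the native check: returns the required model-path
/// keys that are absent or empty in the startCamera arguments.
List<String> missingModelPaths(
  Map<String, Object?> args, {
  required bool handEnabled,
  required bool faceEnabled,
}) {
  // Assumption: each enabled engine needs exactly one path key.
  final needed = <String>[
    if (handEnabled) 'handModelPath',
    if (faceEnabled) 'emotionModelPath',
  ];
  return [
    for (final key in needed)
      if (args[key] is! String || (args[key] as String).isEmpty) key,
  ];
}
```

An app could run a check like this before starting the camera to surface a clearer message than the platform error; whether the plugin does so itself is not stated in this changelog.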
Features — animal detection engine #
- On-device animal detection. New `AnimalConfig` enables the MediaPipe Object Detector (LIVE_STREAM) on the live feed; each detection carries a COCO class label, score, and bounding box via the new `AnimalResult`.
- Breed/species classification. `AnimalConfig.detectBreed` runs a MediaPipe Image Classifier on each classifiable detection's box region (crop-free, via region-of-interest), attaching `breed` / `breedConfidence` / `breedTopK` to the result.
- `VisionResult` gains `animals`, `hasAnimals`, and `primaryAnimal`.
- New `updateAnimalConfig(AnimalConfig)` hot-swaps detection thresholds, category filters, and the breed toggle while running (the model itself is start-time-only).
- Full detector/classifier capability surface exposed via `AnimalConfig`: `breedScoreThreshold`, `breedAllowedCategories`, `breedDeniedCategories`, `displayNamesLocale`, `rotationDegrees` (start-time-only), and `breedClassifiableCategories`, which now defaults to `{cat, dog, bird}`. `AnimalResult` carries `categoryIndex` and `displayName`.
- Native: new `AnimalDetectionProcessor` (Android + iOS) wrapping the MediaPipe Object Detector with GPU→CPU fallback, plus `AnimalBreedClassifier`. The animal model paths arrive via the `animalModelPath` / `breedModelPath` channel keys.
- The ergonomic API, bundled animal models, pure-Dart analytics (tracking, counting, presence, motion, proximity, zones), and overlay widgets live in the companion `vision_ai_animals` package.
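The threshold and category filters mentioned above can be pictured with a small pure-Dart sketch. It does not reproduce `AnimalConfig` or `AnimalResult`; `Detection` and `filterDetections` are hypothetical, and treating an empty allow set as "allow everything" is an assumption of this example.

```dart
/// Hypothetical detection record (illustration only; not AnimalResult).
class Detection {
  final String label;
  final double score;
  const Detection(this.label, this.score);
}

/// Keeps detections at or above [scoreThreshold] whose label passes the lists.
/// An empty [allowed] set means "allow every label"; [denied] always excludes.
List<Detection> filterDetections(
  List<Detection> detections, {
  double scoreThreshold = 0.5,
  Set<String> allowed = const {},
  Set<String> denied = const {},
}) {
  return detections.where((d) {
    if (d.score < scoreThreshold) return false;
    if (denied.contains(d.label)) return false;
    return allowed.isEmpty || allowed.contains(d.label);
  }).toList();
}
```

MediaPipe's own detector options document the allowlist and denylist as mutually exclusive; this sketch tolerates both being set only to keep the example short.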
Notes #
- Android loads MediaPipe models via `setModelAssetBuffer` (a direct `ByteBuffer` read from the file) since `setModelAssetPath` only resolves bundled assets; iOS passes the absolute path through `modelAssetPath`.
Platforms #
- iOS (hand, face, and the new animal engine) is verified working on a physical device (iPhone 16 Pro Max). iOS moves from Beta to Stable; broader device coverage welcome.
0.2.0 - 2026-05-30 #
Features #
- Live camera switching: `switchCamera()` now flips between the front and back camera in place while running, with no stop/start cycle. The preview keeps rendering (the texture stays valid) and detection continues uninterrupted. Previously this was effectively a no-op that required a manual restart.
Performance (iOS) #
- Hand gesture recognition now uses the GPU (Metal) delegate with automatic CPU fallback, matching Android (was CPU-only).
- Emotion preprocessing reuses per-frame buffers via `PixelBufferPool` and fills the TFLite input buffer in place, removing repeated per-frame allocations in the face pipeline.
Behavior #
- `detectLandmarks` is now consistently a start-time-only option on both platforms: `updateFaceConfig` preserves the value set at `startCamera` rather than changing it (Android could previously reset it at runtime). Toggle it via `startCamera` / restart.
Fixes & cleanup #
- Removed dead `CameraManager.stop()` on Android (`stopCamera` already routes through `release()`).
- Clarified in docs that `minEmotionConfidence` is currently stored but not enforced (no behavior change).
iOS note: these iOS items have since been verified working on a physical device (iPhone 16 Pro Max). Coverage across more iOS models is still welcome.
0.1.1 - 2026-05-25 #
Fixes #
- Fixed release build crash caused by R8 code shrinking stripping MediaPipe's stack-walking classes; added ProGuard consumer rules
- Fixed `updateFaceConfig` silently resetting `detectLandmarks` to false; it now properly preserves the value
- Fixed `onDetachedFromEngine` closing ML processors on the main thread instead of the analysis thread; prevents race condition crashes
- Fixed `EmotionClassifier.loadModelFile()` leaking AssetFileDescriptor and FileInputStream; it now uses Kotlin `.use {}` auto-close
- Fixed missing `ON_CREATE` lifecycle event in `PluginLifecycleOwner`; prevents potential crashes on some AndroidX versions
- Fixed missing `import Flutter` in iOS `HandGestureProcessor.swift` and `FaceDetectionProcessor.swift`
- Removed dead code (`pixelBufferToSampleBuffer` always returning nil on iOS)
Improvements #
- Added comprehensive inline comments to all Kotlin, Swift, and Dart source files
- Replaced `setState` with `ValueNotifier` + `ValueListenableBuilder` in example app
- Settings panel now uses grouped cards (Hand Detection, Face Detection, Camera, Overlays)
- Disabling hand/face detection now hides related settings and resets sub-features
- Updated README with detailed API documentation, use cases, and release build instructions
- Added iOS setup instructions (NSCameraUsageDescription)
- Added SVG media files for documentation (hand skeleton, face detection, architecture, feature banner)
0.1.0 - 2026-05-24 #
Hand Detection #
- 8 built-in gestures via MediaPipe Gesture Recognizer (fist, open palm, peace, thumbs up/down, pointing up, I love you)
- 5 custom gestures via finger state pattern matching (ok, counting 1-5)
- User-defined custom gestures with wildcard support
- Per-gesture confidence filtering (allow/deny lists, per-gesture thresholds)
- 21 hand landmarks (normalized + world coordinates in meters)
- Per-finger state tracking (extended/closed)
- Hand bounding box (computed from landmarks)
- Hand motion velocity and direction tracking
- Two-hand interaction detection (pinch, clap, touching)
- World coordinate measurements (pinch distance, hand span in cm)
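Finger-state pattern matching with wildcards, as listed above, might look like the following pure-Dart sketch. The finger order (thumb, index, middle, ring, pinky) and the use of `null` as the wildcard are assumptions of this example, and `matchesFingerPattern` is a hypothetical name rather than the package's API.

```dart
/// Finger order assumed here: thumb, index, middle, ring, pinky.
/// Pattern entries: true = extended, false = closed, null = wildcard (either).
bool matchesFingerPattern(List<bool> fingerStates, List<bool?> pattern) {
  if (fingerStates.length != pattern.length) return false;
  for (var i = 0; i < pattern.length; i++) {
    final expected = pattern[i];
    // A null entry matches any state; otherwise the states must agree.
    if (expected != null && expected != fingerStates[i]) return false;
  }
  return true;
}
```

For instance, a "counting 2" pattern could be written as `[null, true, true, false, false]`, so that the gesture matches whether or not the thumb is tucked in.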
Face Detection #
- 7-class emotion classification (angry, disgusted, fearful, happy, sad, surprised, neutral) via ML Kit + TFLite
- 15 face contour types (full face mesh including cheek centers)
- 10 face landmark points (eyes, nose, mouth, ears, cheeks)
- Face tracking with stable IDs across frames
- Blink detection from eye open probability transitions
- Head nod/shake detection from Euler angle oscillations
- Face distance estimation from bounding box geometry (pinhole camera model)
- Attention scoring (eye openness + face orientation + head stability)
- Accurate detection mode (ML Kit PERFORMANCE_MODE_ACCURATE)
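The pinhole-camera distance estimate mentioned above follows from similar triangles: distance ≈ focal length (px) × real width ÷ width in the image (px). The sketch below shows that relation in Dart; `estimateFaceDistanceCm` is a hypothetical helper, and the 15 cm default face width is a placeholder assumption, not a value taken from the package.

```dart
/// Pinhole-camera estimate: distance = focalLengthPx * realWidthCm / boxWidthPx.
double estimateFaceDistanceCm({
  required double boxWidthPx,
  required double focalLengthPx,
  double realFaceWidthCm = 15.0, // assumed placeholder for an adult face width
}) {
  if (boxWidthPx <= 0 || focalLengthPx <= 0) {
    throw ArgumentError('boxWidthPx and focalLengthPx must be positive');
  }
  return focalLengthPx * realFaceWidthCm / boxWidthPx;
}
```

Because the estimate scales inversely with the box width, halving the face's apparent width doubles the estimated distance. Accuracy depends on how well the assumed face width and the focal length match the actual subject and camera.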
Performance #
- 25-30 FPS real-time processing on mid-range devices
- GPU acceleration with automatic CPU fallback
- Bitmap pooling for reduced GC pressure
- Emission throttling via `CameraConfig.maxResultsPerSecond`
- On-device processing: zero cloud dependencies
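One simple way to cap results per second is to enforce a minimum gap between emissions. The Dart sketch below illustrates that idea only; `EmissionThrottle` is a hypothetical class, and the changelog does not say how `maxResultsPerSecond` is actually implemented.

```dart
/// Lets at most [maxPerSecond] results through by enforcing a minimum gap.
class EmissionThrottle {
  EmissionThrottle(int maxPerSecond)
      : assert(maxPerSecond > 0),
        _minGap = Duration(microseconds: 1000000 ~/ maxPerSecond);

  final Duration _minGap;
  DateTime? _lastEmit;

  /// Returns true if a result arriving at [now] should be emitted.
  bool shouldEmit(DateTime now) {
    final last = _lastEmit;
    // Drop the result if the previous emission was too recent.
    if (last != null && now.difference(last) < _minGap) return false;
    _lastEmit = now;
    return true;
  }
}
```

Passing the timestamp in explicitly, instead of reading the clock inside the class, keeps the logic deterministic and easy to test. Dropped results are simply skipped here; a real pipeline might instead keep the latest one and emit it when the gap elapses.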
Platform Support #
- Android (Kotlin, CameraX, MediaPipe, ML Kit, TFLite)
- iOS (Swift, AVFoundation, MediaPipe, ML Kit, TFLite)
UI Package (vision_ai_flutter) #
- `VisionAiCameraView` composite widget with configurable overlays
- Hand landmark skeleton painter
- Hand bounding box painter
- Face bounding box painter
- Face contour painter (15 types)
- Gesture label and emotion indicator widgets
- Configurable overlay styles (colors, line widths)