VisionFlow #

A modular and production-ready Flutter plugin for real-time sign language and gesture recognition using MediaPipe and custom ML models (PyTorch and TFLite).

Features #

Dynamic Detection: Configure exactly what to detect (Hands, Face, Pose).
Dual Backends: Seamlessly switch between PyTorch and TFLite models for inference.
Optimized Feature Extraction: Automatically extracts and normalizes 330 features per frame (2 × 21 hand landmarks and 68 face landmarks, each with 3 coordinates) over a configurable sequence length, so the output matches the input shape your model expects.
Real-Time Streaming: Push live camera frames into the plugin and listen for prediction events.

Installation #

Add vision_flow to your pubspec.yaml:

dependencies:
  vision_flow: ^0.0.1

Android Configuration #

Ensure that your android/build.gradle declares the required repositories and that compileSdkVersion is at least 34 (36 recommended).

Usage #

1. Load the Model #

Before starting the pipeline, load your pre-trained model. Place the model file (.pt or .tflite) in the assets/ folder and declare it under the flutter: assets: section of your pubspec.yaml.

import 'package:vision_flow/vision_flow.dart';

// Load PyTorch Model
await VisionFlow.loadModel(
  path: "model.pt",
  backend: VisionFlowModelType.pytorch,
);

// OR Load TFLite Model
await VisionFlow.loadModel(
  path: "model.tflite",
  backend: VisionFlowModelType.tflite,
);

2. Configure the Pipeline #

Define what the pipeline should detect and the sequence length expected by your model:

await VisionFlow.configure(
  hands: true,
  face: true,
  pose: false,
  sequenceLength: 30, // 30 frames per prediction sequence
);
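To illustrate what sequenceLength means for your model input, here is a minimal sketch of a fixed-length frame window. It is not VisionFlow's internal buffer: the SequenceWindow class is hypothetical, and the sliding behavior (one batch per new frame once the window is full) is an assumption made for illustration only.

```dart
/// Hypothetical sliding window over per-frame feature vectors.
/// Illustrates the (1, sequenceLength, featureCount) input shape;
/// VisionFlow's own buffering strategy may differ.
class SequenceWindow {
  SequenceWindow(this.sequenceLength, this.featureCount)
      : assert(sequenceLength > 0 && featureCount > 0);

  final int sequenceLength;
  final int featureCount;
  final List<List<double>> _frames = [];

  /// Adds one frame. Returns a batch shaped
  /// [1][sequenceLength][featureCount] once enough frames have been
  /// collected, otherwise null.
  List<List<List<double>>>? add(List<double> features) {
    if (features.length != featureCount) {
      throw ArgumentError(
          'expected $featureCount features, got ${features.length}');
    }
    _frames.add(List<double>.of(features));
    if (_frames.length > sequenceLength) {
      _frames.removeAt(0); // drop the oldest frame
    }
    if (_frames.length < sequenceLength) return null;
    return [List<List<double>>.of(_frames)];
  }
}
```

With sequenceLength: 30 and 330 features per frame, the first 29 calls to add would return null and the 30th would return a batch of shape (1, 30, 330).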

3. Listen to Predictions #

Subscribe to the prediction stream:

VisionFlow.predictions.listen((result) {
  print("Predicted Label: ${result.label} (Index: ${result.index})");
});
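Raw per-sequence predictions can flicker between labels. One common way to steady them is to act on a label only after it repeats several times in a row. The sketch below assumes that approach; LabelStabilizer is a hypothetical helper and is not part of VisionFlow.

```dart
/// Hypothetical helper: reports a label only when it has been predicted
/// [minRun] times in a row, and only once per run.
class LabelStabilizer {
  LabelStabilizer({this.minRun = 3}) : assert(minRun > 0);

  final int minRun;
  String? _current;
  int _run = 0;

  /// Returns [label] at the moment it completes a run of [minRun]
  /// identical predictions; returns null otherwise.
  String? push(String label) {
    if (label == _current) {
      _run++;
    } else {
      _current = label;
      _run = 1;
    }
    return _run == minRun ? label : null;
  }
}
```

Inside the listen callback you would call push(result.label) and update the UI only when it returns a non-null value.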

4. Process Camera Frames #

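Push raw YUV frames from the camera plugin directly into VisionFlow. Here, image is the frame object delivered by the camera plugin's image stream:

await VisionFlow.processFrame(
  // Plane 0 is luma (Y); planes 1 and 2 are chroma (U and V).
  y: image.planes[0].bytes,
  u: image.planes[1].bytes,
  v: image.planes[2].bytes,
  width: image.width,
  height: image.height,
  yRowStride: image.planes[0].bytesPerRow,
  uvRowStride: image.planes[1].bytesPerRow,
  // bytesPerPixel is nullable in the camera plugin and is null on iOS.
  uvPixelStride: image.planes[1].bytesPerPixel!,
);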
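A camera stream can deliver frames faster than they can be processed. A minimal sketch of one way to handle this, assuming you prefer to drop frames rather than queue them, is shown below. FrameGate is a hypothetical helper, not part of VisionFlow or the camera plugin; the callback you pass in would wrap your own call to VisionFlow.processFrame.

```dart
/// Hypothetical helper: lets one frame through at a time and drops any
/// frame that arrives while the previous one is still being processed.
class FrameGate<T> {
  FrameGate(this._process, {this.onError});

  final Future<void> Function(T frame) _process;
  final void Function(Object error, StackTrace stack)? onError;

  bool _busy = false;

  /// Number of frames dropped because processing was still in flight.
  int dropped = 0;

  /// Returns true if the frame was accepted, false if it was dropped.
  bool offer(T frame) {
    if (_busy) {
      dropped++;
      return false;
    }
    _busy = true;
    Future<void>.sync(() => _process(frame)).then<void>(
      (_) {},
      onError: (Object e, StackTrace s) {
        onError?.call(e, s); // report the failure instead of crashing
      },
    ).whenComplete(() {
      _busy = false; // ready for the next frame
    });
    return true;
  }
}
```

In the image stream callback you would call gate.offer(image) instead of awaiting processFrame directly, so slow frames never pile up behind each other.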

5. Cleanup #

Always dispose of the plugin resources when done:

await VisionFlow.dispose();

Advanced: Custom Models #

VisionFlow extracts raw landmark coordinates and normalizes them frame by frame, centering each frame on the nose-tip landmark of the face. The FrameBuffer then assembles a tensor of shape (1, SequenceLength, 330) and passes it to your model. Each frame's 330 features consist of:

Right Hand: 63 features (21 points * 3 coords)
Left Hand: 63 features (21 points * 3 coords)
Face: 204 features (68 points * 3 coords)
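When preparing training data for a custom model, it can help to reproduce this layout offline. The sketch below is an illustration under stated assumptions, not VisionFlow's implementation: it assumes the features are concatenated in the order listed above (right hand, left hand, face), that centering means subtracting the nose-tip coordinates from every point, and that the nose-tip index within the 68 face points is supplied by the caller. The function and parameter names are hypothetical.

```dart
const int handPoints = 21;
const int facePoints = 68;
const int coordsPerPoint = 3;

// (21 + 21 + 68) points * 3 coords = 330 features per frame.
const int featuresPerFrame = (2 * handPoints + facePoints) * coordsPerPoint;

/// Hypothetical sketch: flattens one frame of landmarks into a single
/// feature vector, centered on the face point at [noseIndex].
/// Each point is a list of at least three values: [x, y, z].
List<double> buildFrameFeatures({
  required List<List<double>> rightHand,
  required List<List<double>> leftHand,
  required List<List<double>> face,
  required int noseIndex, // assumption: depends on the face model used
}) {
  if (rightHand.length != handPoints ||
      leftHand.length != handPoints ||
      face.length != facePoints) {
    throw ArgumentError('unexpected number of landmarks');
  }
  final nose = face[noseIndex];
  final out = <double>[];
  for (final group in [rightHand, leftHand, face]) {
    for (final point in group) {
      for (var i = 0; i < coordsPerPoint; i++) {
        out.add(point[i] - nose[i]); // center on the nose tip
      }
    }
  }
  return out;
}
```

The returned list always has featuresPerFrame entries, and the nose tip itself maps to (0, 0, 0). Stacking SequenceLength such vectors gives one model input of shape (1, SequenceLength, 330).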
