pose_detection 3.1.0
pose_detection: ^3.1.0
Pose, person and landmark detection using on-device TFLite models.
pose_detection
Flutter plugin for on-device, multi-person pose detection and landmark estimation using TensorFlow Lite. Uses YOLOv8n for person detection and Google's BlazePose for 33-keypoint landmark extraction.
Quick Start #
import 'dart:io';
import 'dart:typed_data';
import 'package:pose_detection/pose_detection.dart';
Future<void> main() async {
// One-step construction and initialization
final PoseDetector detector = await PoseDetector.create(
mode: PoseMode.boxesAndLandmarks,
landmarkModel: PoseLandmarkModel.heavy,
);
// Load and detect from image bytes
final Uint8List imageBytes = await File('image.jpg').readAsBytes();
final List<Pose> results = await detector.detect(imageBytes);
// Access results
for (final Pose pose in results) {
final BoundingBox bbox = pose.boundingBox;
print('Bounding box: (${bbox.left}, ${bbox.top}) → (${bbox.right}, ${bbox.bottom})');
print('Size: ${bbox.width} x ${bbox.height}, center: (${bbox.center.x}, ${bbox.center.y})');
if (pose.hasLandmarks) {
// Iterate over landmarks
for (final PoseLandmark lm in pose.landmarks) {
print('${lm.type}: (${lm.x.toStringAsFixed(1)}, ${lm.y.toStringAsFixed(1)}) vis=${lm.visibility.toStringAsFixed(2)}');
}
// Access landmarks individually
// See "Pose Landmark Types" section in README for full list of landmarks
final PoseLandmark? leftKnee = pose.getLandmark(PoseLandmarkType.leftKnee);
if (leftKnee != null) {
print('Left knee visibility: ${leftKnee.visibility.toStringAsFixed(2)}');
}
}
}
// Clean up
await detector.dispose();
}
Alternatively, construct and initialize separately if you need to configure between steps:
final PoseDetector detector = PoseDetector();
await detector.initialize(
mode: PoseMode.boxesAndLandmarks,
landmarkModel: PoseLandmarkModel.heavy,
);
Refer to the sample code on the pub.dev example tab for a more in-depth example.
Pose Detection Modes #
This package supports two operation modes that determine what data is returned:
| Mode | Description | Output |
|---|---|---|
| boxesAndLandmarks (default) | Full two-stage detection (YOLO + BlazePose) | Bounding boxes + 33 landmarks |
| boxes | Fast YOLO-only detection | Bounding boxes only |
Use boxes-only mode for faster detection #
When you only need to detect where people are (without body landmarks), use PoseMode.boxes for better performance:
final PoseDetector detector = PoseDetector();
await detector.initialize(
mode: PoseMode.boxes, // Skip landmark detection
);
final List<Pose> results = await detector.detect(imageBytes);
for (final Pose pose in results) {
print('Person detected at: ${pose.boundingBox}');
print('Detection confidence: ${pose.score.toStringAsFixed(2)}');
// pose.hasLandmarks will be false
}
Bounding Boxes #
The boundingBox property returns a BoundingBox object representing the pose bounding box in absolute pixel coordinates. The BoundingBox provides convenient access to corner points, dimensions (width and height), and the center point.
Accessing Corners #
final BoundingBox boundingBox = pose.boundingBox;
// Access individual corners by name (each is a Point with x and y)
final Point topLeft = boundingBox.topLeft; // Top-left corner
final Point topRight = boundingBox.topRight; // Top-right corner
final Point bottomRight = boundingBox.bottomRight; // Bottom-right corner
final Point bottomLeft = boundingBox.bottomLeft; // Bottom-left corner
// Access coordinates
print('Top-left: (${topLeft.x}, ${topLeft.y})');
Additional Bounding Box Parameters #
final BoundingBox boundingBox = pose.boundingBox;
// Access dimensions and center
final double width = boundingBox.width; // Width in pixels
final double height = boundingBox.height; // Height in pixels
final Point center = boundingBox.center; // Center point
// Access coordinates
print('Size: ${width} x ${height}');
print('Center: (${center.x}, ${center.y})');
// Access all corners as a list (order: top-left, top-right, bottom-right, bottom-left)
final List<Point> allCorners = boundingBox.corners;
Pose Landmark Models #
Choose the model that fits your performance needs:
| Model | Speed | Accuracy |
|---|---|---|
| lite | Fastest | Good |
| full | Balanced | Better |
| heavy | Slowest | Best |
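For example, a common split is the lite model for live camera streams and the heavy model for one-off still images (a sketch using the PoseDetector.create API shown in Quick Start):
// Live camera: favor frame rate with the lite landmark model.
final liveDetector = await PoseDetector.create(
  landmarkModel: PoseLandmarkModel.lite,
);
// Still images: favor accuracy with the heavy landmark model.
final photoDetector = await PoseDetector.create(
  landmarkModel: PoseLandmarkModel.heavy,
);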
Pose Landmark Types #
Every pose contains up to 33 landmarks that align with the BlazePose specification:
- nose
- leftEyeInner
- leftEye
- leftEyeOuter
- rightEyeInner
- rightEye
- rightEyeOuter
- leftEar
- rightEar
- mouthLeft
- mouthRight
- leftShoulder
- rightShoulder
- leftElbow
- rightElbow
- leftWrist
- rightWrist
- leftPinky
- rightPinky
- leftIndex
- rightIndex
- leftThumb
- rightThumb
- leftHip
- rightHip
- leftKnee
- rightKnee
- leftAnkle
- rightAnkle
- leftHeel
- rightHeel
- leftFootIndex
- rightFootIndex
// Example: how to access specific landmarks
// PoseLandmarkType can be any of the 33 landmarks listed above.
final PoseLandmark? leftHip = pose.getLandmark(PoseLandmarkType.leftHip);
if (leftHip != null && leftHip.visibility > 0.5) {
// Pixel coordinates in original image space
print('Left hip position: (${leftHip.x}, ${leftHip.y})');
// Depth information (relative z-coordinate)
print('Left hip depth: ${leftHip.z}');
}
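Landmark coordinates feed directly into simple geometry. As an illustration (this helper is not part of the package API), the left knee angle can be estimated from the hip, knee, and ankle landmarks:
import 'dart:math' as math;
/// Illustrative helper (not part of the package): the angle at [mid] formed by
/// the segments mid->a and mid->b, in degrees, using 2D pixel coordinates.
double jointAngle(PoseLandmark a, PoseLandmark mid, PoseLandmark b) {
  final double angle1 = math.atan2(a.y - mid.y, a.x - mid.x);
  final double angle2 = math.atan2(b.y - mid.y, b.x - mid.x);
  double degrees = (angle1 - angle2).abs() * 180 / math.pi;
  if (degrees > 180) degrees = 360 - degrees;
  return degrees;
}
final PoseLandmark? hip = pose.getLandmark(PoseLandmarkType.leftHip);
final PoseLandmark? knee = pose.getLandmark(PoseLandmarkType.leftKnee);
final PoseLandmark? ankle = pose.getLandmark(PoseLandmarkType.leftAnkle);
if (hip != null && knee != null && ankle != null) {
  print('Left knee angle: ${jointAngle(hip, knee, ankle).toStringAsFixed(0)} degrees');
}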
Drawing Skeleton Connections #
The package provides poseLandmarkConnections, a predefined list of landmark pairs that form the body skeleton. Use this to draw skeleton overlays:
import 'package:flutter/material.dart';
import 'package:pose_detection/pose_detection.dart';
class PoseOverlayPainter extends CustomPainter {
final Pose pose;
PoseOverlayPainter(this.pose);
@override
void paint(Canvas canvas, Size size) {
final Paint paint = Paint()
..color = Colors.green
..strokeWidth = 3
..strokeCap = StrokeCap.round;
// Draw all skeleton connections
for (final connection in poseLandmarkConnections) {
final PoseLandmark? start = pose.getLandmark(connection[0]);
final PoseLandmark? end = pose.getLandmark(connection[1]);
// Only draw if both landmarks are visible
if (start != null && end != null &&
start.visibility > 0.5 && end.visibility > 0.5) {
canvas.drawLine(
Offset(start.x, start.y),
Offset(end.x, end.y),
paint,
);
}
}
// Draw landmark points
for (final landmark in pose.landmarks) {
if (landmark.visibility > 0.5) {
canvas.drawCircle(
Offset(landmark.x, landmark.y),
5,
Paint()..color = Colors.red,
);
}
}
}
@override
bool shouldRepaint(covariant CustomPainter oldDelegate) => true;
}
The poseLandmarkConnections constant contains 27 connections organized by body region:
- Face: Eyes to nose, eyes to ears, mouth
- Torso: Shoulders and hips forming the core
- Arms: Shoulders → elbows → wrists → fingers (left and right)
- Legs: Hips → knees → ankles → feet (left and right)
Built-in Overlay Painters #
The package ships two ready-to-use CustomPainter implementations:
| Class | Use case |
|---|---|
| MultiOverlayPainter | Still images: scales detection coordinates to fit the widget |
| CameraPoseOverlayPainter | Live camera preview: handles coordinate mapping and optional front-camera horizontal mirroring |
// Still image overlay
CustomPaint(
foregroundPainter: MultiOverlayPainter(results: poses),
child: Image.memory(imageBytes),
)
// Live camera overlay (front camera, mirrored)
CustomPaint(
foregroundPainter: CameraPoseOverlayPainter(
poses: poses,
cameraSize: Size(cameraWidth.toDouble(), cameraHeight.toDouble()),
mirrorHorizontally: isFrontCamera,
),
child: CameraPreview(controller),
)
Live Camera Detection #
For real-time pose detection with a camera feed, use detectFromCameraImage. It auto-detects YUV420 (NV12 / NV21 / I420) and desktop single-plane 4-channel layouts, and the color conversion (cvtColor), optional rotation, and maxDim downscale all run inside the detector's existing isolate on native platforms, so the UI thread is never blocked by OpenCV work.
import 'package:camera/camera.dart';
import 'package:pose_detection/pose_detection.dart';
final detector = await PoseDetector.create(
landmarkModel: PoseLandmarkModel.lite, // lite model for higher FPS
);
final cameras = await availableCameras();
final camera = CameraController(
cameras.first,
ResolutionPreset.medium,
enableAudio: false,
imageFormatGroup: ImageFormatGroup.yuv420,
);
await camera.initialize();
camera.startImageStream((CameraImage image) async {
final poses = await detector.detectFromCameraImage(
image,
// rotation: CameraFrameRotation.cw90, // based on device orientation
maxDim: 640, // optional in-isolate downscale before inference
);
// Process poses...
});
Tips for camera detection:
- detectFromCameraImage replaces the old packYuv420 + manual cv.cvtColor + cv.rotate dance in one call; no cv.Mat on the UI thread.
- Pass rotation: so the detector sees upright frames (Android back/front + device orientation logic); on iOS the camera plugin pre-rotates, so this is often null.
- Pass maxDim: (e.g. 640) to downscale in-isolate; the detection model internally resizes to 256px, so full-res frames just waste IPC bandwidth.
- For desktop single-plane frames, isBgra defaults to true for macOS camera frames. Pass isBgra: false for Linux RGBA frames.
- Use PoseLandmarkModel.lite for the fastest real-time performance.
- Mirror the overlay on the front camera to match CameraPreview's auto-mirrored texture.
- For advanced use (e.g. reusing a frame across multiple detectors), prepareCameraFrame(...) + detectFromCameraFrame(...) is the underlying two-step API.
See the full example app for a production implementation including orientation handling, mirror handling, and frame throttling.
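A common throttling pattern is a busy flag that drops frames while a detection is still in flight. A minimal sketch, reusing the detector and camera controller from the snippet above (the example app may structure this differently):
bool busy = false;
camera.startImageStream((CameraImage image) async {
  if (busy) return; // Drop this frame: the previous detection is still running.
  busy = true;
  try {
    final poses = await detector.detectFromCameraImage(image, maxDim: 640);
    // Update overlay state with the latest poses...
  } finally {
    busy = false;
  }
});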
Video Detection #
In addition to still images and live camera feeds, pose_detection supports frame-by-frame inference on video files. The example app includes a fully working VideoFileScreen that shows the end-to-end flow:
- Open the video with cv.VideoCapture.fromFile(path) (powered by opencv_dart).
- Read frames in a loop with cap.read(), passing each cv.Mat directly to detector.detectFromMat(frame).
- Draw results onto the same Mat (bounding boxes + skeleton overlay).
- Write the annotated frame to an output file with cv.VideoWriter, preserving the original FPS and resolution.
- Play back the result in-app with the video_player package.
final cap = cv.VideoCapture.fromFile(path);
final fps = cap.get(cv.CAP_PROP_FPS);
final width = cap.get(cv.CAP_PROP_FRAME_WIDTH).toInt();
final height = cap.get(cv.CAP_PROP_FRAME_HEIGHT).toInt();
final writer = cv.VideoWriter.fromFile(outPath, 'avc1', fps, (width, height));
cv.Mat? frame;
while (true) {
final (ok, mat) = cap.read(m: frame);
frame = mat;
if (!ok || frame.isEmpty) break;
final List<Pose> poses = await detector.detectFromMat(frame);
// draw poses on frame...
writer.write(frame);
}
cap.release();
writer.release();
The output is a standard H.264 MP4 with the pose overlay baked in. See VideoFileScreen in the example app for the full implementation including progress tracking, cancellation, temporal smoothing, and playback.
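Temporal smoothing is not part of the package API; one common approach, sketched here for a single tracked person, is an exponential moving average over landmark positions between frames:
// Illustrative smoother (not part of the package): blends each new landmark
// position with the previous frame's smoothed position.
final Map<PoseLandmarkType, (double, double)> smoothed = {};
const double alpha = 0.6; // Higher alpha follows new measurements more closely.
void smoothPose(Pose pose) {
  for (final PoseLandmark lm in pose.landmarks) {
    final prev = smoothed[lm.type];
    smoothed[lm.type] = prev == null
        ? (lm.x, lm.y)
        : (
            alpha * lm.x + (1 - alpha) * prev.$1,
            alpha * lm.y + (1 - alpha) * prev.$2,
          );
  }
}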
Notes:
- Video processing is CPU-bound and runs off the UI thread via the detector's isolate. The UI stays responsive.
- Use PoseLandmarkModel.lite or PoseLandmarkModel.full for a better speed/accuracy tradeoff when processing long videos.
- On Linux, GStreamer plugins are required to open MP4 files: sudo apt install gstreamer1.0-libav gstreamer1.0-plugins-good gstreamer1.0-plugins-bad.
Background Processing #
On native platforms, inference runs automatically in a background isolate: the UI thread is never blocked during detection or landmark extraction. On Flutter Web, inference runs asynchronously through the browser JavaScript/WebGPU/WASM runtime. No special configuration is needed; PoseDetector handles the platform-specific execution path internally.
Advanced Usage #
Multi-person detection #
The detector automatically handles multiple people in a single image:
final List<Pose> results = await detector.detect(imageBytes);
print('Detected ${results.length} people');
for (int i = 0; i < results.length; i++) {
final Pose pose = results[i];
print('Person ${i + 1}:');
print('Bounding box: ${pose.boundingBox}');
print('Confidence: ${pose.score.toStringAsFixed(2)}');
print('Landmarks: ${pose.landmarks.length}');
}
Interpreter Pool #
The detector maintains a pool of TensorFlow Lite interpreter instances for landmark extraction. Each interpreter adds ~10MB memory overhead.
final detector = PoseDetector();
await detector.initialize(
interpreterPoolSize: 3, // Number of interpreter instances
);
- Default pool size: 1
- When any hardware acceleration is active (auto, XNNPACK, or GPU), pool size is automatically forced to 1 to prevent thread contention
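For example, a larger pool only takes effect with acceleration disabled (a sketch, assuming initialize() accepts the same performanceConfig parameter as create()):
final detector = PoseDetector();
await detector.initialize(
  interpreterPoolSize: 3, // Honored because acceleration is disabled below
  performanceConfig: PerformanceConfig.disabled,
);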
Detect from a file path #
detectFromFilepath reads the file and delegates to detect. Native-only (uses dart:io).
final List<Pose> poses = await detector.detectFromFilepath('/path/to/image.jpg');
Detect from raw pixel bytes (zero-copy) #
detectFromMatBytes accepts raw pixel data without constructing a cv.Mat first. Bytes are transferred to the background isolate via TransferableTypedData with no copy. Useful when you already have decoded pixel data from another source.
final List<Pose> poses = await detector.detectFromMatBytes(
pixelBytes, // Raw BGR pixel data
width: imageWidth,
height: imageHeight,
matType: 16, // CV_8UC3 (default)
);
Web (Flutter Web) #
This package supports Flutter Web using the same package import:
import 'package:pose_detection/pose_detection.dart';
Two web runtimes are available, selectable per PoseDetector:
- LiteRT.js with WebGPU delegate (default). Google's official web runtime via flutter_litert ≥ 2.5.1. ~18× faster in real measurements (446 ms → 25 ms per call on the heavy BlazePose model with mixed single/multi-person images). Auto-loaded from CDN on first use, no web/index.html changes required. Prefers WebGPU; falls back to WASM automatically on unsupported browsers.
- tflite-js (CPU/WASM, legacy). Pass useLiteRt: false to opt into the previous default. No additional CDN scripts beyond those already loaded.
The main difference from native is how you load images:
- The Quick Start example above uses dart:io (File(...)), which is not available on web.
- On web, load an image as Uint8List (for example from a file picker, drag-and-drop, or network response) and call detect(imageBytes).
- detectFromMat(...) (OpenCV cv.Mat) is native-only and is not available on web.
- interpreterPoolSize and performanceConfig are accepted for API compatibility but are ignored on web.
final detector = await PoseDetector.create(
mode: PoseMode.boxesAndLandmarks,
landmarkModel: PoseLandmarkModel.heavy,
);
final List<Pose> poses = await detector.detect(imageBytes);
await detector.dispose();
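For instance, a sketch using the file_picker package (not part of this plugin) to obtain image bytes in the browser:
import 'package:file_picker/file_picker.dart';
import 'package:pose_detection/pose_detection.dart';
Future<List<Pose>> pickAndDetect(PoseDetector detector) async {
  // withData: true exposes the picked file's contents as Uint8List bytes.
  final result = await FilePicker.platform.pickFiles(
    type: FileType.image,
    withData: true,
  );
  final bytes = result?.files.single.bytes;
  if (bytes == null) return const <Pose>[];
  return detector.detect(bytes);
}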
Web (LiteRT.js + WebGPU, default) #
No extra configuration needed. LiteRT.js is the default runtime:
final detector = await PoseDetector.create(
mode: PoseMode.boxesAndLandmarks,
landmarkModel: PoseLandmarkModel.heavy,
// liteRtAccelerator defaults to 'auto': prefers WebGPU, falls back to WASM.
);
liteRtAccelerator accepts:
| Value | Behavior |
|---|---|
| 'auto' (default) | Try WebGPU; if compile fails (no navigator.gpu, or unsupported ops) fall back to WASM. |
| 'webgpu' | Force WebGPU; same compile-time fallback to WASM if anything fails. |
| 'wasm' | Force SIMD-optimized WASM. Use this to opt out of GPU even when available. |
The WASM fallback is still substantially faster than the legacy tflite-js path because LiteRT.js's WASM is SIMD-optimized.
To opt into the legacy tflite-js path, pass useLiteRt: false.
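Both switches are passed at construction time; for example (a sketch, assuming useLiteRt is accepted by create() like the other options in this README):
// Force the SIMD-optimized WASM backend of LiteRT.js (skip WebGPU entirely).
final wasmDetector = await PoseDetector.create(
  liteRtAccelerator: 'wasm',
);
// Opt back into the legacy tflite-js runtime.
final legacyDetector = await PoseDetector.create(
  useLiteRt: false,
);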
If you need to self-host the runtime (offline, strict CSP, or to pin a specific build), call flutter_litert's configureLiteRtLoader(moduleUrl: ..., wasmUrl: ...) before any PoseDetector.create, or set autoLoad: false and load it from your own <script> tag instead.
Benchmarks #
Heavy BlazePose model on macOS Chrome 147, 5 images, 10 timed iterations each, averaged over 2 runs (see runWebBenchmark.sh):
| Image | Detections | tflite-js (legacy) | LiteRT.js (WebGPU) | Speedup |
|---|---|---|---|---|
| pose1 | 1 | 357 ms | 20 ms | 17.8× |
| pose2 | 1 | 357 ms | 18 ms | 19.9× |
| pose3 | 2 | 430 ms | 23 ms | 18.7× |
| pose4 | 6 | 726 ms | 46 ms | 15.9× |
| pose5 | 1 | 360 ms | 17 ms | 20.7× |
| mean | | 446 ms | 25 ms | ~18× |
Detection counts are identical between the two runtimes on every image.
Separate example_web app #
The repository keeps the browser demo in example_web/ (separate from example/) because the web sample uses browser-specific APIs (HTML file picker + canvas overlay) and UI flow. The demo uses the default 'auto' accelerator (WebGPU with WASM fallback). Copy from example_web/lib/main.dart as a starting point.
Run the web demo locally:
cd example_web
flutter pub get
flutter run -d chrome
Build for web:
cd example_web
flutter build web
Performance #
Hardware Acceleration #
The package automatically selects the best acceleration strategy for each platform:
| Platform | Default Delegate | Speedup | Notes |
|---|---|---|---|
| macOS | XNNPACK | 2-5x | SIMD vectorization (NEON on ARM, AVX on x86) |
| Linux | XNNPACK | 2-5x | SIMD vectorization |
| iOS | XNNPACK for YOLO, Metal GPU for landmarks | 2-4x | Avoids YOLO Metal precision inconsistencies while keeping GPU acceleration for landmarks |
| Android | XNNPACK | 2-5x | ARM NEON SIMD acceleration |
| Windows | XNNPACK | 2-5x | SIMD vectorization (AVX on x86) |
No configuration is needed: just call initialize() and you get optimal performance for your platform.
Advanced Performance Configuration #
// Auto mode (default), optimal for each platform
await detector.initialize();
// Force XNNPACK (all native platforms)
final detector = await PoseDetector.create(
performanceConfig: PerformanceConfig.xnnpack(numThreads: 4),
);
// Force GPU delegate
final detector = await PoseDetector.create(
performanceConfig: PerformanceConfig.gpu(),
);
// CPU-only (maximum compatibility)
final detector = await PoseDetector.create(
performanceConfig: PerformanceConfig.disabled,
);
Migration Guide #
3.0.0 breaking changes #
Configuration moved from constructor to initialize()
Configuration parameters are no longer accepted by PoseDetector(...). Use the no-argument constructor plus initialize(...), or keep using PoseDetector.create(...) for one-step construction.
// Before (2.x)
final detector = PoseDetector(
mode: PoseMode.boxesAndLandmarks,
landmarkModel: PoseLandmarkModel.heavy,
);
// After (3.0)
final detector = PoseDetector();
await detector.initialize(
mode: PoseMode.boxesAndLandmarks,
landmarkModel: PoseLandmarkModel.heavy,
);
// Or one step
final detector = await PoseDetector.create(
mode: PoseMode.boxesAndLandmarks,
landmarkModel: PoseLandmarkModel.heavy,
);
detectFromMat signature changed
The imageWidth and imageHeight named arguments have been removed. Dimensions are now read directly from the Mat.
// Before (2.x)
final poses = await detector.detectFromMat(
mat,
imageWidth: mat.cols,
imageHeight: mat.rows,
);
// After (3.0)
final poses = await detector.detectFromMat(mat);
Native detect(...) decode failures now throw
On native platforms, undecodable image bytes now propagate as an error instead of returning an empty list. Wrap detect(...) in a try/catch if your 2.x call site depended on silent failure. On web, decode failure still returns an empty list because browser image decode failure does not throw.
try {
final poses = await detector.detect(imageBytes);
// Process poses...
} on FormatException {
// Handle invalid or unsupported image bytes.
}
Platform note: repeated initialize() calls #
Native detectors throw StateError if initialize() is called twice without dispose(). The web detector disposes existing models and reinitializes.
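If you need to reconfigure a native detector, dispose it first and then create a new instance; a minimal sketch:
// Reconfiguring on native platforms: dispose before initializing again,
// otherwise the second initialize() throws a StateError.
PoseDetector detector = await PoseDetector.create(
  mode: PoseMode.boxesAndLandmarks,
);
// ...later, to switch configuration:
await detector.dispose();
detector = await PoseDetector.create(
  mode: PoseMode.boxes,
);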