dart_tensor_preprocessing 0.5.1

High-performance tensor preprocessing library for Flutter/Dart. NumPy-like transforms pipeline for ONNX Runtime inference.

dart_tensor_preprocessing #

Tensor preprocessing library for Flutter/Dart. NumPy-like transforms pipeline for ONNX Runtime, TFLite, and other AI inference engines.

Features #

  • PyTorch Compatible: Matches PyTorch/torchvision tensor operations
  • Non-blocking: Isolate-based async execution prevents UI jank
  • Type-safe: ONNX-compatible tensor types (Float32, Int64, Uint8, etc.)
  • Zero-copy: View/stride manipulation for reshape/transpose operations
  • Declarative: Chain operations into reusable pipelines

Installation #

dependencies:
  dart_tensor_preprocessing: ^0.5.1

Quick Start #

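import 'dart:typed_data';

import 'package:dart_tensor_preprocessing/dart_tensor_preprocessing.dart';

// Create a tensor from image data (HWC format, Uint8).
// height, width, and channels describe the decoded image; the preset below outputs 3 channels.
final imageData = Uint8List.fromList([/* RGB pixel data, length == height * width * channels */]);
final tensor = TensorBuffer.fromUint8List(imageData, [height, width, channels]);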

// Use a preset pipeline for ImageNet models
final pipeline = PipelinePresets.imagenetClassification();
final result = await pipeline.runAsync(tensor);

// result.shape: [1, 3, 224, 224] (NCHW, Float32, normalized)

Pipeline Presets #

| Preset | Output Shape | Use Case |
| --- | --- | --- |
| imagenetClassification() | [1, 3, 224, 224] | ResNet, VGG, etc. |
| objectDetection() | [1, 3, 640, 640] | YOLO, SSD |
| faceRecognition() | [1, 3, 112, 112] | ArcFace, FaceNet |
| clip() | [1, 3, 224, 224] | CLIP models |
| mobileNet() | [1, 3, 224, 224] | MobileNet family |

Custom Pipeline #

final pipeline = TensorPipeline([
  ResizeOp(height: 224, width: 224),
  ToTensorOp(normalize: true),  // HWC -> CHW, scale to [0,1]
  NormalizeOp.imagenet(),       // ImageNet mean/std
  UnsqueezeOp.batch(),          // Add batch dimension
]);

// Sync execution
final syncResult = pipeline.run(input);

// Async execution (runs in isolate)
final asyncResult = await pipeline.runAsync(input);

// Async with custom isolate threshold (default: 100,000 elements)
// Small tensors skip isolate overhead and run synchronously
final tunedResult = await pipeline.runAsync(input, isolateThreshold: 50000);
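To make the two middle steps of the custom pipeline concrete, the following is a minimal sketch in plain Dart of the arithmetic they are described as performing: HWC uint8 pixels are rearranged to CHW, scaled to [0, 1], and then normalized per channel. It does not use the package; the helper name `hwcToNormalizedChw` is hypothetical, and the mean/std values are the ImageNet statistics commonly used with torchvision models, assumed here to match `NormalizeOp.imagenet()`.

```dart
import 'dart:typed_data';

// ImageNet channel statistics (RGB), as commonly used with torchvision models.
// Assumed to match NormalizeOp.imagenet(); not taken from the package source.
const List<double> imagenetMean = [0.485, 0.456, 0.406];
const List<double> imagenetStd = [0.229, 0.224, 0.225];

/// Converts HWC uint8 pixels to CHW float32, scales to [0, 1], then applies
/// (x - mean) / std per channel. Hypothetical helper, not the package API.
Float32List hwcToNormalizedChw(
  Uint8List hwc,
  int height,
  int width, {
  List<double> mean = imagenetMean,
  List<double> std = imagenetStd,
}) {
  final channels = mean.length;
  final expected = height * width * channels;
  if (hwc.length != expected) {
    throw ArgumentError('Expected $expected bytes, got ${hwc.length}');
  }
  final out = Float32List(expected);
  for (var y = 0; y < height; y++) {
    for (var x = 0; x < width; x++) {
      for (var c = 0; c < channels; c++) {
        final src = (y * width + x) * channels + c; // HWC offset
        final dst = (c * height + y) * width + x; // CHW offset
        out[dst] = (hwc[src] / 255.0 - mean[c]) / std[c];
      }
    }
  }
  return out;
}
```

Adding the batch dimension afterwards changes only the shape metadata (from [3, H, W] to [1, 3, H, W]); the values produced above stay where they are.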

Available Operations #

Resize & Crop #

  • ResizeOp - Resize to fixed dimensions (nearest, bilinear, bicubic)
  • ResizeShortestOp - Resize preserving aspect ratio
  • CenterCropOp - Center crop to fixed dimensions
  • ClipOp - Element-wise value clamping (presets: unit, symmetric, uint8)
  • PadOp - Padding with multiple modes (constant, reflect, replicate, circular)
  • SliceOp - Python-like tensor slicing with negative index support
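The padding modes differ only in which source element each padded position reads. The sketch below shows that mapping for one dimension, assuming `reflect` follows the PyTorch convention of mirroring without repeating the edge element; the function names are hypothetical and this is not the package's implementation.

```dart
/// Source index for reflect padding. `i` is relative to the unpadded data,
/// so it may be negative or >= n. For n = 4: -1 -> 1, -2 -> 2, 4 -> 2.
int reflectIndex(int i, int n) {
  if (n == 1) return 0;
  final period = 2 * (n - 1);
  final m = i % period; // Dart's % is non-negative for a positive divisor.
  return m < n ? m : period - m;
}

/// Source index for replicate padding: clamp to the nearest edge element.
int replicateIndex(int i, int n) => i < 0 ? 0 : (i >= n ? n - 1 : i);

/// Source index for circular padding: wrap around.
int circularIndex(int i, int n) => i % n;

/// Pads a 1-D list using one of the index mappings above.
List<double> pad1d(
  List<double> data,
  int left,
  int right,
  int Function(int i, int n) sourceIndex,
) =>
    [
      for (var i = -left; i < data.length + right; i++)
        data[sourceIndex(i, data.length)],
    ];
```

Padding `[1, 2, 3, 4]` by two on each side gives `[3, 2, 1, 2, 3, 4, 3, 2]` with `reflectIndex`, `[1, 1, 1, 2, 3, 4, 4, 4]` with `replicateIndex`, and `[3, 4, 1, 2, 3, 4, 1, 2]` with `circularIndex`.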

Normalization #

  • NormalizeOp - Channel-wise normalization (presets: ImageNet, CIFAR-10, symmetric)
  • ScaleOp - Scale values (e.g., [0-255] to [0-1])
  • BatchNormOp - Batch normalization for CNN inference (PyTorch compatible)
  • LayerNormOp - Layer normalization for Transformer inference (presets: BERT, BERT-Large)

Layout #

  • PermuteOp - Axis reordering (e.g., HWC to CHW)
  • ToTensorOp - HWC uint8 to CHW float32 with optional scaling
  • ToImageOp - CHW float32 to HWC uint8

Data Augmentation #

  • RandomCropOp - Random cropping with deterministic seed support
  • GaussianBlurOp - Gaussian blur using separable convolution

Utility #

  • concat() - Concatenates tensors along specified axis

Shape #

  • UnsqueezeOp - Add dimension
  • SqueezeOp - Remove size-1 dimensions
  • ReshapeOp - Reshape tensor (supports -1 to infer one dimension from the element count)
  • FlattenOp - Flatten dimensions
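As an illustration of how a `-1` entry in a reshape target can be resolved, here is a small sketch in plain Dart. The function `resolveShape` is hypothetical and only mirrors the usual NumPy/PyTorch rule (at most one `-1`, filled in so the element count is preserved); the package's own validation may differ.

```dart
/// Resolves a single -1 entry in [target] so that its element count matches
/// the element count of [shape]. Hypothetical helper, not the package API.
List<int> resolveShape(List<int> shape, List<int> target) {
  final total = shape.fold<int>(1, (a, b) => a * b);
  final unknown = target.indexOf(-1);
  if (unknown != target.lastIndexOf(-1)) {
    throw ArgumentError('Only one dimension may be -1');
  }
  final known = target.where((d) => d != -1).fold<int>(1, (a, b) => a * b);
  if (unknown == -1) {
    // No placeholder: the target must already hold the same element count.
    if (known != total) throw ArgumentError('Element count mismatch');
    return List.of(target);
  }
  if (known == 0 || total % known != 0) {
    throw ArgumentError('Cannot infer -1 for $target from $shape');
  }
  final resolved = List.of(target);
  resolved[unknown] = total ~/ known;
  return resolved;
}
```

With this rule, reshaping `[1, 3, 224, 224]` to `[1, -1]` resolves to `[1, 150528]`.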

Type #

  • TypeCastOp - Convert between data types

Core Classes #

TensorBuffer #

Tensor with shape and stride metadata over physical storage.

// Create tensors
final zeros = TensorBuffer.zeros([3, 224, 224]);
final ones = TensorBuffer.ones([3, 224, 224], dtype: DType.float32);
final fromData = TensorBuffer.fromFloat32List(data, [3, 224, 224]);

// Access elements
final value = tensor[[0, 100, 100]];

// Zero-copy operations
final transposed = tensor.transpose([2, 0, 1]);  // Changes strides only
final squeezed = tensor.squeeze();

// Copy operations
final contiguous = tensor.contiguous();  // Force contiguous memory
final cloned = tensor.clone();
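The idea behind "changes strides only" can be sketched without the package. In the hypothetical `StridedView` class below, a view is just shared storage plus shape and stride lists; reordering axes permutes the two lists and leaves the storage untouched. This is an illustration of the general technique, not TensorBuffer's actual internals.

```dart
import 'dart:typed_data';

/// A hypothetical minimal view: shared storage plus shape/stride metadata.
class StridedView {
  StridedView(this.data, this.shape, this.strides);

  final Float32List data;
  final List<int> shape;
  final List<int> strides; // Counted in elements, not bytes.

  /// Reads one element by multi-dimensional index.
  double at(List<int> index) {
    var offset = 0;
    for (var i = 0; i < index.length; i++) {
      offset += index[i] * strides[i];
    }
    return data[offset];
  }

  /// Reorders axes by permuting metadata only; `data` is shared, not copied.
  StridedView permute(List<int> axes) => StridedView(
        data,
        [for (final a in axes) shape[a]],
        [for (final a in axes) strides[a]],
      );
}
```

For a `[2, 3]` view with strides `[3, 1]`, `permute([1, 0])` yields shape `[3, 2]` with strides `[1, 3]`, and `at([2, 1])` on the result reads the same storage slot as `at([1, 2])` on the original. A contiguous copy is only needed when a consumer requires elements laid out in index order.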

DType #

ONNX-compatible data types with onnxId for runtime integration.

DType.float32  // ONNX ID: 1
DType.int64    // ONNX ID: 7
DType.uint8    // ONNX ID: 2

BufferPool #

Memory pooling for buffer reuse, reducing GC pressure in hot paths.

final pool = BufferPool.instance;

// Acquire buffer (reuses from pool if available)
final buffer = pool.acquireFloat32(1000);

// ... use buffer ...

// Release back to pool for reuse
pool.release(buffer);

// Monitor pool usage
print('Pooled: ${pool.pooledCount} buffers, ${pool.pooledBytes} bytes');
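For readers unfamiliar with buffer pooling, the sketch below shows one simple way such a pool can work: released buffers are kept in per-length buckets and handed out again on the next request of the same length. The class `SimpleFloat32Pool` is hypothetical and much simpler than a production pool (no size cap, exact-length matching only); it is not the package's BufferPool.

```dart
import 'dart:typed_data';

/// Hypothetical size-keyed pool of Float32List buffers.
class SimpleFloat32Pool {
  final Map<int, List<Float32List>> _free = {};

  /// Returns a pooled buffer of exactly [length] elements, or allocates one.
  /// Reused buffers keep their old contents, so callers must overwrite them.
  Float32List acquire(int length) {
    final bucket = _free[length];
    if (bucket != null && bucket.isNotEmpty) {
      return bucket.removeLast();
    }
    return Float32List(length);
  }

  /// Hands a buffer back so a later acquire of the same length can reuse it.
  void release(Float32List buffer) {
    (_free[buffer.length] ??= []).add(buffer);
  }

  /// Number of buffers currently waiting in the pool.
  int get pooledCount =>
      _free.values.fold<int>(0, (sum, bucket) => sum + bucket.length);
}
```

The benefit comes from hot paths that repeatedly need same-sized scratch buffers, such as per-frame preprocessing, where reuse avoids a fresh allocation on every call. A buffer must not be used after it has been released.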

Zero-Copy View Operations #

TensorBuffer extension methods for zero-copy tensor manipulation:

// Slice along first dimension (batch slicing)
final batch = tensor.sliceFirst(2, 5);  // Views elements 2..4

// Split tensor into views
final items = tensor.unbind(0);  // List of views along dim 0

// Select single index (reduces rank)
final first = tensor.select(0, 0);  // First item, shape reduced

// Narrow dimension
final narrowed = tensor.narrow(0, 1, 3);  // 3 elements starting at 1

// Format conversion without copying
final nhwc = nchwTensor.toChannelsLast();   // NCHW -> NHWC view
final nchw = nhwcTensor.toChannelsFirst();  // NHWC -> NCHW view

// Flatten to 1D view
final flat = tensor.flatten();

Memory Formats #

| Format | Layout | Strides (for [1,3,224,224]) |
| --- | --- | --- |
| contiguous | NCHW | [150528, 50176, 224, 1] |
| channelsLast | NHWC | [150528, 1, 672, 3] |
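The stride values listed for the two memory formats follow directly from the shape. As a sketch (function names are hypothetical), both sets of strides are expressed in the logical N, C, H, W axis order, which is why the channels-last strides look "out of order":

```dart
/// Strides, in elements, for a logical [N, C, H, W] shape stored contiguously
/// (physical order N, C, H, W, so width varies fastest).
List<int> nchwStrides(List<int> shape) {
  final c = shape[1], h = shape[2], w = shape[3];
  return [c * h * w, h * w, w, 1];
}

/// Strides for the same logical [N, C, H, W] shape stored channels-last
/// (physical order N, H, W, C, so channels vary fastest).
List<int> channelsLastStrides(List<int> shape) {
  final c = shape[1], h = shape[2], w = shape[3];
  return [h * w * c, 1, w * c, c];
}
```

For `[1, 3, 224, 224]` these return `[150528, 50176, 224, 1]` and `[150528, 1, 672, 3]` respectively: stepping one channel moves 50176 elements in the contiguous layout but only 1 element in channels-last.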

PyTorch Compatibility #

This library is designed to produce identical results to PyTorch/torchvision operations:

| Operation | PyTorch Equivalent |
| --- | --- |
| TensorBuffer.zeros() | torch.zeros() |
| TensorBuffer.ones() | torch.ones() |
| tensor.transpose() | tensor.permute() |
| tensor.reshape() | tensor.reshape() |
| tensor.squeeze() | tensor.squeeze() |
| tensor.unsqueeze() | tensor.unsqueeze() |
| tensor.sum() / sumAxis() | tensor.sum() |
| tensor.mean() / meanAxis() | tensor.mean() |
| tensor.min() / max() | tensor.min() / max() |
| NormalizeOp.imagenet() | transforms.Normalize(mean, std) |
| ResizeOp(mode: bilinear) | F.interpolate(mode='bilinear') |
| ToTensorOp() | transforms.ToTensor() |
| ClipOp(min, max) | torch.clamp(min, max) |
| PadOp(mode: reflect) | F.pad(mode='reflect') |
| SliceOp([(start, end, step)]) | tensor[start:end:step] |
| concat(tensors, axis) | torch.cat(tensors, dim) |
| RandomCropOp | transforms.RandomCrop() |
| GaussianBlurOp | transforms.GaussianBlur() |
| AddOp / SubOp | torch.add() / torch.sub() |
| MulOp / DivOp | torch.mul() / torch.div() |
| PowOp | torch.pow() |
| AbsOp / NegOp | torch.abs() / torch.neg() |
| SqrtOp / ExpOp / LogOp | torch.sqrt() / exp() / log() |
| ReLUOp / LeakyReLUOp | F.relu() / F.leaky_relu() |
| SigmoidOp / TanhOp | torch.sigmoid() / torch.tanh() |
| SoftmaxOp | F.softmax() |
| BatchNormOp | torch.nn.BatchNorm2d (inference) |
| LayerNormOp | torch.nn.LayerNorm |
| TensorBuffer.full() | torch.full() |
| TensorBuffer.random() | torch.rand() |
| TensorBuffer.randn() | torch.randn() |
| TensorBuffer.eye() | torch.eye() |
| TensorBuffer.linspace() | torch.linspace() |
| TensorBuffer.arange() | torch.arange() |
| tensor.select(dim, index) | tensor.select(dim, index) |
| tensor.narrow(dim, start, len) | tensor.narrow(dim, start, len) |
| tensor.unbind(dim) | tensor.unbind(dim) |
| tensor.flatten() | tensor.flatten() |

Performance Benchmarks #

Run benchmarks with dart run benchmark/run_all.dart.

Zero-Copy Operations (O(1)) #

| Operation | Time | Ops/sec |
| --- | --- | --- |
| transpose() | ~1µs | 700K+ |
| reshape() | ~1µs | 1.6M+ |
| squeeze() | <1µs | 3.2M+ |
| unsqueeze() | ~1µs | 780K+ |

Pipeline Performance #

| Pipeline | Input Shape | Time |
| --- | --- | --- |
| Simple (Normalize + Unsqueeze) | [3, 224, 224] | ~3.4ms |
| ImageNet Classification | [3, 224, 224] | ~3.0ms |
| Object Detection | [3, 640, 640] | ~25ms |

Sync vs Async #

| Execution | 224x224 | 640x640 |
| --- | --- | --- |
| run() (sync) | ~3.5ms | ~29ms |
| runAsync() (isolate) | ~11ms | ~93ms |
| Isolate overhead | ~7ms | ~64ms |

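Note: In these measurements runAsync() takes roughly three times as long as run() because of isolate overhead. Prefer runAsync() for large tensors or when UI responsiveness is critical, and run() when total latency matters most.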
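The trade-off between isolate overhead and UI responsiveness is what an element-count threshold addresses. The sketch below shows the general pattern using `Isolate.run` from `dart:isolate` (available since Dart 2.19, not supported on the web). The `scale` function stands in for a real pipeline, and `scaleAsync` is a hypothetical helper; whether the package compares with `<` or `<=`, or spawns isolates this way, is an assumption.

```dart
import 'dart:isolate';
import 'dart:typed_data';

/// Scales every element by [factor]; stands in for a real pipeline.
Float32List scale(Float32List input, double factor) {
  final out = Float32List(input.length);
  for (var i = 0; i < input.length; i++) {
    out[i] = input[i] * factor;
  }
  return out;
}

/// Runs [scale] inline for small inputs and in a short-lived isolate
/// otherwise. Hypothetical helper illustrating threshold-based dispatch.
Future<Float32List> scaleAsync(
  Float32List input,
  double factor, {
  int isolateThreshold = 100000,
}) async {
  if (input.length < isolateThreshold) {
    // Small input: spawning an isolate would cost more than the work itself.
    return scale(input, factor);
  }
  // Large input: the captured arguments are copied to the new isolate,
  // and the calling isolate stays free while the work runs.
  return Isolate.run(() => scale(input, factor));
}
```

Lowering the threshold moves more work off the calling isolate at the cost of extra spawn overhead per call; raising it does the opposite.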

Requirements #

  • Dart SDK ^3.0.0

License #

MIT

Publisher: brodykim.work (verified publisher)

Dependencies

image