dart_tensor_preprocessing 0.5.0
High-performance tensor preprocessing library for Flutter/Dart. NumPy-like transforms pipeline for ONNX Runtime inference.
Changelog #
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
0.5.0 - 2026-01-10 #
Added #
- `BatchNormOp` - Batch normalization for CNN inference (`batch_norm_op.dart`):
  - Full PyTorch-compatible `torch.nn.BatchNorm2d` implementation
  - Pre-computed scale/shift coefficients for efficient inference: `y = x * scale + shift`
  - Supports 3D `[C,H,W]` and 4D `[N,C,H,W]` tensors
  - `BatchNormOp.fromStateDict()` factory for loading PyTorch weights
  - Dtype-specialized loops for Float32/Float64
  - In-place support via `applyInPlace()`
- `LayerNormOp` - Layer normalization for Transformer inference (`layer_norm_op.dart`):
  - Full PyTorch-compatible `torch.nn.LayerNorm` implementation
  - Normalizes over last N dimensions (e.g., `[768]` for BERT)
  - Welford's algorithm for numerically stable mean/variance computation
  - `LayerNormOp.bert()` and `LayerNormOp.bertLarge()` factory presets
  - `LayerNormOp.fromStateDict()` factory for loading PyTorch weights
  - Dtype-specialized loops for Float32/Float64
  - In-place support via `applyInPlace()`
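As a rough illustration of the Welford-based normalization mentioned for `LayerNormOp`, the sketch below normalizes each contiguous row of a flat `Float32List` over its last dimension. It is not the package's implementation: the function name `layerNormInPlace`, its parameters, and the default `eps` of `1e-5` (the `torch.nn.LayerNorm` default) are assumptions made for this example.

```dart
import 'dart:math' as math;
import 'dart:typed_data';

/// Hypothetical helper (not the package API): layer-normalizes every
/// contiguous row of length [normSize] in place.
/// [gamma] and [beta] are the element-wise affine weight and bias.
void layerNormInPlace(
  Float32List data,
  int normSize,
  Float32List gamma,
  Float32List beta, {
  double eps = 1e-5,
}) {
  if (data.length % normSize != 0) {
    throw ArgumentError('data length must be a multiple of normSize');
  }
  for (var start = 0; start < data.length; start += normSize) {
    // Welford's single-pass update of the running mean and the sum of
    // squared deviations (m2).
    var mean = 0.0;
    var m2 = 0.0;
    for (var i = 0; i < normSize; i++) {
      final x = data[start + i];
      final delta = x - mean;
      mean += delta / (i + 1);
      m2 += delta * (x - mean);
    }
    // LayerNorm uses the biased (population) variance: m2 / n.
    final invStd = 1.0 / math.sqrt(m2 / normSize + eps);
    for (var i = 0; i < normSize; i++) {
      data[start + i] =
          (data[start + i] - mean) * invStd * gamma[i] + beta[i];
    }
  }
}
```

Compared with computing the mean of squares minus the squared mean, the single-pass Welford update avoids subtracting two large, nearly equal numbers.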
PyTorch Compatibility #
| Operation | PyTorch Equivalent |
|---|---|
| `BatchNormOp` | `torch.nn.BatchNorm2d` (inference) |
| `LayerNormOp` | `torch.nn.LayerNorm` |
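For inference, batch normalization with fixed running statistics reduces to a per-channel affine map `y = x * scale + shift`, where `scale = weight / sqrt(running_var + eps)` and `shift = bias - running_mean * scale`. The sketch below shows that folding on a contiguous `[C,H,W]` buffer. The names `FoldedBatchNorm`, `foldBatchNorm`, and `applyChw` are hypothetical and are not taken from the package.

```dart
import 'dart:math' as math;
import 'dart:typed_data';

/// Hypothetical container for the folded per-channel coefficients.
class FoldedBatchNorm {
  FoldedBatchNorm(this.scale, this.shift);
  final Float32List scale;
  final Float32List shift;
}

/// Folds BatchNorm parameters into scale/shift once, ahead of inference.
FoldedBatchNorm foldBatchNorm({
  required Float32List weight,
  required Float32List bias,
  required Float32List runningMean,
  required Float32List runningVar,
  double eps = 1e-5,
}) {
  final channels = weight.length;
  final scale = Float32List(channels);
  final shift = Float32List(channels);
  for (var c = 0; c < channels; c++) {
    scale[c] = weight[c] / math.sqrt(runningVar[c] + eps);
    shift[c] = bias[c] - runningMean[c] * scale[c];
  }
  return FoldedBatchNorm(scale, shift);
}

/// Applies y = x * scale + shift in place to a contiguous [C, H, W] buffer.
void applyChw(Float32List data, int channels, FoldedBatchNorm bn) {
  final plane = data.length ~/ channels; // H * W elements per channel
  for (var c = 0; c < channels; c++) {
    final s = bn.scale[c];
    final b = bn.shift[c];
    final end = (c + 1) * plane;
    for (var i = c * plane; i < end; i++) {
      data[i] = data[i] * s + b;
    }
  }
}
```

Folding moves the square root and division out of the per-element loop, leaving one multiply and one add per element.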
0.4.1 - 2026-01-09 #
Performance Optimizations #
- Dtype-specialized loops: Hot paths in transform operations now use dtype-specific code paths with direct `Float32List`/`Float64List` access, avoiding per-element switch overhead:
  - `NormalizeOp._normalize3D()`, `NormalizeOp._normalize4D()`
  - `ScaleOp._scale()`
  - `ClipOp._clip()`
  - `GaussianBlurOp._applySeparableBlur()`
  - `ResizeOp._resizeNearest()`, `_resizeBilinear()`, `_resizeBicubic()`
  - `CenterCropOp._crop3D()`, `_crop4D()`
  - `concat()` with optimized axis=0 bulk copy
- Clone-Before-Modify optimization: `ClipOp.apply()` now avoids a double copy by checking `isContiguous` before deciding whether to `clone()` or `contiguous()`
- Isolate threshold: `TensorPipeline.runAsync()` now accepts an optional `isolateThreshold` parameter (default: 100,000 elements). Small tensors skip isolate overhead and run synchronously
- Buffer reuse: `GaussianBlurOp` now pre-allocates and reuses a temp buffer across channels, reducing allocations
- Concat linear copy: `concat()` now uses pre-computed strides for linear index calculation instead of recursive index computation. Axis=0 concatenation of contiguous tensors uses bulk `setRange()` copy
- Loop unrolling: `ResizeOp._resizeBicubic()` unrolls the 4x4 kernel with pre-computed weights and indices
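The isolate threshold idea can be sketched with plain `dart:isolate`: below a given element count the work runs on the calling isolate, and above it the work is handed to `Isolate.run`. This is only an illustration of the technique. `scaleAll`, `shouldUseIsolate`, and `scaleAllAsync` are hypothetical names, and whether the package treats the threshold as inclusive is an assumption here.

```dart
import 'dart:isolate';
import 'dart:typed_data';

/// Hypothetical stand-in for a pipeline: scales every element by [factor].
Float32List scaleAll(Float32List input, double factor) {
  final out = Float32List(input.length);
  for (var i = 0; i < input.length; i++) {
    out[i] = input[i] * factor;
  }
  return out;
}

/// True when the tensor is large enough to justify isolate overhead.
/// Assumption: the threshold is inclusive.
bool shouldUseIsolate(int elementCount, {int isolateThreshold = 100000}) =>
    elementCount >= isolateThreshold;

/// Runs synchronously for small inputs and in a short-lived isolate
/// for large ones.
Future<Float32List> scaleAllAsync(
  Float32List input,
  double factor, {
  int isolateThreshold = 100000,
}) async {
  if (!shouldUseIsolate(input.length, isolateThreshold: isolateThreshold)) {
    return scaleAll(input, factor);
  }
  // The closure and the data it captures are copied to the new isolate.
  return Isolate.run(() => scaleAll(input, factor));
}
```

Spawning an isolate and copying data across has a fixed cost, so for small tensors the synchronous path is usually cheaper.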
0.4.0 - 2026-01-09 #
Added #
- Arithmetic Operations (`arithmetic_op.dart`):
  - `AddOp` - Element-wise addition (scalar or tensor)
  - `SubOp` - Element-wise subtraction (scalar or tensor)
  - `MulOp` - Element-wise multiplication (scalar or tensor)
  - `DivOp` - Element-wise division (scalar or tensor)
  - `PowOp` - Element-wise power operation
- Math Operations (`math_op.dart`):
  - `AbsOp` - Element-wise absolute value
  - `NegOp` - Element-wise negation
  - `SqrtOp` - Element-wise square root
  - `ExpOp` - Element-wise exponential (e^x)
  - `LogOp` - Element-wise natural logarithm
- Activation Functions (`activation_op.dart`):
  - `ReLUOp` - Rectified Linear Unit
  - `LeakyReLUOp` - Leaky ReLU with configurable negative slope
  - `SigmoidOp` - Sigmoid activation
  - `TanhOp` - Hyperbolic tangent activation
  - `SoftmaxOp` - Softmax along specified axis
- TensorBuffer Factory Methods:
  - `TensorBuffer.full()` - Create tensor filled with specified value
  - `TensorBuffer.random()` - Create tensor with uniform random values [0, 1)
  - `TensorBuffer.randn()` - Create tensor with standard normal distribution
  - `TensorBuffer.eye()` - Create identity matrix (supports rectangular)
  - `TensorBuffer.linspace()` - Create tensor with evenly spaced values
  - `TensorBuffer.arange()` - Create tensor with sequence values
- Utility Libraries (`lib/src/utils/`):
  - `index_utils.dart` - Index manipulation utilities (`reflectIndex`, `replicateIndex`, `circularIndex`)
  - `validation_utils.dart` - Common tensor validation patterns
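To make the three index utilities concrete, here is a minimal sketch of how out-of-range indices are commonly mapped for reflect, replicate, and circular padding. The function names come from the changelog, but the signatures and exact semantics are assumptions. In particular, the reflect variant below follows the PyTorch-style convention that does not repeat the edge element.

```dart
/// Mirrors an out-of-range index back into [0, size) without repeating
/// the edge element. For size 4 the pattern is 0 1 2 3 2 1 0 1 ...
int reflectIndex(int index, int size) {
  if (size == 1) return 0;
  final period = 2 * (size - 1);
  // Dart's % yields a non-negative result for a positive divisor.
  final i = index % period;
  return i < size ? i : period - i;
}

/// Clamps an out-of-range index to the nearest valid edge.
int replicateIndex(int index, int size) {
  if (index < 0) return 0;
  if (index >= size) return size - 1;
  return index;
}

/// Wraps an out-of-range index around, as in circular padding.
int circularIndex(int index, int size) => index % size;
```

With a source row of length 4, a padded position of `-1` reads element 1 under reflect, element 0 under replicate, and element 3 under circular.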
Changed #
- Exception Consistency: `TensorStorage._checkBounds()` now throws `IndexOutOfBoundsException` instead of `RangeError` for consistent exception handling across the library
Internal #
- Extracted duplicate `_reflectIndex` code from `pad_op.dart` and `augmentation_op.dart` into shared utility
- Added `TensorValidation` extension with `requireRank3Or4()`, `requireExactRank()`, `requireMinRank()` methods
0.3.1 - 2026-01-08 #
Added #
- Performance benchmark suite (`benchmark/` directory):
  - `tensor_creation_benchmark.dart` - Tensor creation performance
  - `tensor_ops_benchmark.dart` - Zero-copy and copy operations
  - `pipeline_benchmark.dart` - Pipeline sync/async comparison
  - `memory_benchmark.dart` - Memory usage measurement
  - `run_all.dart` - Unified benchmark runner
  - `utils/benchmark_utils.dart` - Benchmark utilities
Fixed #
- Removed unused variables in benchmark files
- Fixed lint issues in benchmark files
0.3.0 - 2026-01-08 #
Added #
- `ClipOp` - Element-wise value clamping with factory presets (unit, symmetric, uint8)
- `PadOp` - Padding with multiple modes (constant, reflect, replicate, circular)
- `SliceOp` - Python-like tensor slicing with support for negative indices and steps
- `RandomCropOp` - Random cropping for data augmentation with deterministic seed support
- `GaussianBlurOp` - Gaussian blur using separable convolution with factory presets
- `concat()` - Utility function for tensor concatenation along specified axis
Fixed #
- `concat()` axis-based copy logic now correctly handles multi-axis concatenation
Changed #
- BREAKING: Unified exception handling across the library
  - All exceptions now extend the `TensorException` sealed class
  - `ArgumentError` → `ShapeMismatchException`, `InvalidParameterException`
  - `RangeError` → `IndexOutOfBoundsException`
0.2.0 - 2026-01-04 #
Added #
- `IndexOutOfBoundsException` - Thrown when an index or axis is out of valid range
- `DTypeMismatchException` - Thrown when tensor data types do not match
Changed #
- BREAKING: Unified exception handling across the library
  - All exceptions now extend the `TensorException` sealed class
  - `ArgumentError` → `ShapeMismatchException`, `InvalidParameterException`
  - `RangeError` → `IndexOutOfBoundsException`
  - `StateError` → `NonContiguousException`, `DTypeMismatchException`
- Shape validation now happens before buffer creation in `zeros()` and `ones()`
Migration Guide #
If you were catching standard Dart exceptions, update your code:
| Before | After |
|---|---|
| `on RangeError` | `on IndexOutOfBoundsException` |
| `on ArgumentError` | `on ShapeMismatchException` or `on InvalidParameterException` |
| `on StateError` | `on NonContiguousException` or `on DTypeMismatchException` |
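The migration pattern might look like the sketch below. To stay self-contained it declares local stand-in classes that mirror the exception names from this changelog. In real code these types would come from the package, and their constructors and fields may differ. `elementAt` and `describeAccess` are hypothetical helpers.

```dart
// Local stand-ins mirroring the names in the changelog. The real classes
// are provided by the package and may carry different fields.
sealed class TensorException implements Exception {
  TensorException(this.message);
  final String message;
}

final class IndexOutOfBoundsException extends TensorException {
  IndexOutOfBoundsException(super.message);
}

final class ShapeMismatchException extends TensorException {
  ShapeMismatchException(super.message);
}

/// Hypothetical accessor that reports a bad index with the library-style
/// exception type instead of RangeError.
double elementAt(List<double> data, int index) {
  if (index < 0 || index >= data.length) {
    throw IndexOutOfBoundsException(
      'index $index not in [0, ${data.length})',
    );
  }
  return data[index];
}

/// After migration: catch the specific type first, then the sealed base
/// class as a catch-all for every other library error.
String describeAccess(List<double> data, int index) {
  try {
    return 'value ${elementAt(data, index)}';
  } on IndexOutOfBoundsException catch (e) {
    return 'out of bounds: ${e.message}';
  } on TensorException catch (e) {
    return 'tensor error: ${e.message}';
  }
}
```

Because every library exception extends one base class, a single `on TensorException` clause can replace separate handlers for `RangeError`, `ArgumentError`, and `StateError`.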
0.1.4 - 2026-01-04 #
Added #
- Reduction operations for `TensorBuffer`:
  - `sum()` - Returns the sum of all elements
  - `mean()` - Returns the arithmetic mean of all elements
  - `min()` - Returns the minimum value
  - `max()` - Returns the maximum value
- Axis-wise reduction operations:
  - `sumAxis(int axis, {bool keepDims})` - Sum along a specific axis
  - `meanAxis(int axis, {bool keepDims})` - Mean along a specific axis
  - `minAxis(int axis, {bool keepDims})` - Min along a specific axis
  - `maxAxis(int axis, {bool keepDims})` - Max along a specific axis
- Support for negative axis indexing in axis-wise operations
- Comprehensive test coverage for all reduction operations (49 tests)
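To show what an axis-wise reduction with `keepDims` and negative axis indexing computes, the sketch below sums a flat row-major buffer along one axis. It is a standalone illustration rather than the `TensorBuffer` method. The free function `sumAxis`, the `Reduced` result class, and the use of `RangeError` for an invalid axis are all assumptions of this example.

```dart
import 'dart:typed_data';

/// Hypothetical result type: the reduced data plus its shape.
class Reduced {
  Reduced(this.data, this.shape);
  final Float64List data;
  final List<int> shape;
}

/// Sums a row-major buffer of the given [shape] along [axis].
/// A negative [axis] counts from the end, so -1 is the last axis.
Reduced sumAxis(
  Float64List data,
  List<int> shape,
  int axis, {
  bool keepDims = false,
}) {
  final rank = shape.length;
  final ax = axis < 0 ? axis + rank : axis;
  if (ax < 0 || ax >= rank) {
    throw RangeError.range(axis, -rank, rank - 1, 'axis');
  }
  // View the shape as outer x axisLen x inner.
  var outer = 1;
  for (var d = 0; d < ax; d++) {
    outer *= shape[d];
  }
  var inner = 1;
  for (var d = ax + 1; d < rank; d++) {
    inner *= shape[d];
  }
  final axisLen = shape[ax];

  final out = Float64List(outer * inner); // zero-initialized
  for (var o = 0; o < outer; o++) {
    for (var a = 0; a < axisLen; a++) {
      final base = (o * axisLen + a) * inner;
      for (var i = 0; i < inner; i++) {
        out[o * inner + i] += data[base + i];
      }
    }
  }
  // keepDims keeps the reduced axis with length 1; otherwise it is dropped.
  final outShape = [
    for (var d = 0; d < rank; d++)
      if (d != ax) shape[d] else if (keepDims) 1,
  ];
  return Reduced(out, outShape);
}
```

For a `[2, 3]` input, reducing axis 0 yields shape `[3]`, while reducing axis -1 with `keepDims: true` yields shape `[2, 1]`.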
0.1.3 - 2026-01-03 #
0.1.1 - 2025-12-27 #
Added #
- Comprehensive dartdoc comments for all public API elements
- Library-level documentation with usage examples
0.1.0 - 2025-12-27 #
Added #
- Core tensor operations
  - `TensorBuffer` with shape, strides, and view/storage separation
  - `TensorStorage` as an immutable typed-data wrapper
  - `DType` enum with ONNX-compatible data types
- Transform operations
  - `ResizeOp` with nearest, bilinear, bicubic interpolation
  - `ResizeShortestOp` for aspect-ratio preserving resize
  - `CenterCropOp` for center cropping
  - `NormalizeOp` with ImageNet, CIFAR-10, symmetric presets
  - `ScaleOp` for value scaling
  - `PermuteOp` for axis reordering
  - `ToTensorOp` for HWC uint8 to CHW float32 conversion
  - `ToImageOp` for CHW float32 to HWC uint8 conversion
  - `UnsqueezeOp`, `SqueezeOp`, `ReshapeOp`, `FlattenOp` for shape manipulation
  - `TypeCastOp` for dtype conversion
- Pipeline system
  - `TensorPipeline` for chaining operations
  - `PipelinePresets` with ImageNet, ResNet, YOLO, CLIP, ViT, MobileNet presets
  - Async execution via `Isolate.run`
- Zero-copy operations
  - `transpose()` via stride manipulation
  - `squeeze()`, `unsqueeze()` as shape-only changes
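Zero-copy transposition via strides can be illustrated with a small 2D view class: the transpose swaps shape and strides while both views share the same storage. `View2D` is a hypothetical class written for this example and is not the package's `TensorBuffer`.

```dart
import 'dart:typed_data';

/// Minimal strided 2D view over shared storage (hypothetical, for
/// illustration only).
class View2D {
  View2D(this.storage, this.shape, this.strides);

  /// Contiguous row-major view of [rows] x [cols].
  View2D.rowMajor(this.storage, int rows, int cols)
      : shape = [rows, cols],
        strides = [cols, 1];

  final Float32List storage;
  final List<int> shape;
  final List<int> strides;

  /// Reads element (r, c) through the strides.
  double at(int r, int c) => storage[r * strides[0] + c * strides[1]];

  /// Swaps shape and strides; the storage is shared, not copied.
  View2D transpose() =>
      View2D(storage, [shape[1], shape[0]], [strides[1], strides[0]]);

  /// True when elements are laid out densely in row-major order.
  bool get isContiguous => strides[1] == 1 && strides[0] == shape[1];
}
```

A transposed view is generally no longer contiguous, which is why operations that need dense memory first check a flag like `isContiguous` and copy only when required.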