DonutImageUtils class

Image preprocessing pipeline for Donut.

Converts raw image files or decoded images into normalized tensors suitable for the Swin Transformer encoder.

The pipeline follows the original Donut preprocessing:

  1. Decode image bytes to RGB
  2. Optionally rotate if height > width (align long axis)
  3. Resize to target size while maintaining aspect ratio
  4. Pad to exact target dimensions with white pixels
  5. Normalize with ImageNet mean/std
  6. Convert to a tensor with shape (batch, channels, height, width)
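
The geometry behind steps 3 and 4 can be sketched as follows. This is a minimal illustration, not the library's implementation: the `FitResult` class and `fitWithin` function are hypothetical names, and centering the resized image inside the padded canvas is an assumption.

```dart
import 'dart:math' as math;

/// Hypothetical helper type: size of the resized content plus the
/// offsets at which it sits inside the padded target canvas.
class FitResult {
  final int width; // resized content width
  final int height; // resized content height
  final int padLeft; // padding columns to the left of the content
  final int padTop; // padding rows above the content
  const FitResult(this.width, this.height, this.padLeft, this.padTop);
}

/// Scales (srcW, srcH) to fit inside (dstW, dstH) without changing the
/// aspect ratio, then centers the result (centering is an assumption).
FitResult fitWithin(int srcW, int srcH, int dstW, int dstH) {
  if (srcW <= 0 || srcH <= 0 || dstW <= 0 || dstH <= 0) {
    throw ArgumentError('all dimensions must be positive');
  }
  // The smaller ratio is the largest scale at which both sides still fit.
  final scale = math.min(dstW / srcW, dstH / srcH);
  final w = math.max(1, math.min(dstW, (srcW * scale).round()));
  final h = math.max(1, math.min(dstH, (srcH * scale).round()));
  // Whatever space is left over is filled with padding (white in step 4).
  return FitResult(w, h, (dstW - w) ~/ 2, (dstH - h) ~/ 2);
}
```

For a 1000x500 source and a 960x1280 target, the limiting ratio is 0.96, so the content becomes 960x480 and the remaining 800 rows are split evenly above and below it.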

Constructors

DonutImageUtils()

Properties

hashCode → int
The hash code for this object.
no setter, inherited
runtimeType → Type
A representation of the runtime type of the object.
no setter, inherited

Methods

noSuchMethod(Invocation invocation) → dynamic
Invoked when a nonexistent method or property is accessed.
inherited
toString() → String
A string representation of this object.
inherited

Operators

operator ==(Object other) → bool
The equality operator.
inherited

Static Methods

describePipeline(DonutConfig config) → String
Returns a summary of the image preprocessing that will be applied for the given config.
fromPixels(List<int> pixels, int width, int height) → Tensor
Create a tensor from RGB pixel values (not normalized).
preprocessBytes(List<int> imageBytes, DonutConfig config) → Tensor
Preprocess raw image bytes for Donut inference.
preprocessImage(Image image, DonutConfig config) → Tensor
Preprocess a decoded image for Donut inference.
tensorToImage(Tensor tensor) → Image
Convert a tensor back to an image (for debugging/visualization).
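
To illustrate the kind of layout change a method like fromPixels performs, the sketch below converts interleaved RGB values into a planar, channels-first buffer. It does not call the library: `rgbToChw` is a hypothetical name, and both the interleaved input order (R, G, B, R, G, B, ...) and the unscaled 0-255 output range are assumptions about what "not normalized" means here.

```dart
import 'dart:typed_data';

/// Hypothetical helper: converts interleaved RGB values into a planar
/// buffer laid out as [channel][row][column], i.e. shape [3, height, width].
/// Values are copied as-is (0-255), with no scaling or normalization.
Float32List rgbToChw(List<int> pixels, int width, int height) {
  final plane = width * height;
  if (pixels.length != plane * 3) {
    throw ArgumentError('expected ${plane * 3} values, got ${pixels.length}');
  }
  final out = Float32List(plane * 3);
  for (var i = 0; i < plane; i++) {
    for (var c = 0; c < 3; c++) {
      // Pixel i, channel c moves from interleaved to planar position.
      out[c * plane + i] = pixels[i * 3 + c].toDouble();
    }
  }
  return out;
}
```

A leading batch dimension of 1 does not change the memory layout, so the same buffer can back a (1, 3, height, width) tensor.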

Constants

mean → const List<double>
ImageNet normalization mean (RGB).
std → const List<double>
ImageNet normalization std (RGB).
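
As a rough sketch of how such constants are typically applied, the code below normalizes an 8-bit channel value and inverts the operation, which is the kind of step a method like tensorToImage needs. The numeric values are the widely used ImageNet statistics; that this class's mean and std hold exactly these values is an assumption, and `imagenetMean`, `imagenetStd`, `normalize`, and `denormalize` are hypothetical names.

```dart
// Commonly used ImageNet statistics in RGB order (assumed, not read
// from the library's mean and std constants).
const List<double> imagenetMean = [0.485, 0.456, 0.406];
const List<double> imagenetStd = [0.229, 0.224, 0.225];

/// Maps an 8-bit channel value (0-255) to its normalized form:
/// scale to [0, 1], subtract the channel mean, divide by the channel std.
double normalize(int value, int channel) =>
    (value / 255.0 - imagenetMean[channel]) / imagenetStd[channel];

/// Inverts [normalize], rounding and clamping back to the 0-255 range.
int denormalize(double value, int channel) =>
    ((value * imagenetStd[channel] + imagenetMean[channel]) * 255.0)
        .round()
        .clamp(0, 255)
        .toInt();
```

Because padding is added before normalization in the pipeline above, white padding pixels end up as `normalize(255, c)` per channel rather than as a single constant.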