DonutImageUtils class
Image preprocessing pipeline for Donut.
Converts raw image files or decoded images into normalized tensors suitable for the Swin Transformer encoder.
The pipeline follows the original Donut preprocessing:
- Decode image bytes to RGB
- Optionally rotate if height > width (align long axis)
- Resize to target size while maintaining aspect ratio
- Pad to exact target dimensions with white pixels
- Normalize with ImageNet mean/std
- Convert to tensor (batch, channels, height, width)
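The resize, pad, and normalize steps above can be sketched in plain Dart. This is a minimal illustration, not the library's implementation: `fitWithin` and `padAndNormalize` are hypothetical helper names, the centered placement of the image inside the padded canvas is an assumption (the list above does not say where the padding goes), and the mean/std constants are the commonly used ImageNet RGB statistics. Decoding, rotation, and pixel resampling are omitted.

```dart
import 'dart:math' as math;
import 'dart:typed_data';

// Commonly used ImageNet channel statistics (RGB order).
const imagenetMean = [0.485, 0.456, 0.406];
const imagenetStd = [0.229, 0.224, 0.225];

/// Returns [width, height] of the largest size that fits inside the
/// target box while keeping the source aspect ratio.
List<int> fitWithin(int srcW, int srcH, int dstW, int dstH) {
  final scale = math.min(dstW / srcW, dstH / srcH);
  final w = math.min(dstW, math.max(1, (srcW * scale).round()));
  final h = math.min(dstH, math.max(1, (srcH * scale).round()));
  return [w, h];
}

/// Pads an interleaved RGB image (already resized to w x h) to
/// dstW x dstH with white pixels, normalizes each channel, and returns
/// a planar (channels, height, width) buffer.
/// Assumption: the image is centered in the padded canvas.
Float32List padAndNormalize(
    List<int> rgb, int w, int h, int dstW, int dstH) {
  if (rgb.length != w * h * 3) {
    throw ArgumentError('expected ${w * h * 3} values, got ${rgb.length}');
  }
  if (w > dstW || h > dstH) {
    throw ArgumentError('image must fit inside the target size');
  }
  final out = Float32List(3 * dstH * dstW);
  final offX = (dstW - w) ~/ 2;
  final offY = (dstH - h) ~/ 2;
  for (var c = 0; c < 3; c++) {
    // A white pixel (value 1.0 after scaling to 0..1), normalized.
    final white = (1.0 - imagenetMean[c]) / imagenetStd[c];
    for (var y = 0; y < dstH; y++) {
      for (var x = 0; x < dstW; x++) {
        final sx = x - offX, sy = y - offY;
        final inside = sx >= 0 && sx < w && sy >= 0 && sy < h;
        final value = inside
            ? (rgb[(sy * w + sx) * 3 + c] / 255.0 - imagenetMean[c]) /
                imagenetStd[c]
            : white;
        out[(c * dstH + y) * dstW + x] = value;
      }
    }
  }
  return out;
}
```

For a 1000×500 input and a 960×1280 target, `fitWithin` gives 960×480, which under the centering assumption leaves 400 rows of white padding above and below the image. Adding the batch dimension is then only a matter of treating the buffer as shape (1, 3, height, width).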
Constructors
Properties
- hashCode → int - The hash code for this object. (no setter, inherited)
- runtimeType → Type - A representation of the runtime type of the object. (no setter, inherited)
Methods
- noSuchMethod(Invocation invocation) → dynamic - Invoked when a nonexistent method or property is accessed. (inherited)
- toString() → String - A string representation of this object. (inherited)
Operators
- operator ==(Object other) → bool - The equality operator. (inherited)
Static Methods
- describePipeline(DonutConfig config) → String - Get a summary of the image preprocessing that will be applied.
- fromPixels(List<int> pixels, int width, int height) → Tensor - Create a tensor from RGB pixel values (not normalized).
- preprocessBytes(List<int> imageBytes, DonutConfig config) → Tensor - Preprocess raw image bytes for Donut inference.
- preprocessImage(Image image, DonutConfig config) → Tensor - Preprocess a decoded image for Donut inference.
- tensorToImage(Tensor tensor) → Image - Convert a tensor back to an image (for debugging/visualization).
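To make the layout conversions behind `fromPixels` and `tensorToImage` concrete, here is a rough sketch in plain Dart that does not use the library's `Tensor` or `Image` types. The helper names `hwcToChw` and `denormalizeToRgb` are hypothetical, and two details are assumptions rather than documented behavior: that unnormalized values are kept in the 0..255 range, and that converting back to an image undoes ImageNet normalization before clamping to bytes.

```dart
import 'dart:typed_data';

// Commonly used ImageNet channel statistics (RGB order).
const imagenetMean = [0.485, 0.456, 0.406];
const imagenetStd = [0.229, 0.224, 0.225];

/// Interleaved RGB values (height, width, channel) to a planar
/// (channel, height, width) float buffer, without normalization.
/// Assumption: values stay in the 0..255 range.
Float32List hwcToChw(List<int> pixels, int width, int height) {
  final plane = width * height;
  if (pixels.length != plane * 3) {
    throw ArgumentError('expected ${plane * 3} values, got ${pixels.length}');
  }
  final out = Float32List(plane * 3);
  for (var i = 0; i < plane; i++) {
    for (var c = 0; c < 3; c++) {
      out[c * plane + i] = pixels[i * 3 + c].toDouble();
    }
  }
  return out;
}

/// Planar, ImageNet-normalized floats back to interleaved RGB bytes.
/// Values are rounded and clamped to 0..255.
Uint8List denormalizeToRgb(Float32List chw, int width, int height) {
  final plane = width * height;
  if (chw.length != plane * 3) {
    throw ArgumentError('expected ${plane * 3} values, got ${chw.length}');
  }
  final out = Uint8List(plane * 3);
  for (var i = 0; i < plane; i++) {
    for (var c = 0; c < 3; c++) {
      final v =
          (chw[c * plane + i] * imagenetStd[c] + imagenetMean[c]) * 255.0;
      out[i * 3 + c] = v.round().clamp(0, 255).toInt();
    }
  }
  return out;
}
```

The planar layout matches the (channels, height, width) ordering named in the pipeline description; the clamp guards against values that drift slightly outside the byte range after floating-point round trips.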