donut library

Donut — OCR-free Document Understanding Transformer.

Pure Dart implementation of the Donut model for end-to-end document understanding without OCR.

To use this library in your code, import it:

```dart
import 'package:dart_mupdf_donut/donut.dart';
```

See the DonutModel class for usage examples.
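As a rough orientation before the class list below, here is a hypothetical end-to-end sketch. All member names used here (`fromPretrained`, `preprocess`, `generate`, the prompt format) are assumptions for illustration, not the library's confirmed API; consult the DonutModel docs for the real signatures.

```dart
import 'package:dart_mupdf_donut/donut.dart';

Future<void> main() async {
  // Assumed API: load pretrained weights via DonutWeightLoader-backed factory.
  final model = await DonutModel.fromPretrained('path/to/weights');

  // Assumed API: DonutImageUtils turns an image file into a model-ready Tensor.
  final Tensor pixels = DonutImageUtils.preprocess('invoice.png');

  // Assumed API: run the Swin encoder once, then autoregressive BART decoding
  // (using the KVCache) from a task prompt.
  final DonutResult result = model.generate(
    pixels,
    prompt: '<s_docvqa><s_question>What is the total?</s_question><s_answer>',
  );

  print(result);
}
```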

Classes

BartAttention
Scaled dot-product attention used in the BART decoder.
BartDecoder
Complete BART decoder for Donut.
BartDecoderLayer
A single BART decoder layer.
Conv2d
2D Convolution layer.
DonutConfig
Configuration class for the Donut model.
DonutImageUtils
Image preprocessing pipeline for Donut.
DonutModel
The complete Donut model combining Swin Transformer encoder and BART decoder.
DonutResult
Result of a Donut inference pass.
DonutTokenizer
Tokenizer for Donut model.
DonutWeightLoader
Loads pretrained weights into a Donut model.
Dropout
Dropout layer — identity in inference mode.
Embedding
Lookup table embedding layer.
FeedForward
Position-wise feed-forward network: Linear → GELU → Linear.
GELU
GELU activation function (Gaussian Error Linear Unit).
KVCache
Key-value cache for auto-regressive decoding.
LayerNorm
Layer normalization over the last dimension.
Linear
Fully connected (linear) layer: y = x @ W^T + b
MultiHeadAttention
Scaled dot-product multi-head attention.
PatchEmbed
Splits the input image into non-overlapping patches and projects them to an embedding space using a convolution.
PatchMerging
Downsamples the spatial resolution by 2x and doubles the channel dimension.
ReLU
ReLU activation function.
Softmax
Softmax activation along a dimension.
SwinEncoder
Complete Swin Transformer encoder as used in Donut.
SwinLayer
A single Swin Transformer stage consisting of multiple blocks and an optional patch merging downsample layer.
SwinTransformerBlock
A single Swin Transformer block with window attention.
Tensor
An N-dimensional tensor backed by a flat Float32List.
WeightExportGuide
Utility for converting PyTorch/HuggingFace model weights to a format suitable for the Dart Donut model.
WindowAttention
Window-based multi-head self-attention (W-MSA / SW-MSA).
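For reference, GELU (listed above) is often computed with the tanh approximation rather than the exact erf form. The stand-alone sketch below assumes that variant and is illustrative only, not the library's implementation; note that `dart:math` has no built-in `tanh`, so it is derived from `exp`.

```dart
import 'dart:math' as math;

/// dart:math provides no tanh, so build it from exp: tanh(x) = (e^2x - 1) / (e^2x + 1).
double tanh(double x) {
  final e2x = math.exp(2 * x);
  return (e2x - 1) / (e2x + 1);
}

/// GELU, tanh approximation (assumption: the common approximation, not
/// necessarily what this library's GELU class computes internally):
/// 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
double gelu(double x) =>
    0.5 * x * (1 + tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * x * x * x)));

void main() {
  // gelu(0) is exactly 0; large positive inputs pass through nearly unchanged.
  print(gelu(0.0)); // 0.0
  print(gelu(3.0)); // ≈ 2.996
}
```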