donut library
Donut — OCR-free Document Understanding Transformer.
Pure Dart implementation of the Donut model for end-to-end document understanding without OCR.
```dart
import 'package:dart_mupdf_donut/donut.dart';
```
See the DonutModel class for usage examples.
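A minimal end-to-end sketch of how the pieces below fit together. The constructor and method names here (`load`, `preprocess`, `generate`) are assumptions for orientation only; the actual entry points are documented on DonutModel, DonutTokenizer, and DonutWeightLoader.

```dart
import 'dart:typed_data';
import 'package:dart_mupdf_donut/donut.dart';

// Hypothetical API names, shown for illustration — consult the class
// documentation below for the real constructors and methods.
Future<void> main() async {
  final config = DonutConfig();              // model hyperparameters
  final model = DonutModel(config);          // Swin encoder + BART decoder
  await DonutWeightLoader.load(model, 'assets/donut.weights'); // assumed signature
  final image = Uint8List(0);                // your raw image bytes here
  final input = DonutImageUtils.preprocess(image);  // assumed helper name
  final DonutResult result = model.generate(input); // assumed method name
  print(result);
}
```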
Classes
- BartAttention
- Scaled dot-product attention used in the BART decoder.
- BartDecoder
- Complete BART decoder for Donut.
- BartDecoderLayer
- A single BART decoder layer.
- Conv2d
- 2D Convolution layer.
- DonutConfig
- Configuration class for the Donut model.
- DonutImageUtils
- Image preprocessing pipeline for Donut.
- DonutModel
- The complete Donut model combining Swin Transformer encoder and BART decoder.
- DonutResult
- Result of a Donut inference pass.
- DonutTokenizer
- Tokenizer for the Donut model.
- DonutWeightLoader
- Loads pretrained weights into a Donut model.
- Dropout
- Dropout layer — identity in inference mode.
- Embedding
- Lookup table embedding layer.
- FeedForward
- Position-wise feed-forward network: Linear → GELU → Linear.
- GELU
- GELU activation function (Gaussian Error Linear Unit).
- KVCache
- Key-value cache for auto-regressive decoding.
- LayerNorm
- Layer normalization over the last dimension.
- Linear
- Fully connected (linear) layer: y = x @ W^T + b
- MultiHeadAttention
- Scaled dot-product multi-head attention.
- PatchEmbed
- Splits the input image into non-overlapping patches and projects them to an embedding space using a convolution.
- PatchMerging
- Downsamples the spatial resolution by 2x and doubles the channel dimension.
- ReLU
- ReLU activation function.
- Softmax
- Softmax activation along a dimension.
- SwinEncoder
- Complete Swin Transformer encoder as used in Donut.
- SwinLayer
- A single Swin Transformer stage consisting of multiple blocks and an optional patch merging downsample layer.
- SwinTransformerBlock
- A single Swin Transformer block with window attention.
- Tensor
- An N-dimensional tensor backed by a flat Float32List.
- WeightExportGuide
- Utility for converting PyTorch/HuggingFace model weights to a format suitable for the Dart Donut model.
- WindowAttention
- Window-based multi-head self-attention (W-MSA / SW-MSA).
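Several of the classes above (BartAttention, MultiHeadAttention, WindowAttention) are variants of the same core operation: scaled dot-product attention. A conceptual sketch for a single query and a single head, using plain Dart lists rather than the library's Tensor class:

```dart
import 'dart:math' as math;

/// Illustrative only — not the library's Tensor-based implementation.
/// Computes softmax(q·kᵢ / √d) weights over the keys, then returns the
/// corresponding weighted sum of the value vectors.
List<double> scaledDotProductAttention(
    List<double> query, List<List<double>> keys, List<List<double>> values) {
  final d = query.length;
  // Scores: dot product of the query with each key, scaled by 1/sqrt(d).
  final scores = keys.map((k) {
    var dot = 0.0;
    for (var i = 0; i < d; i++) {
      dot += query[i] * k[i];
    }
    return dot / math.sqrt(d);
  }).toList();
  // Softmax over the scores (subtract the max for numerical stability).
  final maxScore = scores.reduce(math.max);
  final exps = scores.map((s) => math.exp(s - maxScore)).toList();
  final sum = exps.reduce((a, b) => a + b);
  final weights = exps.map((e) => e / sum).toList();
  // Output: attention-weighted sum of the value vectors.
  final dim = values.first.length;
  final out = List<double>.filled(dim, 0.0);
  for (var i = 0; i < values.length; i++) {
    for (var j = 0; j < dim; j++) {
      out[j] += weights[i] * values[i][j];
    }
  }
  return out;
}
```

The multi-head versions run this in parallel over several learned projections of the same input; WindowAttention additionally restricts the keys and values to a local spatial window.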