VideoTransformer class
A Transformer model adapted for video classification (e.g., action recognition).
This model expects pre-extracted frame-level or clip-level embeddings, passed as a sequence of ValueVectors (one per frame or clip). These embeddings can come from a pre-trained CNN (such as ResNet) or from a Vision Transformer applied to each frame.
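As a rough illustration of the expected input shape, the sketch below shows how a sequence of per-frame embeddings might be combined with a position-embedding table and checked against a maximum sequence length. It is not taken from the library: plain `List<double>` stands in for `ValueVector`, and the helper `addPositions` is a hypothetical name. It is also an assumption that the model adds `positionEmbeddings` element-wise and rejects sequences longer than `maxVideoSequenceLength`.

```dart
/// Hypothetical helper, not part of the library: adds one position embedding
/// to each frame embedding, element-wise. Plain List<double> is used here as
/// a stand-in for ValueVector.
List<List<double>> addPositions(
    List<List<double>> frames, List<List<double>> positionEmbeddings) {
  // Assumption: the position table has one row per allowed time step, so its
  // length plays the role of maxVideoSequenceLength.
  if (frames.length > positionEmbeddings.length) {
    throw ArgumentError('Sequence length ${frames.length} exceeds the '
        'maximum of ${positionEmbeddings.length}.');
  }
  return [
    for (var t = 0; t < frames.length; t++)
      [
        for (var d = 0; d < frames[t].length; d++)
          frames[t][d] + positionEmbeddings[t][d],
      ],
  ];
}

void main() {
  // Two frames, each already embedded into a 2-dimensional vector
  // (for example by a per-frame CNN).
  final frames = [
    [1.0, 2.0],
    [3.0, 4.0],
  ];
  // Position table with room for up to three time steps.
  final positions = [
    [0.5, 0.5],
    [0.0, 1.0],
    [0.0, 0.0],
  ];
  print(addPositions(frames, positions));
}
```

In the class itself, the resulting sequence would presumably be passed through `transformerEncoder` and `mlpHead` to produce one output per class; that step is omitted here because the page does not describe it.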
Constructors
Properties
- embedSize → int
  final
- frameEmbedDim → int
  final
- frameProjection → Layer?
  final
- hashCode → int
  The hash code for this object.
  no setter, inherited
- maxVideoSequenceLength → int
  final
- mlpHead → Layer
  final
- numClasses → int
  final
- numHeads → int
  final
- numLayers → int
  final
- positionEmbeddings → List<ValueVector>
  final
- runtimeType → Type
  A representation of the runtime type of the object.
  no setter, inherited
- transformerEncoder → TransformerEncoder
  final
Methods
- forward(List<ValueVector> videoEmbeddings) → List<Value>
  The forward pass for the Video Transformer.
- noSuchMethod(Invocation invocation) → dynamic
  Invoked when a nonexistent method or property is accessed.
  inherited
- parameters() → List<Value>
  override
- toString() → String
  A string representation of this object.
  inherited
- zeroGrad() → void
  inherited
Operators
- operator ==(Object other) → bool
  The equality operator.
  inherited