MultiHeadAttention class

Implements the Multi-Head Self-Attention mechanism.

This layer runs multiple SingleHeadAttention heads in parallel, allowing the model to jointly attend to information from different representation subspaces. The outputs of the heads are concatenated and passed through a final linear projection.

This is the core component of the Transformer architecture.
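A minimal usage sketch, assuming the constructor and `call` API documented below. The `Tensor.matrix` factory and its named parameters are illustrative assumptions, not part of this documented API:

```dart
// Hypothetical usage sketch; `Tensor.matrix` is an assumed factory
// used here only to stand in for "a 10 x 64 input matrix".
final attention = MultiHeadAttention(64, 8); // dModel = 64, numHeads = 8

// A sequence of 10 tokens, each a 64-dimensional embedding.
final input = Tensor.matrix(rows: 10, cols: 64);

// `call` is the public entry point: per the docs below, `build`
// initializes parameters on first input, then `forward` runs.
final output = attention.call(input);
```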

Constructors

MultiHeadAttention(int dModel, int numHeads)

Properties

attentionHeads → List<SingleHeadAttention>
The parallel attention heads whose outputs are concatenated.
getter/setter pair
dModel → int
The dimensionality of the model's embeddings.
getter/setter pair
hashCode → int
The hash code for this object.
no setter, inherited
name → String
A user-friendly name for the layer (e.g., 'dense', 'lstm').
getter/setter pair, override-getter
numHeads → int
The number of parallel attention heads.
getter/setter pair
parameters → List<Tensor>
A list of all trainable tensors (weights and biases) in the layer.
no setter, override
runtimeType → Type
A representation of the runtime type of the object.
no setter, inherited
Wo → Tensor<Matrix>
The final linear projection applied to the concatenated head outputs.
getter/setter pair

Methods

build(Tensor input) → void
Initializes the layer's parameters based on the shape of the first input.
override
call(Tensor input) → Tensor
The public, callable interface for the layer.
inherited
forward(Tensor input) → Tensor<Matrix>
The core logic of the layer's transformation.
override
noSuchMethod(Invocation invocation) → dynamic
Invoked when a nonexistent method or property is accessed.
inherited
toString() → String
A string representation of this object.
inherited
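Based on the class description (heads run in parallel, outputs concatenated, then projected through Wo), the forward pass could be sketched as follows. This is a hypothetical implementation: the helpers `Tensor.concatColumns` and `matMul` are assumed names, not confirmed parts of this library:

```dart
// Hypothetical sketch of MultiHeadAttention.forward; the helper
// methods (concatColumns, matMul) are illustrative assumptions.
Tensor<Matrix> forward(Tensor input) {
  // Run every head over the same input, collecting each head's output.
  final headOutputs = [
    for (final head in attentionHeads) head.forward(input),
  ];

  // Concatenate the per-head outputs along the feature axis,
  // then apply the final linear projection Wo.
  final concatenated = Tensor.concatColumns(headOutputs);
  return concatenated.matMul(Wo);
}
```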

Operators

operator ==(Object other) → bool
The equality operator.
inherited