MultiHeadAttention class

Implements the Multi-Head Self-Attention mechanism.

This layer runs multiple SingleHeadAttention heads in parallel, allowing the model to jointly attend to information from different representation subspaces. The outputs of the heads are concatenated and passed through a final linear projection.

This is the core component of the Transformer architecture.
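A minimal usage sketch, assuming the constructor and `call` API documented below. The `Tensor.matrix` factory and its named parameters are illustrative assumptions, not part of this documented API:

```dart
// Hypothetical usage sketch; `Tensor.matrix` is an assumed factory
// used here only to stand in for "a 10 x 64 input matrix".
final attention = MultiHeadAttention(64, 8); // dModel = 64, numHeads = 8

// A sequence of 10 tokens, each a 64-dimensional embedding.
final input = Tensor.matrix(rows: 10, cols: 64);

// `call` is the public entry point: per the docs below, `build`
// initializes parameters on first input, then `forward` runs.
final output = attention.call(input);
```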

Constructors

MultiHeadAttention(int dModel, int numHeads)

Properties

attentionHeads → List<SingleHeadAttention>
The parallel attention heads whose outputs are concatenated.
getter/setter pair
dModel → int
The dimensionality of the model's embeddings.
getter/setter pair
hashCode → int
The hash code for this object.
no setter, inherited
name → String
A user-friendly name for the layer (e.g., 'dense', 'lstm').
getter/setter pair, override-getter
numHeads → int
The number of parallel attention heads.
getter/setter pair
parameters → List<Tensor>
A list of all trainable tensors (weights and biases) in the layer.
no setter, override
runtimeType → Type
A representation of the runtime type of the object.
no setter, inherited
Wo → Tensor<Matrix>
The final linear projection applied to the concatenated head outputs.
getter/setter pair

Methods

build(Tensor input) → void
Initializes the layer's parameters based on the shape of the first input.
override
call(Tensor input) → Tensor
The public, callable interface for the layer.
inherited
forward(Tensor input) → Tensor<Matrix>
The core logic of the layer's transformation.
override
noSuchMethod(Invocation invocation) → dynamic
Invoked when a nonexistent method or property is accessed.
inherited
toString() → String
A string representation of this object.
inherited
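Based on the class description (heads run in parallel, outputs concatenated, then projected through Wo), the forward pass could be sketched as follows. This is a hypothetical implementation: the helpers `Tensor.concatColumns` and `matMul` are assumed names, not confirmed parts of this library:

```dart
// Hypothetical sketch of MultiHeadAttention.forward; the helper
// methods (concatColumns, matMul) are illustrative assumptions.
Tensor<Matrix> forward(Tensor input) {
  // Run every head over the same input, collecting each head's output.
  final headOutputs = [
    for (final head in attentionHeads) head.forward(input),
  ];

  // Concatenate the per-head outputs along the feature axis,
  // then apply the final linear projection Wo.
  final concatenated = Tensor.concatColumns(headOutputs);
  return concatenated.matMul(Wo);
}
```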

Operators

operator ==(Object other) → bool
The equality operator.
inherited