transformer/example3 library

Classes

SGD
A simple Stochastic Gradient Descent (SGD) optimizer.
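
As a rough illustration of what a plain SGD optimizer does, the sketch below applies the update p ← p − learningRate · g to a list of parameters. The function name `sgdStep`, its signature, and the use of `List<double>` are assumptions made for this example. They are not taken from this library's SGD class, whose constructor and methods are not shown on this page.

```dart
// Hypothetical helper, NOT this library's SGD class: applies one plain SGD
// update, p <- p - learningRate * g, to each parameter in place.
void sgdStep(List<double> params, List<double> grads, double learningRate) {
  if (params.length != grads.length) {
    throw ArgumentError('params and grads must have the same length');
  }
  for (var i = 0; i < params.length; i++) {
    params[i] -= learningRate * grads[i];
  }
}

void main() {
  // Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
  final params = <double>[0.0];
  for (var step = 0; step < 100; step++) {
    final grads = <double>[2 * (params[0] - 3)];
    sgdStep(params, grads, 0.1);
  }
  print(params[0]); // moves toward the minimizer at w = 3
}
```

With a learning rate of 0.1, each step shrinks the distance to the minimizer by a constant factor. Larger rates converge faster on this toy problem but can diverge on others.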

Functions

exampleLargerTransformerTraining() → void
exampleLayerNorm() → void
exampleMultiHeadAttention() → void
exampleSelfAttention() → void
exampleSequenceGeneration() → void
exampleValueMatrix() → void
main() → void