forward method
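idx: List of token IDs. Its length must not exceed blockSize, otherwise an ArgumentError is thrown.
tracker: List that collects the intermediate tensors created during the pass so the caller can dispose of them afterwards.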
Implementation
Tensor forward(List<int> idx, List<Tensor> tracker) {
  final int T = idx.length;
  // Reject sequences longer than the model's context window.
  if (T > blockSize) {
    throw ArgumentError("Sequence length $T exceeds block size $blockSize");
  }

  // 1. GPU-accelerated embedding lookup.
  // Calls the 'embedding_forward' CUDA kernel: no Dart-side loops,
  // just one kernel launch.
  final x = Tensor.embeddings(idx, wte, wpe);
  tracker.add(x); // Register the tensor so it can be disposed later.

  // 2. Pass the embeddings through the blocks, tracking intermediates.
  return _processThroughBlocks(x, tracker);
}
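The tracker pattern can be sketched from the caller's side as follows. This is a minimal illustration, not the library's actual API: the Tensor class here is a stand-in, its dispose() method and disposed flag are assumed names, and the forward function is a simplified stub that only mimics how tensors are registered in the tracker.

```dart
// Stand-in for the native-backed Tensor class.
// `dispose` and `disposed` are hypothetical names used for illustration.
class Tensor {
  bool disposed = false;

  void dispose() {
    // A real implementation would release GPU memory here.
    disposed = true;
  }
}

// Simplified stub of forward: every tensor it creates is added to `tracker`.
Tensor forward(List<int> idx, List<Tensor> tracker) {
  final x = Tensor(); // Pretend embedding output.
  tracker.add(x);
  final out = Tensor(); // Pretend output of the transformer blocks.
  tracker.add(out);
  return out;
}

// Runs one forward pass and disposes every tracked tensor afterwards,
// even if an exception is thrown. Returns how many tensors were tracked.
int runOnce(List<int> idx) {
  final tracker = <Tensor>[];
  try {
    final out = forward(idx, tracker);
    // Read any results from `out` here, before it is disposed.
    if (out.disposed) {
      throw StateError('output was disposed too early');
    }
    return tracker.length;
  } finally {
    for (final t in tracker) {
      t.dispose();
    }
    tracker.clear();
  }
}

void main() {
  print(runOnce([1, 2, 3]));
}
```

Putting the disposal loop in a finally block means the tracked tensors are released even when forward throws, for example on a sequence that exceeds the block size.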