forward method
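Runs the forward pass of the face detection and recognition model.
Takes the image as a flattened list of pixel values (List<double>). Returns a Map whose values are lists of per-object predictions (boxes, logits, embeddings), each a List<ValueVector>. Only the first backbone output, the CLS token, is passed to the detection head.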
Implementation
Map<String, List<ValueVector>> forward(List<double> imageData) {
  // Get contextualized features from the ViT backbone
  final List<ValueVector> encodedFeatures = backbone.forward(imageData);

  // For this simple detection head, we'll use the CLS token's output
  // as the global image representation for prediction.
  final ValueVector clsFeature = encodedFeatures[0];

  // Pass the CLS feature to the detection head, which will produce multiple predictions
  final Map<String, List<ValueVector>> predictions =
      detectionHead.forward(clsFeature);

  return predictions;
}
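The data flow through forward can be exercised in isolation with stand-in components. The following is a minimal sketch, not the library's actual classes: ValueVector here is a hypothetical wrapper around List<double>, StubBackbone, StubDetectionHead and StubModel are invented for illustration, and the map keys 'boxes', 'logits' and 'embeddings' are assumed from the description above rather than confirmed by the source.

```dart
// Hypothetical stand-ins: the real ValueVector, backbone, and detection head
// are defined elsewhere in the library and are not shown on this page.
class ValueVector {
  final List<double> values;
  ValueVector(this.values);
}

class StubBackbone {
  // Returns one feature vector per token; index 0 plays the role of the
  // CLS token (here simply the mean pixel value).
  List<ValueVector> forward(List<double> imageData) {
    final mean = imageData.isEmpty
        ? 0.0
        : imageData.reduce((a, b) => a + b) / imageData.length;
    return [ValueVector([mean]), ValueVector(imageData)];
  }
}

class StubDetectionHead {
  final int numQueries;
  StubDetectionHead(this.numQueries);

  // Emits numQueries placeholder predictions from a single global feature.
  // The key names are assumptions, not confirmed by the source.
  Map<String, List<ValueVector>> forward(ValueVector cls) {
    List<ValueVector> repeat() =>
        List.generate(numQueries, (_) => ValueVector(List.of(cls.values)));
    return {'boxes': repeat(), 'logits': repeat(), 'embeddings': repeat()};
  }
}

class StubModel {
  final backbone = StubBackbone();
  final detectionHead = StubDetectionHead(3);

  // Same three steps as the method above: encode, pick CLS, predict.
  Map<String, List<ValueVector>> forward(List<double> imageData) {
    final encodedFeatures = backbone.forward(imageData);
    final clsFeature = encodedFeatures[0];
    return detectionHead.forward(clsFeature);
  }
}

void main() {
  final predictions = StubModel().forward([0.0, 0.5, 1.0]);
  for (final entry in predictions.entries) {
    print('${entry.key}: ${entry.value.length} predictions');
  }
}
```

Because the head sees only the CLS feature, every prediction in this sketch is derived from the same global vector; a head that needs per-patch information would have to consume the remaining entries of encodedFeatures as well.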