forward method

Map<String, List<ValueVector>> forward(
  List<double> imageData
)

Forward pass for the object detector.

Takes the image as a flattened list of pixel values. Returns a Map whose values are lists of object predictions.
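As an illustration of what "flattened" can mean here, the sketch below turns a nested height x width x channels image into a single List<double>. The helper flattenImage, the row-major, channels-last ordering, and the [0, 1] value range are assumptions made for this example; the layout the backbone actually expects is not stated on this page and should be checked against the backbone's own documentation.

```dart
/// Hypothetical helper: flattens a height x width x channels image into
/// one list, row by row, pixel by pixel, channel by channel.
/// The row-major, channels-last order is an assumption of this sketch.
List<double> flattenImage(List<List<List<double>>> image) {
  return [
    for (final row in image)
      for (final pixel in row) ...pixel,
  ];
}

void main() {
  // A made-up 2x2 RGB image with values already scaled to [0, 1].
  final image = [
    [
      [0.0, 0.1, 0.2],
      [0.3, 0.4, 0.5],
    ],
    [
      [0.6, 0.7, 0.8],
      [0.9, 1.0, 0.5],
    ],
  ];

  final imageData = flattenImage(image);
  print(imageData.length); // 12 = 2 rows * 2 columns * 3 channels
}
```

A list built this way has the List<double> type that forward accepts as imageData.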

The encodedFeatures returned by the backbone are ordered as: CLS_token_embedding, patch_embedding_1, patch_embedding_2, ... This simple head uses only the CLS token's output, the first element, as the global image representation.
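The selection step can be sketched in isolation. The example below uses plain List<double> values as stand-ins for ValueVector, and the helper name selectClsToken is hypothetical; it only shows that the CLS embedding is taken from index 0 while the patch embeddings are ignored.

```dart
/// Hypothetical helper: returns the first token of an encoded sequence,
/// which this sketch assumes is the CLS token embedding.
List<double> selectClsToken(List<List<double>> encodedTokens) {
  if (encodedTokens.isEmpty) {
    throw ArgumentError('encodedTokens must contain at least the CLS token');
  }
  return encodedTokens.first;
}

void main() {
  // Made-up two-dimensional embeddings standing in for backbone output.
  final encoded = [
    [0.9, 0.1], // CLS token embedding
    [0.2, 0.3], // patch embedding 1
    [0.4, 0.5], // patch embedding 2
  ];

  print(selectClsToken(encoded)); // [0.9, 0.1]
}
```

Using only the CLS token keeps the head simple, at the cost of discarding the per-patch embeddings that carry spatial detail.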

Implementation

Map<String, List<ValueVector>> forward(List<double> imageData) {
  // Get contextualized features from the ViT backbone
  final List<ValueVector> encodedFeatures = backbone.forward(imageData);

  // For this simple detection head, we'll use the CLS token's output
  // as the global image representation for prediction.
  final ValueVector clsFeature = encodedFeatures[0];

  // Pass the CLS feature to the detection head, which will produce multiple predictions
  final Map<String, List<ValueVector>> predictions =
      detectionHead.forward(clsFeature);

  return predictions;
}