On-the-Fly Point Annotation for Fast Medical Video Labeling
Adrien Meyer, Jean-Paul Mazellier, Jeremy Dana, Nicolas Padoy
TL;DR
This work tackles the costly process of bounding-box annotation in medical video data by introducing an on-the-fly (OTF) point annotation approach, in which the annotator keeps labeling continuously while the video plays. The resulting point annotations are fed into point-to-box teacher models (Point-DETR, Group R-CNN) to generate pseudo-box labels within a weakly semi-supervised learning framework, enabling efficient self-training of detectors. On the STARHE liver ultrasound dataset, the method achieves a $3.2\times$ speed-up in annotation time and a mean AP@50 improvement of $6.51 \pm 0.98$ over traditional annotation at equivalent budgets, with results suggesting that pseudo-labels can rival or surpass fully supervised baselines under certain budgets. The approach can be implemented on any annotation platform to accelerate the integration of deep learning in video-based medical research, reducing expert workload while maintaining or improving detection performance.
Abstract
Purpose: In medical research, deep learning models rely on high-quality annotated data, which is often laborious and time-consuming to produce. This is particularly true for detection tasks, where bounding box annotations are required. The need to adjust two corners makes the process inherently frame-by-frame. Given the scarcity of experts' time, efficient annotation methods suitable for clinicians are needed. Methods: We propose an on-the-fly method for live video annotation to enhance annotation efficiency. In this approach, a continuous single-point annotation is maintained by keeping the cursor on the object in a live video, avoiding the tedious pausing and repetitive navigation inherent in traditional annotation methods. This novel annotation paradigm inherits the point annotation's ability to generate pseudo-labels using a point-to-box teacher model. We empirically evaluate this approach by developing a dataset and comparing on-the-fly annotation time against that of the traditional annotation method. Results: Using our method, annotation was 3.2× faster than with the traditional annotation technique. We achieved a mean improvement of 6.51 ± 0.98 AP@50 over the conventional method at equivalent annotation budgets on the developed dataset. Conclusion: Without bells and whistles, our approach offers a significant speed-up in annotation tasks. It can be easily implemented on any annotation platform to accelerate the integration of deep learning in video-based medical research.
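To make the on-the-fly idea concrete, the following is a minimal sketch of how a stream of cursor samples recorded during playback could be turned into one point annotation per video frame. It is not the authors' implementation: the `CursorSample` type, the `active` flag (assumed to mean that the annotator is signalling that the cursor is on the object, for example by holding a button), and the `points_per_frame` function are all hypothetical names introduced for illustration.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional, Tuple


class CursorSample(NamedTuple):
    """One cursor reading captured while the video plays (hypothetical format)."""
    t: float       # seconds since playback started
    x: float       # cursor x position in image coordinates
    y: float       # cursor y position in image coordinates
    active: bool   # assumed: True while the annotator marks the object


def points_per_frame(
    samples: List[CursorSample], fps: float, n_frames: int
) -> List[Optional[Tuple[float, float]]]:
    """Assign each frame the most recent cursor sample at or before its timestamp.

    Frames with no preceding sample, or whose latest sample is inactive,
    receive None (no point annotation). Samples must be sorted by time.
    """
    times = [s.t for s in samples]
    points: List[Optional[Tuple[float, float]]] = []
    for k in range(n_frames):
        frame_t = k / fps
        # Index of the last sample with timestamp <= frame_t, or -1 if none.
        i = bisect_right(times, frame_t) - 1
        if i >= 0 and samples[i].active:
            points.append((samples[i].x, samples[i].y))
        else:
            points.append(None)
    return points


if __name__ == "__main__":
    trace = [
        CursorSample(0.00, 10.0, 20.0, True),
        CursorSample(0.05, 12.0, 22.0, True),
        CursorSample(0.12, 0.0, 0.0, False),  # annotator released the object
    ]
    print(points_per_frame(trace, fps=10.0, n_frames=3))
```

In a pipeline like the one described above, the per-frame points produced this way would then be passed to a point-to-box teacher model to obtain pseudo-box labels; that step is outside the scope of this sketch.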
