Shifting to Machine Supervision: Annotation-Efficient Semi and Self-Supervised Learning for Automatic Medical Image Segmentation and Classification
Pranav Singh, Raviteja Chukkapalli, Shravan Chaudhari, Luoyao Chen, Mei Chen, Jinqian Pan, Craig Smuda, Jacopo Cirrone
TL;DR
The paper tackles annotation bottlenecks in medical imaging by proposing the S4MI pipeline, which combines self-supervised and semi-supervised learning for annotation-efficient classification and segmentation. It demonstrates that self-supervised pretraining (notably CASS) surpasses supervised transfer learning for classification, while semi-supervised segmentation, which also draws on unlabeled data, achieves higher IoU than fully supervised methods despite using 50% fewer labels. Across three medical imaging datasets, the approach achieves robust performance with reduced labeling effort, and the code is open source for reproducibility. This work advances practical machine supervision in healthcare by reducing labeling costs while maintaining or improving accuracy.
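For readers less familiar with the segmentation metric named above, IoU (Intersection over Union) is the number of pixels marked positive in both the predicted and reference masks, divided by the number marked positive in either. The following is a minimal sketch for binary masks stored as nested lists. The function name `iou`, the toy masks, and the convention for two empty masks are illustrative assumptions and are not taken from the S4MI code.

```python
def iou(pred, target):
    """Intersection over Union for two binary masks of equal shape."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0  # pixel positive in both masks
            union += 1 if (p or t) else 0          # pixel positive in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy 2x3 masks (hypothetical data, for illustration only).
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

score = iou(pred, target)  # 2 overlapping pixels out of 4 in the union
print(score)
```

In practice the metric is usually computed per class and averaged over a test set; the paper's exact averaging scheme is not stated in this section.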
Abstract
Advancements in clinical treatment are increasingly constrained by the limitations of supervised learning techniques, which depend heavily on large volumes of annotated data. The annotation process is not only costly but also demands substantial time from clinical specialists. Addressing this issue, we introduce the S4MI (Self-Supervision and Semi-Supervision for Medical Imaging) pipeline, a novel approach that leverages advancements in self-supervised and semi-supervised learning. These techniques engage in auxiliary tasks that do not require labeling, thus simplifying the scaling of machine supervision compared to fully-supervised methods. Our study benchmarks these techniques on three distinct medical imaging datasets to evaluate their effectiveness in classification and segmentation tasks. Notably, we observed that self-supervised learning significantly surpassed the performance of supervised methods in classification on all evaluated datasets. Remarkably, the semi-supervised approach demonstrated superior outcomes in segmentation, outperforming fully-supervised methods while using 50% fewer labels across all datasets. In line with our commitment to contributing to the scientific community, we have made the S4MI code openly accessible, allowing for broader application and further development of these methods.
