BarlowTwins-CXR: Enhancing Chest X-Ray abnormality localization in heterogeneous data with cross-domain self-supervised learning
Haoyue Sheng, Linrui Ma, Jean-Francois Samson, Dianbo Liu
TL;DR
The paper tackles domain inconsistency in cross-domain chest X-ray abnormality localization by introducing BarlowTwins-CXR, a two-phase training strategy that first performs self-supervised pretraining on the NIH-CXR dataset and then fine-tunes on VinDr-CXR using Faster R-CNN with FPN. The approach improves localization performance, raising $mAP_{50}$ by roughly 3 percentage points over ImageNet pretraining and producing heatmaps that align better with ground-truth lesions; linear evaluation $AUC$ results indicate that the gains are most pronounced in low-data regimes. The findings suggest that self-supervised pretraining on domain-relevant unlabeled medical images enhances generalizability across heterogeneous CXR data and can reduce radiologist workload by improving automated localization. The method has practical implications for deploying robust CXR analysis tools in diverse clinical settings with limited labeled data, although the authors acknowledge its computational cost and the need for larger bounding-box datasets to support broader generalization.
Abstract
Background: Chest X-ray imaging-based abnormality localization, essential in diagnosing various diseases, faces significant clinical challenges due to complex interpretations and the growing workload of radiologists. While recent advances in deep learning offer promising solutions, domain inconsistency in cross-domain transfer learning remains a critical issue that hampers the efficiency and accuracy of diagnostic processes. This study aims to address the domain inconsistency problem and improve automatic abnormality localization in heterogeneous chest X-ray images by developing a self-supervised learning strategy called "BarlowTwins-CXR". Methods: We utilized two publicly available datasets: the NIH Chest X-ray Dataset and the VinDr-CXR. The BarlowTwins-CXR approach was conducted in a two-stage training process. First, self-supervised pre-training was performed using an adjusted Barlow Twins algorithm on the NIH dataset with a ResNet50 backbone pre-trained on ImageNet. This was followed by supervised fine-tuning on the VinDr-CXR dataset using Faster R-CNN with Feature Pyramid Network (FPN). Results: Our experiments showed a significant improvement in model performance with BarlowTwins-CXR. The approach achieved a 3% increase in mAP50 accuracy compared to traditional ImageNet pre-trained models. In addition, the Ablation CAM method revealed enhanced precision in localizing chest abnormalities. Conclusion: BarlowTwins-CXR significantly enhances the efficiency and accuracy of chest X-ray image-based abnormality localization, outperforming traditional transfer learning methods and effectively overcoming domain inconsistency in cross-domain scenarios. Our experimental results demonstrate the potential of using self-supervised learning to improve the generalizability of models in medical settings with limited amounts of heterogeneous data.
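To make the self-supervised pre-training stage more concrete, the following is a minimal sketch of the standard Barlow Twins objective, which pushes the cross-correlation matrix of two augmented views' embeddings toward the identity. It is not the authors' adjusted variant, whose details are not given here. The function name `barlow_twins_loss`, the off-diagonal weight `lambda_offdiag` (set to the default reported in the original Barlow Twins work), and the toy random embeddings are assumptions made for illustration; in practice the embeddings would come from the ResNet50 backbone plus a projector head.

```python
import numpy as np


def barlow_twins_loss(z_a, z_b, lambda_offdiag=5e-3, eps=1e-12):
    """Standard Barlow Twins loss for two batches of embeddings.

    z_a, z_b: arrays of shape (batch, dim), embeddings of two augmented
    views of the same images. lambda_offdiag is an assumed default, not
    a value taken from the BarlowTwins-CXR paper.
    """
    if z_a.shape != z_b.shape or z_a.ndim != 2:
        raise ValueError("z_a and z_b must both have shape (batch, dim)")
    n = z_a.shape[0]

    # Standardize every embedding dimension across the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)

    # Cross-correlation matrix between the two views, shape (dim, dim).
    c = z_a.T @ z_b / n

    diag = np.diag(c)
    # Invariance term: diagonal entries should be close to 1.
    on_diag = np.sum((diag - 1.0) ** 2)
    # Redundancy-reduction term: off-diagonal entries should be close to 0.
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)

    return float(on_diag + lambda_offdiag * off_diag)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_a = rng.normal(size=(256, 8))
    # A lightly perturbed copy stands in for a second augmented view.
    view_b = view_a + 0.05 * rng.normal(size=(256, 8))
    unrelated = rng.normal(size=(256, 8))

    print("matched views:  ", barlow_twins_loss(view_a, view_b))
    print("unrelated views:", barlow_twins_loss(view_a, unrelated))
```

Embeddings of two views of the same images should give a much lower loss than embeddings of unrelated inputs, which is the signal that drives the backbone to learn augmentation-invariant, non-redundant features before the supervised fine-tuning stage.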
