MedConv: Convolutions Beat Transformers on Long-Tailed Bone Density Prediction
Xuyin Qi, Zeyu Zhang, Huazhan Zheng, Mingxi Chen, Numan Kutaiba, Ruth Lim, Cherie Chiang, Zi En Tham, Xuan Ren, Wenxin Zhang, Lei Zhang, Hao Zhang, Wenbing Lv, Guangzhen Yao, Renda Han, Kangsheng Wang, Mingyuan Li, Hongtao Mao, Yu Li, Zhibin Liao, Yang Zhao, Minh-Son To
TL;DR
MedConv addresses CT-based bone density prediction under long-tailed class distributions by replacing transformer architectures with a computationally efficient 3D CNN backbone (3D ResNet-50). It couples Balanced Cross-Entropy (Bal-CE) loss with post-hoc logit adjustment to improve minority-class performance and probability calibration, and it relies on high-quality TotalSegmentator segmentation. On the AustinSpine dataset, MedConv achieves improvements of up to 21% in accuracy and 20% in ROC AUC over prior methods, and it outperforms transformer baselines in accuracy, sensitivity, and specificity. The findings point to practical potential for clinical deployment and identify segmentation quality and careful hyperparameter tuning as essential for robust, efficient CT-based bone-density assessment.
Abstract
Predicting bone density from CT scans to estimate T-scores is crucial because it provides a more precise assessment of bone health than traditional methods such as X-ray bone density tests, which lack spatial resolution and cannot detect localized changes. However, CT-based prediction faces two major challenges: the high computational complexity of transformer-based architectures, which limits their deployment in portable and clinical settings, and the imbalanced, long-tailed distribution of real-world hospital data, which skews predictions. To address these issues, we introduce MedConv, a convolutional model for bone density prediction that outperforms transformer models with lower computational demands. We also adapt Bal-CE loss and post-hoc logit adjustment to improve class balance. Extensive experiments on our AustinSpine dataset show that our approach achieves up to 21% improvement in accuracy and 20% in ROC AUC over previous state-of-the-art methods.
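The two class-balancing techniques named above can be illustrated with a minimal sketch. It assumes that Bal-CE refers to the balanced-softmax form of cross-entropy (training logits shifted by the log class prior) and that post-hoc logit adjustment subtracts a scaled log prior at inference time. The function names, the class counts, and the scaling parameter `tau` are hypothetical and chosen for illustration only; the abstract does not specify how MedConv adapts or combines the two, so they are shown here independently.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def balanced_ce(logits, label, class_counts):
    """Balanced-softmax cross-entropy (assumed form of Bal-CE).

    Adds the log class prior to each logit before the softmax, so that
    frequent classes do not dominate the loss.
    """
    total = sum(class_counts)
    shifted = [z + math.log(n / total) for z, n in zip(logits, class_counts)]
    return -log_softmax(shifted)[label]

def logit_adjusted_prediction(logits, class_counts, tau=1.0):
    """Post-hoc logit adjustment applied at inference time.

    Subtracts tau * log(prior) from each logit, which raises the scores
    of rare classes; tau = 0 recovers the plain argmax.
    """
    total = sum(class_counts)
    adjusted = [z - tau * math.log(n / total) for z, n in zip(logits, class_counts)]
    return max(range(len(adjusted)), key=adjusted.__getitem__)

# Hypothetical long-tailed setting with three classes.
counts = [900, 90, 10]
logits = [2.0, 1.5, 1.0]

loss = balanced_ce(logits, label=2, class_counts=counts)
plain = logit_adjusted_prediction(logits, counts, tau=0.0)
adjusted = logit_adjusted_prediction(logits, counts, tau=1.0)
```

In this toy setting the unadjusted prediction is the majority class (index 0), while the adjusted prediction moves to the rarest class (index 2). In practice `tau` would be tuned on a validation set.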
