SpineCLUE: Automatic Vertebrae Identification Using Contrastive Learning and Uncertainty Estimation
Sheng Zhang, Minheng Chen, Junxian Wu, Ziyue Zhang, Tonglong Li, Cheng Xue, Youyong Kong
TL;DR
SpineCLUE tackles vertebrae identification from CT scans with arbitrary fields of view by decomposing the problem into localization, segmentation, and identification at the vertebra level. It introduces dual-factor density clustering to robustly locate vertebrae centers, supervised contrastive learning to address inter-class similarity and intra-class variability, and an uncertainty-guided fusion mechanism to refine sequence predictions. The method achieves state-of-the-art ID-rate on VerSe19 and VerSe20 benchmarks and demonstrates strong generalization on an abnormal spine dataset with scoliosis and metal implants. The proposed framework offers a robust, scalable solution for clinical spine analysis under diverse imaging conditions.
Abstract
Vertebrae identification in arbitrary fields of view plays a crucial role in diagnosing spine disease. Most spine CT scans contain only local regions, such as the neck, chest, or abdomen. Identification therefore should not depend on specific vertebrae, or a particular number of vertebrae, being visible. Existing spine-level methods are unable to meet this challenge. In this paper, we propose a three-stage method that addresses 3D CT vertebrae identification at the vertebra level. By sequentially performing vertebrae localization, segmentation, and identification, the method makes effective use of anatomical prior information about the vertebrae throughout the process. Specifically, we introduce a dual-factor density clustering algorithm to acquire localization information for each individual vertebra, thereby facilitating the subsequent segmentation and identification stages. In addition, to tackle inter-class similarity and intra-class variability, we pre-train our identification network with a supervised contrastive learning method. To further refine the identification results, we estimate the uncertainty of the classification network and use a message fusion module to combine the uncertainty scores while aggregating global information about the spine. Our method achieves state-of-the-art results on the VerSe19 and VerSe20 challenge benchmarks. Additionally, our approach demonstrates outstanding generalization performance on a collected dataset containing a wide range of abnormal cases.
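As an illustration of the supervised contrastive pre-training idea mentioned above, the following is a minimal NumPy sketch of a generic supervised contrastive loss over a batch of embeddings. It is not the authors' implementation: the function name `supervised_contrastive_loss`, the `temperature` default, and the toy embeddings are assumptions made purely for illustration, and the abstract does not specify the exact loss used. The sketch only shows how embeddings sharing a vertebra label are pulled together while embeddings with different labels are pushed apart.

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative sketch, not the paper's code).

    embeddings: (n, d) array of non-zero feature vectors.
    labels:     (n,) array of class labels (e.g. vertebra classes).
    temperature: hypothetical default; the abstract does not state a value.
    """
    z = np.asarray(embeddings, dtype=np.float64)
    y = np.asarray(labels)
    # L2-normalise so dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    n = z.shape[0]
    logits = z @ z.T / temperature
    not_self = ~np.eye(n, dtype=bool)
    # Subtract the row maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    # Softmax denominator runs over every sample except the anchor itself.
    exp_logits = np.exp(logits) * not_self
    log_prob = logits - np.log(exp_logits.sum(axis=1, keepdims=True))
    # Positives are the other samples that share the anchor's label.
    positives = (y[:, None] == y[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    valid = pos_count > 0  # anchors with no positive are skipped
    if not valid.any():
        return 0.0
    mean_log_prob_pos = (log_prob * positives).sum(axis=1)[valid] / pos_count[valid]
    return float(-mean_log_prob_pos.mean())

# Toy example: two tight clusters in 2D.
toy_embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_matching = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mismatched = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy setting the loss is lower when the labels agree with the cluster structure than when they cut across it, which is the behaviour a contrastive pre-training stage relies on to separate visually similar classes.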
