Deep Rib Fracture Instance Segmentation and Classification from CT on the RibFrac Challenge
Jiancheng Yang, Rui Shi, Liang Jin, Xiaoyang Huang, Kaiming Kuang, Donglai Wei, Shixuan Gu, Jianying Liu, Pengfei Liu, Zhizhong Chai, Yongjie Xiao, Hao Chen, Liming Xu, Bang Du, Xiangyi Yan, Hao Tang, Adam Alessio, Gregory Holste, Jiapeng Zhang, Xiaoming Wang, Jianye He, Lixuan Che, Hanspeter Pfister, Ming Li, Bingbing Ni
TL;DR
The paper presents the RibFrac Challenge, the first large-scale, publicly available benchmark for 3D rib fracture detection and diagnosis from CT, with over 5,000 fractures in 660 scans and voxel-level masks across four fracture classes. It introduces FracNet+, an internal baseline that fuses point-based rib segmentation with voxel-based fracture segmentation and is enhanced by large-scale pretrained models; it achieves competitive detection performance and illustrates the value of rib segmentation. Across the detection and classification tracks, the study shows that AI methods can approach or surpass human detection performance, yet classification remains clinically challenging because of diagnostic ambiguity, class imbalance, and geometric complexity. The authors provide extensive analyses, post-challenge follow-ups, and internal experiments that highlight the role of segmentation, segmentation-to-detection coupling, and pretrained networks in advancing AI-assisted rib fracture diagnosis. The work lays a foundation for future unified, multicenter approaches that integrate rib labeling, centerlines, and fracture classification to move toward clinically actionable AI tools.
Abstract
Rib fractures are a common and potentially severe injury, and detecting them in CT scans can be challenging and labor-intensive. Although there have been efforts in this area, the lack of large-scale annotated datasets and evaluation benchmarks has hindered the development and validation of deep learning algorithms. To address this issue, the RibFrac Challenge was introduced, providing a benchmark dataset of over 5,000 rib fractures from 660 CT scans, with voxel-level instance mask annotations and diagnosis labels for four clinical categories (buckle, nondisplaced, displaced, or segmental). The challenge includes two tracks: a detection (instance segmentation) track evaluated by an FROC-style metric and a classification track evaluated by an F1-style metric. During the MICCAI 2020 challenge period, 243 results were evaluated, and seven teams were invited to participate in the challenge summary. The analysis revealed that several top rib fracture detection solutions achieved performance comparable to, or even better than, that of human experts. Nevertheless, current rib fracture classification solutions are hardly clinically applicable, which makes classification an interesting direction for future work. As an active benchmark and research resource, the data and online evaluation of the RibFrac Challenge are available at the challenge website. As an independent contribution, we have also extended our previous internal baseline by incorporating recent advancements in large-scale pretrained networks and point-based rib segmentation techniques. The resulting FracNet+ demonstrates competitive performance in rib fracture detection, laying a foundation for further research and development in AI-assisted rib fracture detection and diagnosis.
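The abstract names the two metric families (FROC-style for detection, F1-style for classification) without defining them. The following is a minimal sketch of what such metrics typically compute, not the challenge's official evaluation code. The function names, the false-positive levels, the use of macro averaging, and the assumption that each prediction has already been matched to at most one ground-truth fracture are all illustrative choices made here.

```python
from typing import List, Sequence, Tuple


def froc_sensitivities(
    candidates: Sequence[Tuple[float, bool]],
    num_fractures: int,
    num_scans: int,
    fp_levels: Sequence[float] = (0.5, 1.0, 2.0, 4.0, 8.0),  # assumed levels
) -> List[float]:
    """Sensitivity at each average-false-positives-per-scan level.

    `candidates` holds (confidence, is_hit) pairs pooled over all scans.
    `is_hit` is True when the prediction was matched to a distinct
    ground-truth fracture; the matching rule itself is outside this sketch.
    """
    # Lowering the confidence threshold admits candidates in this order.
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
    sens = [0.0] * len(fp_levels)
    tp = fp = 0
    for _, is_hit in ranked:
        if is_hit:
            tp += 1
        else:
            fp += 1
        # Record the best sensitivity reachable within each FP budget.
        for i, level in enumerate(fp_levels):
            if fp / num_scans <= level:
                sens[i] = tp / num_fractures
    return sens


def macro_f1(
    y_true: Sequence[str], y_pred: Sequence[str], classes: Sequence[str]
) -> float:
    """Unweighted mean of per-class F1; a class with no support scores 0."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data, invented purely for illustration.
toy_candidates = [(0.9, True), (0.8, False), (0.7, True), (0.6, False), (0.5, True)]
toy_sens = froc_sensitivities(
    toy_candidates, num_fractures=4, num_scans=2, fp_levels=(0.5, 1.0)
)

CLASSES = ["buckle", "nondisplaced", "displaced", "segmental"]
toy_true = ["buckle", "buckle", "displaced", "segmental"]
toy_pred = ["buckle", "displaced", "displaced", "segmental"]
toy_f1 = macro_f1(toy_true, toy_pred, CLASSES)
```

A single detection score can then be obtained by averaging the returned sensitivities. Macro averaging weights all four classes equally, which makes a metric of this kind sensitive to the class imbalance noted above.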
