Fast Multi-Organ Fine Segmentation in CT Images with Hierarchical Sparse Sampling and Residual Transformer
Xueqi Guo, Halid Ziya Yerebakan, Yoshihisa Shinagawa, Kritika Iyer, Gerardo Hermosillo Valadez
TL;DR
The paper addresses fast, voxel-level multi-organ CT segmentation with a framework that combines hierarchical sparse sampling with a Residual Transformer. The approach builds multi-resolution sparse descriptors and decodes them through Transformer-based tokens, so that a full-volume segmentation can be reconstructed from sparse queries on CPU. Results on an internal dataset and the public TotalSegmentator dataset show improved segmentation performance over the current fast organ classifier, with CPU inference of about 2.24 seconds per volume, approaching real-time operation. The method could support real-time clinical workflows such as scan registration, lesion detection, and landmarking without relying on GPU acceleration.
Abstract
Multi-organ segmentation of 3D medical images is a fundamental task with applications in various clinical automation pipelines. Although deep learning has achieved superior performance, the time and memory cost of segmenting an entire 3D volume voxel by voxel with neural networks can be prohibitive. Classifiers have been developed as an alternative for cases where only certain points of interest are needed, but the trade-off between speed and accuracy remains an issue. We therefore propose a novel fast multi-organ segmentation framework based on hierarchical sparse sampling and a Residual Transformer. Compared with whole-volume analysis, the hierarchical sparse sampling strategy reduces computation time while preserving meaningful hierarchical context across multiple resolution levels. The Residual Transformer segmentation network extracts and combines information from the different levels of the sparse descriptor while maintaining a low computational cost. On an internal dataset of 10,253 CT images and the public TotalSegmentator dataset, the proposed method improved qualitative and quantitative segmentation performance over the current fast organ classifier, with inference times of about 2.24 seconds on CPU hardware. These results suggest the potential for real-time fine organ segmentation.
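To make the idea of a multi-resolution sparse descriptor concrete, the following is a minimal sketch, not the authors' implementation. It assumes a descriptor can be formed by reading voxel intensities at one fixed offset pattern around a query point and repeating that pattern at several spacings, so that coarser levels cover a wider context with the same number of samples. The function name `sparse_descriptor`, the 3x3x3 offset grid, and the spacings `(1, 4, 16)` are all hypothetical choices made for illustration; the abstract does not specify the actual sampling pattern.

```python
import numpy as np

def sparse_descriptor(volume, center, offsets, spacings=(1, 4, 16)):
    """Sample a fixed offset pattern around `center` at several spacings.

    Hypothetical illustration only: the pattern and spacings are assumptions.
    Returns an array of shape (num_levels, num_offsets).
    """
    center = np.asarray(center)
    upper = np.asarray(volume.shape) - 1
    levels = []
    for spacing in spacings:
        # Scale the same pattern by the level spacing: coarse levels reach farther.
        coords = center + offsets * spacing
        # Clamp coordinates that fall outside the volume to the nearest border voxel.
        coords = np.clip(coords, 0, upper)
        levels.append(volume[coords[:, 0], coords[:, 1], coords[:, 2]])
    return np.stack(levels)

# A 3x3x3 grid of integer offsets, shape (27, 3); row 13 is the zero offset.
axis = [-1, 0, 1]
grid = np.array(np.meshgrid(axis, axis, axis, indexing="ij")).reshape(3, -1).T

# Synthetic volume standing in for a CT scan.
rng = np.random.default_rng(0)
volume = rng.normal(size=(64, 64, 64)).astype(np.float32)

desc = sparse_descriptor(volume, (32, 32, 32), grid)
print(desc.shape)  # one row of 27 samples per resolution level
```

In this sketch each query point costs 81 voxel reads regardless of volume size, which illustrates why sparse sampling can be far cheaper than processing the whole volume. How the levels are turned into tokens and decoded is left to the Residual Transformer described in the paper and is not modeled here.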
