SACNet: A Spatially Adaptive Convolution Network for 2D Multi-organ Medical Segmentation
Lin Zhang, Wenbo Gao, Jie Yi, Yunyun Yang
TL;DR
This work tackles variability across target organs and interference from complex backgrounds in multi-organ segmentation by introducing SACNet, an architecture built around the Adaptive Receptive Field Module (ARFM), which combines Deformable Convolution V3 (DCNv3) with transformer-inspired designs. A WideNet strategy shares DCNv3 weights between the encoder and decoder, making the network wider rather than deeper and improving parameter efficiency. The Continuity Dynamic Adjustment Loss (CTLoss) combines an adaptive t-vMF Dice loss with cross-entropy and updates its concentration parameter from the IoU after each epoch, balancing easy and hard classes during training. Evaluated on the Synapse dataset, with additional 3D slice assessments on ACDC, SACNet achieves state-of-the-art multi-organ segmentation performance, and ablations confirm that ARFM, WideNet, and CTLoss each contribute to the gains.
Abstract
Multi-organ segmentation in medical image analysis is crucial for diagnosis and treatment planning. However, several factors complicate the task, including variability across target categories and interference from complex backgrounds. In this paper, we draw on Deformable Convolution V3 (DCNv3) and multi-object segmentation to optimize our Spatially Adaptive Convolution Network (SACNet) in three aspects: feature extraction, model architecture, and loss constraint, which together enhance the perception of different segmentation targets. First, we propose the Adaptive Receptive Field Module (ARFM), which combines DCNv3 with a series of customized block-level and architecture-level designs similar to those of transformers. This module captures the distinctive features of different organs by adaptively adjusting its receptive field to each target. Second, we use ARFM as the building block of SACNet's encoder-decoder and partially share parameters between the encoder and decoder, making the network wider rather than deeper. This design yields a shared, lightweight decoder and a more parameter-efficient and effective framework. Lastly, we propose a novel continuity dynamic adjustment loss function, based on t-vMF Dice loss and cross-entropy loss, to better balance easy and hard classes in segmentation. Experiments on 3D slice datasets from ACDC and Synapse demonstrate that SACNet outperforms several existing methods on multi-organ segmentation tasks.
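To make the loss description concrete, the following is a minimal NumPy sketch of a t-vMF Dice term combined with cross-entropy, with a per-class concentration parameter refreshed from IoU after each epoch. It is not the authors' implementation. The t-vMF similarity uses the commonly published form, while the update rule `kappa = lam * IoU`, the mixing weight `alpha`, the scale `lam`, and all function names are assumptions made purely for illustration.

```python
import numpy as np


def t_vmf_similarity(cos_theta, kappa):
    """t-vMF similarity; reduces to plain cosine similarity when kappa == 0."""
    return (1.0 + cos_theta) / (1.0 + kappa * (1.0 - cos_theta)) - 1.0


def t_vmf_dice_loss(probs, onehot, kappa, eps=1e-8):
    """Mean over classes of (1 - t-vMF similarity)^2.

    probs, onehot: arrays of shape (C, N), i.e. classes x flattened pixels.
    kappa: array of shape (C,), one concentration value per class.
    """
    num = (probs * onehot).sum(axis=1)
    den = np.linalg.norm(probs, axis=1) * np.linalg.norm(onehot, axis=1) + eps
    cos_theta = num / den  # per-class cosine between prediction and target
    sim = t_vmf_similarity(cos_theta, kappa)
    return float(np.mean((1.0 - sim) ** 2))


def cross_entropy(probs, onehot, eps=1e-8):
    """Pixel-averaged cross-entropy for (C, N) probabilities and one-hot labels."""
    return float(-np.mean(np.sum(onehot * np.log(probs + eps), axis=0)))


def update_kappa(iou_per_class, lam=32.0):
    """Assumed update rule (hypothetical): kappa_c = lam * IoU_c, once per epoch."""
    return lam * np.asarray(iou_per_class, dtype=float)


def combined_loss(probs, onehot, kappa, alpha=0.5):
    """Assumed weighting (hypothetical): alpha * t-vMF Dice + (1 - alpha) * CE."""
    dice_term = t_vmf_dice_loss(probs, onehot, kappa)
    ce_term = cross_entropy(probs, onehot)
    return alpha * dice_term + (1.0 - alpha) * ce_term


if __name__ == "__main__":
    # Two classes, four pixels; the values are toy data, not results.
    target = np.array([[1, 1, 0, 0], [0, 0, 1, 1]], dtype=float)
    pred = np.array([[0.9, 0.6, 0.2, 0.1], [0.1, 0.4, 0.8, 0.9]])
    kappa = update_kappa([0.5, 0.25])  # IoU values would come from validation
    print(combined_loss(pred, target, kappa))
```

In this sketch a class with a higher IoU receives a larger `kappa`, which tightens the similarity and penalizes its remaining errors more strongly. Whether SACNet scales the concentration parameter in exactly this way is not stated in the abstract.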
