Learnable Weight Initialization for Volumetric Medical Image Segmentation
Shahina Kunhimon, Abdelrahman Shaker, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan
TL;DR
The paper tackles data scarcity and initialization-induced variance in hybrid volumetric medical image segmentation. It proposes a data-dependent weight initialization that is learned in two steps. Step 1 is self-supervised pretraining using a Transformation Module that performs depth-wise rearrangement, sub-volume partitioning, shuffling, and masking to induce structural and contextual priors. Step 2 is supervised segmentation training initialized with the weights learned in Step 1. Experiments on Synapse multi-organ CT and MSD Lung show consistent, statistically significant Dice improvements when state-of-the-art networks use the data-dependent initialization. The approach yields competitive results with less data and avoids external datasets, offering a practical, architecture-agnostic enhancement for volumetric segmentation.
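To make the Step 1 transformations more concrete, the following is a minimal NumPy sketch of partitioning a volume into sub-volumes along the depth axis, shuffling them, and masking random patches. It is an illustration under stated assumptions, not the authors' implementation: the function name `transform_volume`, its parameters (`num_subvolumes`, `mask_ratio`, `patch`), and the choice of cubic zero-filled patches are all hypothetical.

```python
import numpy as np


def transform_volume(volume, num_subvolumes=4, mask_ratio=0.4, patch=4, seed=0):
    """Hypothetical sketch: shuffle depth-wise sub-volumes, then mask patches.

    volume is a (depth, height, width) array. All parameter names and
    default values are assumptions made for illustration only.
    """
    rng = np.random.default_rng(seed)
    depth, height, width = volume.shape
    if depth % num_subvolumes != 0:
        raise ValueError("depth must be divisible by num_subvolumes")
    if depth % patch or height % patch or width % patch:
        raise ValueError("each dimension must be divisible by patch")

    # Partition the volume into equal sub-volumes along the depth axis.
    subvolumes = np.split(volume, num_subvolumes, axis=0)

    # Shuffle the sub-volumes; the permutation could serve as a
    # self-supervised target (e.g. predicting the original order).
    order = rng.permutation(num_subvolumes)
    shuffled = np.concatenate([subvolumes[i] for i in order], axis=0)

    # Draw a coarse visibility grid and upsample it to voxel resolution,
    # so that whole patch x patch x patch cubes are either kept or masked.
    grid_shape = (depth // patch, height // patch, width // patch)
    keep = rng.random(grid_shape) >= mask_ratio
    visible = keep.repeat(patch, axis=0).repeat(patch, axis=1).repeat(patch, axis=2)

    # Zero out the masked voxels (zero-filling is an assumption).
    masked = np.where(visible, shuffled, 0)
    return masked, order, visible


# Usage on a synthetic volume standing in for a CT crop.
volume = np.arange(16 * 8 * 8, dtype=np.float32).reshape(16, 8, 8)
masked, order, visible = transform_volume(volume)
print(masked.shape, order)
```

A pretext task built on such a transform might ask the network to recover the sub-volume order and reconstruct the masked voxels, which is one plausible way to induce the structural and contextual priors mentioned above.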
Abstract
Hybrid volumetric medical image segmentation models, combining the advantages of local convolution and global attention, have recently received considerable attention. While mainly focusing on architectural modifications, most existing hybrid approaches still use conventional data-independent weight initialization schemes, which restrict their performance by ignoring the inherent volumetric nature of the medical data. To address this issue, we propose a learnable weight initialization approach that utilizes the available medical training data to effectively learn the contextual and structural cues via the proposed self-supervised objectives. Our approach is easy to integrate into any hybrid model and requires no external training data. Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach, leading to state-of-the-art segmentation performance. Our proposed data-dependent initialization approach performs favorably compared to the Swin-UNETR model pretrained using large-scale datasets on the multi-organ segmentation task. Our source code and models are available at: https://github.com/ShahinaKK/LWI-VMS.
