Implantable Adaptive Cells: A Novel Enhancement for Pre-Trained U-Nets in Medical Image Segmentation
Emil Benedykciuk, Marcin Denkowski, Grzegorz Wójcik
TL;DR
The paper addresses the challenge of upgrading pre-trained medical image segmentation models without full retraining by introducing Implantable Adaptive Cells (IACs), which are embedded into U-Net skip connections. Using a differentiable NAS framework inspired by PC-DARTS, the authors learn compact DAG modules that enhance feature fusion while keeping the encoder–decoder weights frozen, then discretize the searched cells and retrain with them in place. Experiments on four datasets (ACDC, BraTS, KiTS, AMOS) across multiple backbones show consistent Dice score improvements, with average gains of around 6.3 percentage points and up to 11 points in some cases, at a lower overall computational cost than full NAS. The findings suggest that targeted NAS within skip connections can effectively refine pre-trained architectures and may generalize to other models and modalities, offering a practical path for model enhancement in data- and resource-constrained clinical settings.
Abstract
This paper introduces a novel approach to enhancing the performance of pre-trained neural networks in medical image segmentation using gradient-based Neural Architecture Search (NAS) methods. We present the concept of Implantable Adaptive Cells (IACs): small modules, identified through an approach based on Partially-Connected DARTS (PC-DARTS), that are designed to be injected into the skip connections of an existing, already trained U-shaped model. Unlike traditional NAS methods, our approach refines existing architectures without full retraining. Experiments on four medical datasets with MRI and CT images show consistent accuracy improvements across various U-Net configurations: segmentation accuracy improves by approximately 5 percentage points across all validation datasets, and by up to 11 percentage points in the best-performing cases. These findings not only offer a cost-effective alternative to a complete overhaul of complex models for performance upgrades but also indicate the potential applicability of our method to other architectures and problem domains.
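The core mechanism described above, a searchable cell placed on a skip connection, can be illustrated with a minimal sketch. It is not the authors' implementation: the candidate operations, the 1-D feature vector, and all names (`CANDIDATES`, `mixed_op`, `discretize`, `skip_with_cell`) are hypothetical stand-ins chosen for illustration. It shows only the DARTS-style continuous relaxation (a softmax-weighted sum over candidate operations) and the later discretization step; the partial channel sampling of PC-DARTS, the gradient updates of the architecture parameters, and the retraining phase are omitted.

```python
import math

# Hypothetical candidate operations on a 1-D feature vector. A real search
# space would use convolution, pooling, and identity ops on feature maps.
def identity(x):
    return list(x)

def zero(x):
    return [0.0 for _ in x]

def avg_pool3(x):
    # 3-tap average with edge replication; output length equals input length.
    n = len(x)
    return [(x[max(i - 1, 0)] + x[i] + x[min(i + 1, n - 1)]) / 3.0
            for i in range(n)]

CANDIDATES = {"identity": identity, "zero": zero, "avg_pool3": avg_pool3}

def softmax(values):
    # Numerically stable softmax over a list of floats.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def mixed_op(x, alphas):
    """Search phase: softmax-weighted sum of all candidate outputs."""
    weights = softmax([alphas[name] for name in CANDIDATES])
    outputs = [op(x) for op in CANDIDATES.values()]
    return [sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(len(x))]

def discretize(alphas):
    """After search: keep only the candidate with the largest alpha."""
    return max(CANDIDATES, key=lambda name: alphas[name])

def skip_with_cell(encoder_feat, alphas, searching=True):
    """Skip connection with an implanted cell.

    encoder_feat stands for features produced by the frozen encoder; only
    the cell between encoder and decoder changes in this sketch.
    """
    if searching:
        return mixed_op(encoder_feat, alphas)
    return CANDIDATES[discretize(alphas)](encoder_feat)
```

With equal architecture parameters every candidate contributes one third of the output during search; once one parameter dominates, discretization replaces the weighted mixture with that single operation, which is the form the cell would take when the network is retrained.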
