MedSegDiffNCA: Diffusion Models With Neural Cellular Automata for Skin Lesion Segmentation
Avni Mittal, John Kalkhof, Anirban Mukhopadhyay, Arnav Bhavsar
TL;DR
This paper addresses the heavy parameter burden of diffusion models in medical image segmentation by introducing NCA-based backbones. It proposes three architectures (Multi-MedSegDiffNCA, CBAM-MedSegDiffNCA, and MultiCBAM-MedSegDiffNCA) coupled with an RGB channel loss that provides semantic guidance during diffusion. On skin lesion segmentation, MultiCBAM-MedSegDiffNCA performs on par with UNet-based diffusion models, with a Dice score of about $87.84\%$ and an IoU of about $78.86\%$, while using only about $4.12\times10^{5}$ parameters, i.e., 60–110× fewer than the corresponding UNet-based models. The result is an efficient, scalable solution suited to low-resource clinical environments, with a smaller memory footprint and shorter training time at comparable segmentation quality.
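For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Dice and IoU are commonly computed for a single binary mask. The function name `dice_and_iou`, the flat 0/1 list representation, and the `eps` smoothing term are illustrative assumptions; the paper's own evaluation code may differ, for example in how scores are averaged over images.

```python
def dice_and_iou(pred, target, eps=1e-8):
    """Compute Dice and IoU for two flat sequences of 0/1 pixel labels.

    Hypothetical helper for illustration only; `eps` avoids division by
    zero when both masks are empty.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    # Pixels labelled as lesion in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection

    dice = (2 * intersection + eps) / (pred_area + target_area + eps)
    iou = (intersection + eps) / (union + eps)
    return dice, iou


if __name__ == "__main__":
    # Toy 2x2 masks flattened row by row.
    predicted = [1, 1, 1, 0]
    reference = [1, 1, 0, 0]
    d, j = dice_and_iou(predicted, reference)
    print(f"Dice = {d:.4f}, IoU = {j:.4f}")
```

Dice weights the overlap twice relative to the two mask areas, so for the same prediction it is never lower than IoU, which is consistent with the reported Dice exceeding the reported IoU.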
Abstract
Denoising Diffusion Models (DDMs) are widely used for high-quality image generation and medical image segmentation but often rely on UNet-based architectures, leading to high computational overhead, especially with high-resolution images. This work proposes three improvements to diffusion-based medical image segmentation built on Neural Cellular Automata (NCA). First, Multi-MedSegDiffNCA uses a multilevel NCA framework to refine rough noise estimates generated by lower-level NCA models. Second, CBAM-MedSegDiffNCA incorporates channel and spatial attention for improved segmentation. Third, MultiCBAM-MedSegDiffNCA combines these methods with a new RGB channel loss for semantic guidance. Evaluations on lesion segmentation show that MultiCBAM-MedSegDiffNCA matches the performance of UNet-based models, with a Dice score of 87.84%, while using 60-110 times fewer parameters, offering a more efficient solution for low-resource medical settings.
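To give a sense of why an NCA backbone can be so small, here is a minimal sketch of one generic NCA update step in NumPy. It is not the architecture from the paper: the fixed Sobel perception, the two-layer per-cell update, the stochastic `fire_rate` mask, and all sizes and names (`perceive`, `nca_step`, `channels`, `hidden`) are assumptions chosen for illustration. The point is that the same small per-cell rule is applied at every pixel, so the parameter count does not depend on image resolution.

```python
import numpy as np


def perceive(state):
    """Concatenate each cell's state with Sobel-x and Sobel-y responses.

    state: array of shape (H, W, C). Returns an array of shape (H, W, 3C).
    Zero padding is used at the borders (an illustrative choice).
    """
    sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float) / 8.0
    sobel_y = sobel_x.T
    h, w, _ = state.shape
    padded = np.pad(state, ((1, 1), (1, 1), (0, 0)), mode="constant")
    grad_x = np.zeros_like(state, dtype=float)
    grad_y = np.zeros_like(state, dtype=float)
    # Accumulate the 3x3 neighbourhood, one kernel tap at a time.
    for dy in range(3):
        for dx in range(3):
            window = padded[dy:dy + h, dx:dx + w, :]
            grad_x += sobel_x[dy, dx] * window
            grad_y += sobel_y[dy, dx] * window
    return np.concatenate([state, grad_x, grad_y], axis=-1)


def nca_step(state, w1, b1, w2, rng, fire_rate=0.5):
    """Apply one residual NCA update with a shared per-cell two-layer rule."""
    perception = perceive(state)                   # (H, W, 3C)
    hidden = np.maximum(perception @ w1 + b1, 0)   # ReLU, (H, W, hidden)
    delta = hidden @ w2                            # (H, W, C)
    # Each cell updates independently with probability fire_rate.
    mask = rng.random(state.shape[:2] + (1,)) < fire_rate
    return state + delta * mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    channels, hidden = 16, 64                      # hypothetical sizes
    w1 = rng.normal(0.0, 0.1, (3 * channels, hidden))
    b1 = np.zeros(hidden)
    w2 = np.zeros((hidden, channels))              # zero init: first steps are identity
    state = rng.normal(size=(32, 32, channels))
    for _ in range(4):
        state = nca_step(state, w1, b1, w2, rng)
    n_params = w1.size + b1.size + w2.size
    print("state shape:", state.shape, "| parameters in update rule:", n_params)
```

In a diffusion setting, a rule like this would be iterated for several steps on a state that includes the noisy mask and the conditioning image, with some output channels read off as the noise estimate; how the paper wires this up across levels and attention modules is not shown here.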
