Boosting Medical Image Segmentation Performance with Adaptive Convolution Layer

Seyed M. R. Modaresi, Aomar Osmani, Mohammadreza Razzazi, Abdelghani Chibani

TL;DR

This work tackles the challenge of fixed kernel sizes in medical image segmentation by inserting an adaptive convolution layer ahead of state-of-the-art models like UCTransNet. The layer uses a coefficient generator to assemble per-pixel kernels from a fixed set of Fourier-Bessel bases, enabling dynamic receptive fields with a modest parameter increase. Empirical results on SegPC2021 and ISIC2018 show consistent improvements in Accuracy, Dice, and IoU across multiple architectures, underscoring the method's robustness to diverse anatomical structures and textures. The approach offers a practical path to enhanced segmentation performance with minimal architectural disruption, facilitating broader adoption in clinical imaging pipelines.
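The per-pixel kernel assembly described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: it handles a single channel, uses only two basis sizes (3 and 5), hardcodes a few tabulated Bessel zeros, and replaces the learned coefficient generator with a hypothetical random linear map (`toy_coefficient_generator`). All function names here are made up for illustration.

```python
import numpy as np

# First two positive zeros of J_0 and J_1 (standard tabulated values).
BESSEL_ZEROS = {0: (2.4048, 5.5201), 1: (3.8317, 7.0156)}

def bessel_j(m, x, n=256):
    """J_m(x) from Bessel's integral (1/pi) * int_0^pi cos(m*t - x*sin t) dt,
    evaluated with the midpoint rule."""
    t = (np.arange(n) + 0.5) * np.pi / n
    return np.cos(m * t - np.asarray(x)[..., None] * np.sin(t)).mean(axis=-1)

def fourier_bessel_bases(size, max_size):
    """Fixed bases on a size x size disk, zero-padded to max_size x max_size."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    r = np.hypot(xs, ys) / (half + 0.5)   # radius normalised to the disk
    theta = np.arctan2(ys, xs)
    inside = r <= 1.0
    bases = []
    for m, zeros in BESSEL_ZEROS.items():
        for alpha in zeros:
            radial = bessel_j(m, alpha * r) * inside
            bases.append(radial * np.cos(m * theta))
            if m > 0:                      # sin(0 * theta) is identically zero
                bases.append(radial * np.sin(m * theta))
    pad = (max_size - size) // 2
    return np.stack([np.pad(b, pad) for b in bases])

def toy_coefficient_generator(image, num_bases, rng):
    """Hypothetical stand-in for the learned generator: a random linear map
    of the per-pixel features [intensity, 1]."""
    feats = np.stack([image, np.ones_like(image)], axis=-1)   # (H, W, 2)
    weights = rng.normal(scale=0.1, size=(2, num_bases))
    return feats @ weights                                    # (H, W, K)

def adaptive_conv(image, bases, coeffs):
    """image: (H, W), bases: (K, k, k), coeffs: (H, W, K) -> output (H, W)."""
    k = bases.shape[-1]
    padded = np.pad(image, k // 2, mode="reflect")
    patches = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    # Each pixel gets its own kernel: a weighted sum of the fixed bases.
    kernels = np.einsum("hwk,kij->hwij", coeffs, bases)
    return np.einsum("hwij,hwij->hw", patches, kernels)

rng = np.random.default_rng(0)
image = rng.random((16, 16))
bases = np.concatenate([fourier_bessel_bases(s, 5) for s in (3, 5)])
coeffs = toy_coefficient_generator(image, len(bases), rng)
out = adaptive_conv(image, bases, coeffs)
print(out.shape)  # (16, 16)
```

Because the bases are fixed, the only trainable part in this sketch is the coefficient map, which is one plausible reading of why the parameter increase stays modest. Pixels whose coefficients concentrate on the 3x3 bases effectively see a smaller receptive field than pixels that weight the 5x5 bases.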

Abstract

Medical image segmentation plays a vital role in various clinical applications, enabling accurate delineation and analysis of anatomical structures or pathological regions. Traditional CNNs have achieved remarkable success in this field. However, they often rely on fixed kernel sizes, which can limit their performance and adaptability in medical images where features exhibit diverse scales and configurations due to variability in equipment, target sizes, and expert interpretations. In this paper, we propose an adaptive layer placed ahead of leading deep-learning models such as UCTransNet, which dynamically adjusts the kernel size based on the local context of the input image. By adaptively capturing and fusing features at multiple scales, our approach enhances the network's ability to handle diverse anatomical structures and subtle image details, even for recent high-performing architectures that internally implement intra-scale modules, such as UCTransNet. Extensive experiments are conducted on benchmark medical image datasets to evaluate the effectiveness of our proposal. It consistently outperforms traditional CNNs with fixed kernel sizes and a similar number of parameters, achieving superior segmentation Accuracy, Dice, and IoU on popular datasets such as SegPC2021 and ISIC2018. The model and data are published in an open-source repository, ensuring transparency and reproducibility of our promising results.
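For reference, the three reported metrics can be computed from binary masks as in the minimal sketch below. It uses the standard definitions of Accuracy, Dice, and IoU. The paper's exact evaluation protocol (thresholding, per-image versus dataset-level averaging) is not stated in this summary, so the function name `segmentation_metrics` and the `eps` smoothing term are assumptions made for illustration.

```python
import numpy as np

def segmentation_metrics(pred, target, eps=1e-7):
    """Accuracy, Dice, and IoU for a pair of binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = np.sum(pred & target)      # foreground predicted as foreground
    fp = np.sum(pred & ~target)     # background predicted as foreground
    fn = np.sum(~pred & target)     # foreground that was missed
    tn = np.sum(~pred & ~target)    # background predicted as background
    return {
        "accuracy": (tp + tn) / pred.size,
        "dice": 2 * tp / (2 * tp + fp + fn + eps),
        "iou": tp / (tp + fp + fn + eps),
    }

pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
metrics = segmentation_metrics(pred, target)
print(metrics)
```

In this toy example there are 2 true positives, 1 false positive, 1 false negative, and 2 true negatives, so Accuracy and Dice are both about 0.667 and IoU is 0.5. Dice is never lower than IoU on the same masks, which is worth keeping in mind when comparing the two columns of a results table.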

Paper Structure (14 sections, 5 figures, 3 tables)

This paper contains 14 sections, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Illustration of diverse scales observed in medical images sourced from the SegPC-2021 dataset. The green contours highlight the regions containing cancerous lesions.
  • Figure 2: Adaptive Convolution Layer added to the leading UCTransNet architecture. The coefficient generator network generates the weights for Fourier-Bessel bases of different sizes for each pixel and channel. This results in a single kernel to be convolved at that pixel.
  • Figure 3: Visual comparisons of different methods for cytoplasm segmentation (depicted as the white region) on the SegPC 2021 dataset. The blue region denotes the nucleus area of a cell. The first column displays the input image, and the second column presents the ground truth. The subsequent columns show the models along with their adaptive versions. Models incorporating the adaptive layer recognize the shape of the cytoplasm more accurately, and this improvement is particularly pronounced in larger segments.
  • Figure 4: Segmentation outputs of various deep models on the ISIC 2018 dataset. The white region represents the ground truth that remains undetected, the gray region represents the detected ground truth, and red denotes the . The column order is the same as in Figure 3. Once again, our model is more effective in identifying target regions, particularly in larger ones, where traditional models with fixed kernels have difficulty detecting intra-size features.
  • Figure 5: Training and validation loss on the SegPC 2021 dataset (left) and the ISIC 2018 dataset (right). The curves indicate that the models are neither overfitting nor underfitting.