Deep Semantic Segmentation of Natural and Medical Images: A Review
Saeid Asgari Taghanaki, Kumar Abhishek, Joseph Paul Cohen, Julien Cohen-Adad, Ghassan Hamarneh
TL;DR
Semantic image segmentation faces data scarcity and class imbalance challenges, especially in medical imaging. The paper surveys six families of deep learning approaches—architectural improvements, data synthesis-based methods, loss-function design, sequenced models, weakly supervised strategies, and multi-task frameworks—and discusses their applicability to natural and medical images, including representative methods such as FCNs, encoder–decoder networks, attention modules, and adversarial training, along with key loss variants like Dice, Focal, and Lovász-Softmax. It also covers GAN-based data augmentation, semi-/weak supervision, and multi-task learning, and outlines future directions such as neural architecture search, multi-modal data fusion, and physics-based data generation to advance robust medical segmentation. The review highlights the need for standardized benchmarks and scalable methods that generalize across diverse modalities and clinical settings, guiding researchers toward practical, high-impact advancements in semantic segmentation.
Abstract
The semantic image segmentation task consists of classifying each pixel of an image into an instance, where each instance corresponds to a class. This task is a part of the concept of scene understanding or better explaining the global context of an image. In the medical image analysis domain, image segmentation can be used for image-guided interventions, radiotherapy, or improved radiological diagnostics. In this review, we categorize the leading deep learning-based medical and non-medical image segmentation solutions into six main groups of deep architectural, data synthesis-based, loss function-based, sequenced models, weakly supervised, and multi-task methods and provide a comprehensive review of the contributions in each of these groups. Further, for each group, we analyze each variant of these groups and discuss the limitations of the current approaches and present potential future research directions for semantic image segmentation.
