An Enhanced Encoder-Decoder Network Architecture for Reducing Information Loss in Image Semantic Segmentation
Zijun Gao, Qi Wang, Taiyuan Mei, Xiaohan Cheng, Yun Zi, Haowei Yang
TL;DR
The paper addresses the substantial information loss that SegNet incurs during down-sampling, which degrades semantic segmentation accuracy. It proposes an enhanced encoder–decoder network that uses multiple residual connections to preserve detail across scales, together with a balanced cross-entropy loss whose factor $\beta$ weights positive against negative samples to improve convergence under class imbalance. The key contributions are multi-residual feature fusion, the $\beta$-weighted cross-entropy loss, and a reported improvement in mean Intersection over Union (mIoU) over SegNet on the PASCAL VOC 2012 dataset. The authors also argue that the approach reduces the need for manual inspection and makes AI-driven image analysis easier to deploy at scale across sectors.
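Because mIoU is the headline metric, a minimal sketch of how it is commonly computed may help. The function name `mean_iou`, the flattened-label input format, and the optional `ignore_index` argument are illustrative assumptions, not details taken from the paper. Classes that appear in neither the prediction nor the ground truth are skipped here, which is one common convention among several.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean Intersection over Union for flattened per-pixel class labels.

    pred, target: equal-length sequences of integer class ids.
    ignore_index: optional target label to skip (hypothetical parameter,
    e.g. a "void" label); not taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    union = [0] * num_classes

    for p, t in zip(pred, target):
        if ignore_index is not None and t == ignore_index:
            continue  # pixel excluded from the metric
        if p == t:
            # A correct pixel counts once toward both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # A wrong pixel enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1

    # Average only over classes that occur in pred or target (assumption).
    ious = [intersection[c] / union[c] for c in range(num_classes) if union[c] > 0]
    return sum(ious) / len(ious) if ious else float("nan")
```

For example, with predictions `[0, 0, 1, 1]` and ground truth `[0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.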
Abstract
The traditional SegNet architecture suffers significant information loss during down-sampling, which reduces its accuracy in image semantic segmentation. To address this, we introduce an encoder-decoder network enhanced with residual connections. Our approach employs a multi-residual connection strategy that preserves fine detail across image scales more effectively, thereby reducing the information loss inherent in down-sampling. In addition, to speed up the convergence of network training and mitigate sample imbalance, we devise a modified cross-entropy loss function that incorporates a balancing factor. This factor reweights the contributions of positive and negative samples, which improves training efficiency. Experimental evaluation shows that our model substantially reduces information loss and improves segmentation accuracy. In particular, the proposed architecture achieves a substantially higher finely annotated mean Intersection over Union (mIoU) on the evaluation dataset than the conventional SegNet. In practice, the proposed network can lower operational costs by reducing the need for manual inspection, and it can support wider deployment of AI-driven image analysis across sectors.
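To make the balancing factor concrete, the following is a minimal sketch of one common class-balanced binary cross-entropy, assuming the form $L = -\beta \sum_{y=1} \log p - (1-\beta) \sum_{y=0} \log(1-p)$, averaged over pixels. The abstract does not give the exact formula, so this form, the function name `balanced_bce`, and the default choice of $\beta$ as the fraction of negative pixels are all assumptions for illustration rather than the authors' definition.

```python
import math


def balanced_bce(probs, labels, beta=None, eps=1e-7):
    """Class-balanced binary cross-entropy, averaged over pixels.

    probs:  predicted foreground probabilities in [0, 1].
    labels: ground-truth labels, 1 (positive) or 0 (negative).
    beta:   weight on the positive term; (1 - beta) weights the negative term.
            If None, beta is set to the fraction of negative pixels, so the
            rarer class receives the larger weight (an assumed convention).
    """
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and equal in length")

    if beta is None:
        beta = sum(1 for y in labels if y == 0) / len(labels)

    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            total -= beta * math.log(p)
        else:
            total -= (1.0 - beta) * math.log(1.0 - p)
    return total / len(labels)
```

With one positive pixel among four, the default gives $\beta = 0.75$, so the single positive pixel carries the same total weight as the three negative pixels combined. Setting `beta=0.5` recovers the ordinary cross-entropy up to a constant factor of one half.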
