IPixMatch: Boost Semi-supervised Semantic Segmentation with Inter-Pixel Relation
Kebin Wu, Wenbin Li, Xiaofei Xiao
TL;DR
IPixMatch addresses data scarcity in semi-supervised semantic segmentation by mining inter-pixel relations within soft pseudo-labels. It extends the standard teacher-student framework with an inter-pixel loss that can use either KL divergence or a correlation-based distance, aligning teacher and student predictions over high-confidence regions to exploit spatial dependencies. The method is designed as a drop-in enhancement without requiring architectural changes and shows consistent gains on Pascal VOC 2012 and Cityscapes, especially with limited labeled data. This work highlights the value of contextual inter-pixel relationships for improving generalization in low-data regimes and suggests broad applicability to other tasks with sparse annotations.
Abstract
The scarcity of labeled data in real-world scenarios is a critical bottleneck for the effectiveness of deep learning. Semi-supervised semantic segmentation has become a typical way to achieve a desirable tradeoff between annotation cost and segmentation performance. However, previous approaches, whether based on consistency regularization or self-training, tend to neglect the contextual knowledge embedded within inter-pixel relations. This neglect leads to suboptimal performance and limited generalization. In this paper, we propose a novel approach, IPixMatch, designed to mine the neglected but valuable Inter-Pixel information for semi-supervised learning. Specifically, IPixMatch is constructed as an extension of the standard teacher-student network, incorporating additional loss terms to capture inter-pixel relations. It is particularly effective in low-data regimes, where it makes efficient use of the limited labeled data and extracts as much utility as possible from the available unlabeled data. Furthermore, IPixMatch can be integrated seamlessly into most teacher-student frameworks without modifying the model or adding extra components. Our straightforward IPixMatch method demonstrates consistent performance improvements across various benchmark datasets under different partitioning protocols.
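To make the idea of an inter-pixel loss term more concrete, the following is a minimal sketch of one possible KL-based variant, not the authors' formulation. It assumes that the relation between two pixels is the dot product of their class-probability vectors, that relations are normalized per pixel, and that only pixels whose teacher confidence exceeds a threshold are used. The function names, the similarity measure, and the 0.95 threshold are all hypothetical choices made for illustration.

```python
import math

def confident_indices(teacher_probs, threshold=0.95):
    """Indices of pixels whose top teacher probability reaches the threshold.
    The threshold value is an illustrative assumption."""
    return [i for i, p in enumerate(teacher_probs) if max(p) >= threshold]

def relation_rows(probs, idx):
    """For each selected pixel, a distribution over the selected pixels.
    Assumed relation: dot product of class-probability vectors,
    normalized so that each row sums to 1."""
    rows = []
    for i in idx:
        sims = [sum(a * b for a, b in zip(probs[i], probs[j])) for j in idx]
        total = sum(sims)  # > 0 because the self-similarity is positive
        rows.append([s / total for s in sims])
    return rows

def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) for two discrete distributions; eps avoids log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def inter_pixel_kl_loss(teacher_probs, student_probs, threshold=0.95):
    """Mean KL divergence between teacher and student relation rows,
    restricted to pixels where the teacher is confident."""
    idx = confident_indices(teacher_probs, threshold)
    if len(idx) < 2:
        return 0.0  # no pixel pairs to relate
    t_rows = relation_rows(teacher_probs, idx)
    s_rows = relation_rows(student_probs, idx)
    return sum(kl_divergence(t, s) for t, s in zip(t_rows, s_rows)) / len(idx)

# Example: four pixels, two classes; the last pixel is below the threshold.
teacher = [[0.98, 0.02], [0.97, 0.03], [0.02, 0.98], [0.60, 0.40]]
student = [[0.90, 0.10], [0.80, 0.20], [0.30, 0.70], [0.50, 0.50]]
loss = inter_pixel_kl_loss(teacher, student)
```

In a teacher-student setup, a term like this would be added to the existing supervised and consistency losses, which is compatible with the claim that no architectural change is needed. A practical implementation would operate on batched tensors and would likely subsample pixels, since the relation matrix grows quadratically with the number of pixels.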
