IGL-DT: Iterative Global-Local Feature Learning with Dual-Teacher Semantic Segmentation Framework under Limited Annotation Scheme
Dinh Dai Quan Tran, Hoang-Thien Nguyen, Thanh-Huy Nguyen, Gia-Van To, Tien-Huy Nguyen, Quan Nguyen
TL;DR
IGL-DT addresses semi-supervised semantic segmentation with limited annotations by introducing a dual-teacher framework that fuses global context from SwinUnet with local detail from ResUnet, guided by a Discrepancy Learning mechanism that prevents over-reliance on a single teacher. The student learns through two complementary objectives, Global Context Learning and Local Regional Learning, in a two-stage process that uses Cross Pseudo Supervision and alternating unlabeled-data states. The total objective is $\mathcal{L} = \mathcal{L}_{l} + \mathcal{L}_{u}$, where the subscripts $l$ and $u$ denote the labeled-data and unlabeled-data terms. Results on Pascal VOC 2012 and Cityscapes show state-of-the-art performance across multiple label regimes, and ablations confirm the benefit of combining global and local cues with the discrepancy term. The approach highlights the value of heterogeneous backbones in semi-supervised segmentation and offers a scalable path to robust performance when annotations are scarce.
Abstract
Semi-Supervised Semantic Segmentation (SSSS) aims to improve segmentation accuracy by leveraging a small set of labeled images alongside a larger pool of unlabeled data. Recent advances primarily focus on pseudo-labeling, consistency regularization, and co-training strategies. However, existing methods struggle to balance global semantic representation with fine-grained local feature extraction. To address this challenge, we propose a novel tri-branch semi-supervised segmentation framework incorporating a dual-teacher strategy, named IGL-DT. Our approach employs SwinUnet for high-level semantic guidance through Global Context Learning and ResUnet for detailed feature refinement via Local Regional Learning. Additionally, a Discrepancy Learning mechanism mitigates over-reliance on a single teacher, promoting adaptive feature learning. Extensive experiments on benchmark datasets demonstrate that our method outperforms state-of-the-art approaches across various data regimes.
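To make the dual-teacher idea concrete, the following is a minimal sketch of how a total objective of the form $\mathcal{L} = \mathcal{L}_{l} + \mathcal{L}_{u}$ might be assembled. It is an illustration under stated assumptions, not the authors' implementation. The function names, the use of hard argmax pseudo-labels, and the equal weighting of the two teacher terms are all hypothetical. The Discrepancy Learning term and the two-stage schedule are omitted because their exact form is not given here. Predictions are plain Python lists of per-pixel class probabilities so that the example runs without any deep learning library.

```python
import math

def cross_entropy(probs, targets):
    """Mean per-pixel cross-entropy.

    probs[i] is a list of class probabilities for pixel i,
    targets[i] is the integer class index for pixel i.
    """
    eps = 1e-12  # avoids log(0)
    total = -sum(math.log(p[t] + eps) for p, t in zip(probs, targets))
    return total / len(targets)

def pseudo_labels(probs):
    """Hard pseudo-labels: the argmax class for each pixel (an assumption)."""
    return [max(range(len(p)), key=p.__getitem__) for p in probs]

def total_loss(student_labeled, labels, student_unlabeled,
               global_teacher, local_teacher):
    """Return (L, L_l, L_u) for one toy batch.

    global_teacher / local_teacher stand in for the outputs of the
    SwinUnet-style and ResUnet-style teachers on the unlabeled pixels.
    """
    # L_l: supervised term on labeled pixels.
    loss_l = cross_entropy(student_labeled, labels)
    # L_u (assumed form): the student matches each teacher's pseudo-labels.
    loss_global = cross_entropy(student_unlabeled, pseudo_labels(global_teacher))
    loss_local = cross_entropy(student_unlabeled, pseudo_labels(local_teacher))
    loss_u = loss_global + loss_local
    return loss_l + loss_u, loss_l, loss_u

# Toy usage: one labeled pixel, one unlabeled pixel, two classes.
loss, loss_l, loss_u = total_loss(
    student_labeled=[[1.0, 0.0]], labels=[0],
    student_unlabeled=[[0.5, 0.5]],
    global_teacher=[[0.9, 0.1]],  # pseudo-label 0
    local_teacher=[[0.2, 0.8]],   # pseudo-label 1
)
```

In this toy case the two teachers disagree on the unlabeled pixel, so an undecided student pays the same cost to each of them. A discrepancy-aware term would be one way to handle such disagreement, which is the situation the Discrepancy Learning mechanism is described as targeting.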
