MC-PanDA: Mask Confidence for Panoptic Domain Adaptation
Ivan Martinović, Josip Šarić, Siniša Šegvić
TL;DR
MC-PanDA tackles domain-adaptive panoptic segmentation by exploiting the uncertainty estimates of mask transformers through two mechanisms: Mask-wide Loss Scaling (MLS) and Confidence-based Point Filtering (CBPF). MLS downweights the loss on low-confidence target-domain pseudo-label masks, and CBPF selects the points that contribute to the learning signal, which mitigates noise amplification in Mean-Teacher self-training. The method sets a new state of the art on Synthia→Cityscapes (47.4 $PQ_{16}$, +6.2 pp) and shows strong gains on other synthetic-to-real benchmarks, with ablations indicating that MLS and CBPF contribute complementary improvements. Because the confidence is estimated per region (mask) as well as per pixel, the approach improves robustness to domain shift when the target domain is unlabeled.
Abstract
Domain adaptive panoptic segmentation promises to resolve the long tail of corner cases in natural scene understanding. The previous state of the art addresses this problem with cross-task consistency, careful system-level optimization, and heuristic improvement of teacher predictions. In contrast, we propose to build upon the remarkable capability of mask transformers to estimate their own prediction uncertainty. Our method avoids noise amplification by leveraging the fine-grained confidence of panoptic teacher predictions. In particular, we modulate the loss with mask-wide confidence and discourage back-propagation in pixels where the teacher is uncertain or the student is confident. Experimental evaluation on standard benchmarks reveals a substantial contribution of the proposed selection techniques. We report 47.4 PQ on Synthia to Cityscapes, which corresponds to an improvement of 6.2 percentage points over the state of the art. The source code is available at https://github.com/helen1c/MC-PanDA.
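To make the two selection ideas in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation (see the linked repository for that). It assumes per-pixel binary mask probabilities for one teacher mask and the matching student mask. The confidence measure `|2p - 1|`, the averaging used for the mask-wide score, the hard thresholds `teacher_min` and `student_max`, and all function names are hypothetical choices made for illustration only.

```python
from typing import List, Sequence


def binary_confidence(p: float) -> float:
    """Confidence of a binary mask probability: 0 at p = 0.5, 1 at p in {0, 1}.

    This particular measure is an assumption of the sketch.
    """
    return abs(2.0 * p - 1.0)


def mask_wide_confidence(teacher_probs: Sequence[float]) -> float:
    """Aggregate per-pixel teacher confidence into one score per mask (assumed: mean)."""
    return sum(binary_confidence(p) for p in teacher_probs) / len(teacher_probs)


def select_points(
    teacher_probs: Sequence[float],
    student_probs: Sequence[float],
    teacher_min: float = 0.8,   # hypothetical threshold
    student_max: float = 0.9,   # hypothetical threshold
) -> List[int]:
    """Keep pixel indices where the teacher is confident and the student is not.

    Pixels with an uncertain teacher or an already confident student are dropped,
    so they contribute no gradient.
    """
    return [
        i
        for i, (t, s) in enumerate(zip(teacher_probs, student_probs))
        if binary_confidence(t) >= teacher_min and binary_confidence(s) < student_max
    ]


def pseudo_label_loss(
    per_pixel_loss: Sequence[float],
    teacher_probs: Sequence[float],
    student_probs: Sequence[float],
) -> float:
    """Mean loss over the selected pixels, scaled by the mask-wide teacher confidence."""
    kept = select_points(teacher_probs, student_probs)
    if not kept:
        return 0.0  # nothing trustworthy to learn from in this mask
    mean_loss = sum(per_pixel_loss[i] for i in kept) / len(kept)
    return mask_wide_confidence(teacher_probs) * mean_loss


if __name__ == "__main__":
    teacher = [0.95, 0.55, 0.99, 0.05]
    student = [0.60, 0.50, 0.99, 0.30]
    losses = [1.0, 2.0, 3.0, 4.0]
    print(select_points(teacher, student))
    print(pseudo_label_loss(losses, teacher, student))
```

In this toy example, the second pixel is dropped because the teacher is uncertain there, and the third is dropped because the student is already confident. A hard threshold is only one way to realize the filtering; a stochastic sampling scheme driven by the same confidences would fit the description equally well.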
