P-NOC: adversarial training of CAM generating networks for robust weakly supervised semantic segmentation priors
Lucas David, Helio Pedrini, Zanoni Dias
TL;DR
This work addresses the limitations of CAM-based weakly supervised semantic segmentation (WSSS) by analyzing complementary WSSS techniques and introducing two new ones: P-NOC, an adversarial training framework that jointly trains a CAM-generating network and a discriminator network, and CCAM-H, which injects weakly supervised saliency hints into a contrastive saliency model. The authors further combine these priors with a refined affinity process and random-walk-based propagation to produce high-quality pseudo-segmentation masks, achieving competitive results on VOC2012 and MS COCO 2014 without strong supervision. The findings indicate that leveraging complementary cues and weak saliency information yields robust priors and effective pseudo-masks, narrowing the gap to fully supervised methods. Overall, the approach unites adversarial CAM training, saliency-aware priors, and affinity-based refinement into a practical path to robust WSSS.
Abstract
Weakly Supervised Semantic Segmentation (WSSS) techniques explore individual regularization strategies to refine Class Activation Maps (CAMs). In this work, we first analyze complementary WSSS techniques in the literature, their segmentation properties, and the conditions in which they are most effective. Based on these findings, we devise two new techniques: P-NOC and CCAM-H. In the former, we promote the joint training of two adversarial CAM-generating networks: a generator, which progressively learns to erase regions containing class-specific features, and a discriminator, which is refined to gradually shift its attention to new class-discriminant features. In the latter, we employ the high-quality pseudo-segmentation priors produced by P-NOC to guide the learning of saliency information in a weakly supervised fashion. Finally, we employ both pseudo-segmentation priors and pseudo-saliency proposals in the random walk procedure, resulting in higher-quality pseudo-semantic segmentation masks and results competitive with the state of the art.
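To make the final refinement step more concrete, the following is a minimal sketch of random-walk propagation of class scores over a pixel affinity graph, a technique commonly used in affinity-based WSSS pipelines. It is not the authors' implementation. The function names (`row_normalize`, `random_walk`), the hand-written 4-pixel affinity matrix, and the `beta` and `steps` values are all illustrative assumptions; in practice the affinities would be predicted by a trained network, and the seed scores would come from the CAM-based priors.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1 so it can act as transition probabilities."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total for v in row] if total > 0 else list(row))
    return out


def random_walk(scores, affinity, beta=2, steps=3):
    """Propagate per-pixel class scores along an affinity graph.

    scores:   N x C list of seed class scores (hypothetical CAM values).
    affinity: N x N list of non-negative pairwise affinities (assumed given).
    beta:     element-wise exponent that sharpens affinities (assumed value).
    steps:    number of propagation iterations (assumed value).
    """
    # Element-wise power suppresses weak affinities relative to strong ones.
    powered = [[a ** beta for a in row] for row in affinity]
    transition = row_normalize(powered)
    n, c = len(scores), len(scores[0])
    current = [list(row) for row in scores]
    for _ in range(steps):
        # Each pixel's new score is an affinity-weighted average of all scores.
        current = [
            [sum(transition[i][j] * current[j][k] for j in range(n)) for k in range(c)]
            for i in range(n)
        ]
    return current


# Toy example: pixels 0-1 and pixels 2-3 form two strongly connected groups.
affinity = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.1, 0.0],
    [0.1, 0.1, 1.0, 0.9],
    [0.0, 0.0, 0.9, 1.0],
]
# Only pixel 0 (class 0) and pixel 3 (class 1) carry seed evidence.
seeds = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]]

refined = random_walk(seeds, affinity)
labels = [max(range(2), key=lambda k: row[k]) for row in refined]
```

In this toy setting, the unlabeled pixels 1 and 2 inherit the class of the seed pixel they are most strongly connected to, which is the behavior that lets sparse but reliable priors grow into denser pseudo-masks.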
