Robustness to Adversarial Perturbations in Learning from Incomplete Data
Amir Najafi, Shin-ichi Maeda, Masanori Koyama, Takeru Miyato
TL;DR
This work addresses the robustness of learning under adversarial distributional shifts when only partial labels are available. It unifies Semi-Supervised Learning and Distributionally Robust Learning into the SSDRL framework, introduces a dual formulation and soft-label self-learning to leverage unlabeled data, and provides generalization guarantees via novel adversarial complexity measures (SSM Rademacher complexity) and the Minimum Supervision Ratio. The authors prove convergence of an SGD-based optimizer for the semi-supervised objective and demonstrate that SSDRL is competitive with state-of-the-art methods such as VAT and Pseudo-Labeling on standard benchmarks. Overall, the paper advances theory and practice for robust learning from incomplete data, with practical algorithms and empirical validation on multiple image datasets.
Abstract
What is the role of unlabeled data in an inference problem when the presumed underlying distribution is adversarially perturbed? To provide a concrete answer to this question, this paper unifies two major learning frameworks: Semi-Supervised Learning (SSL) and Distributionally Robust Learning (DRL). We develop a generalization theory for our framework based on a number of novel complexity measures, such as an adversarial extension of Rademacher complexity and its semi-supervised analogue. Moreover, our analysis quantifies the role of unlabeled data in generalization under a more general condition than existing theoretical work in SSL. Based on our framework, we also present a hybrid of DRL and EM algorithms that has a guaranteed convergence rate. When implemented with deep neural networks, our method achieves performance comparable to the state of the art on a number of real-world benchmark datasets.
