Robust Semi-supervised Learning via $f$-Divergence and $α$-Rényi Divergence
Gholamali Aminian, Amirhossien Bagheri, Mahyar JafariNodeh, Radmehr Karimian, Mohammad-Hossein Yassaee
TL;DR
This work develops a unified, divergence-guided framework for robust semi-supervised learning (SSL) by introducing divergence-based empirical risks (DER) grounded in $f$-divergences and $α$-Rényi divergences. It defines DER for both the supervised and SSL settings; in the SSL setting, DER is applied through pseudo-labeling and entropy minimization, with regularizers that encourage diverse, confident predictions while mitigating confirmation bias. The proposed DP-SSL and DEM-SSL algorithms implement uncertainty-aware pseudo-labeling and entropy-based regularization, respectively, and demonstrate robustness to noisy pseudo-labels on datasets such as CIFAR-100 and Letters. The results indicate that certain divergences (e.g., Jensen-Shannon) can offer superior robustness under label noise. The framework also provides theoretical upper bounds linking SSL costs to the fully supervised risk, and it has the potential to be integrated into popular SSL paradigms.
Abstract
This paper investigates a range of empirical risk functions and regularization methods suitable for self-training methods in semi-supervised learning. These approaches draw inspiration from divergence measures such as $f$-divergences and $α$-Rényi divergences. Building on the theoretical foundations of these divergences, we also provide insights that help explain the behavior of our empirical risk functions and regularization techniques. Self-training methods for semi-supervised learning, such as pseudo-labeling and entropy minimization, suffer from an inherent mismatch between the true label and the pseudo-label (noisy pseudo-labels), and some of our empirical risk functions are robust to noisy pseudo-labels. Under some conditions, our empirical risk functions outperform traditional self-training methods.
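For readers unfamiliar with the two divergence families named above, the following is a minimal sketch of their standard textbook definitions for discrete distributions. It is not the paper's DER or its regularizers. The function names (`f_divergence`, `kl_generator`, `js_divergence`, `renyi_divergence`) and the example vectors are hypothetical and chosen only for illustration, and the sketch assumes every entry of the second distribution is strictly positive.

```python
import math

def f_divergence(p, q, f):
    """D_f(P || Q) = sum_i q_i * f(p_i / q_i), assuming every q_i > 0."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

def kl_generator(t):
    # Generator f(t) = t * log(t), with the convention 0 * log(0) = 0.
    # Plugging it into f_divergence gives the KL divergence.
    return t * math.log(t) if t > 0 else 0.0

def js_divergence(p, q):
    # Jensen-Shannon divergence: average KL divergence of P and Q
    # to their mixture M = (P + Q) / 2. It is bounded above by log(2).
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * f_divergence(p, m, kl_generator) + 0.5 * f_divergence(q, m, kl_generator)

def renyi_divergence(p, q, alpha):
    """D_alpha(P || Q) = log(sum_i p_i^alpha * q_i^(1 - alpha)) / (alpha - 1)."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    s = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q))
    return math.log(s) / (alpha - 1)

# Hypothetical example: a one-hot (pseudo-)label and a model prediction.
label = [1.0, 0.0, 0.0]
prediction = [0.7, 0.2, 0.1]

kl_value = f_divergence(label, prediction, kl_generator)
js_value = js_divergence(label, prediction)
renyi_value = renyi_divergence([0.5, 0.3, 0.2], prediction, alpha=2.0)
```

With a one-hot label, the KL divergence to the prediction reduces to the usual cross-entropy loss, $-\log$ of the probability assigned to the labeled class. The Jensen-Shannon divergence, in contrast, never exceeds $\log 2$, so a single wrong label can contribute only a bounded amount to the risk. This boundedness is one common intuition for why such a divergence might tolerate noisy pseudo-labels better than cross-entropy.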
