Semi-Supervised Learning guided by the Generalized Bayes Rule under Soft Revision

Stefan Dietrich; Julian Rodemann; Christoph Jansen

TL;DR

This work addresses robust pseudo-label selection in semi-supervised learning under epistemic uncertainty by adopting credal sets and a generalized Bayes framework. It introduces the Gamma-Maximin criterion with soft revision via $\alpha$-cuts to systematically hedge against prior misspecification, and implements a logistic-model pipeline using Laplace approximation, BFGS, and COBYLA for computation. The paper provides a formal optimization formulation and demonstrates through simulations and real data that the proposed method performs strongly when labeled data are scarce, often outperforming conventional baselines and PPP variants. The approach offers a principled, conservative mechanism to leverage unlabeled data in practical SSL settings, potentially improving robustness to modeling assumptions.

Abstract

We provide a theoretical and computational investigation of the Gamma-Maximin method with soft revision, which was recently proposed as a robust criterion for pseudo-label selection (PLS) in semi-supervised learning. In contrast to traditional methods for PLS, we use credal sets of priors ("generalized Bayes") to represent epistemic modeling uncertainty. These credal sets are then updated by the Gamma-Maximin method with soft revision. Finally, we select the pseudo-labeled data that are most likely under the least favorable distribution from the updated credal set. We formalize the task of finding optimal pseudo-labeled data w.r.t. the Gamma-Maximin method with soft revision as an optimization problem. A concrete implementation for the class of logistic models then allows us to compare the predictive power of the method with competing approaches. We observe that the Gamma-Maximin method with soft revision can achieve very promising results, especially when the proportion of labeled data is low.
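The selection rule described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation (which, per the TL;DR, relies on Laplace approximation, BFGS, and COBYLA). The sketch assumes a finite credal set of priors, a one-parameter logistic model discretized on a grid, an alpha-cut that retains every prior whose marginal likelihood is at least alpha times the largest one, and a score equal to the marginal likelihood of the labeled data augmented by one candidate. All function names and data values are hypothetical.

```python
import numpy as np

def log_lik(theta_grid, x, y):
    # Log-likelihood of the toy model p(y=1 | x) = sigmoid(theta * x),
    # evaluated for every theta on the grid; returns shape (len(theta_grid),).
    z = np.outer(theta_grid, x)
    return -(np.logaddexp(0.0, -z) * y + np.logaddexp(0.0, z) * (1 - y)).sum(axis=1)

def marginal_lik(prior, theta_grid, x, y):
    # Marginal likelihood under a discrete prior over the theta grid.
    return float(np.sum(prior * np.exp(log_lik(theta_grid, x, y))))

def alpha_cut(priors, theta_grid, x, y, alpha):
    # Soft revision (assumed form): keep priors whose marginal likelihood
    # is at least alpha times the maximum over the credal set.
    m = np.array([marginal_lik(p, theta_grid, x, y) for p in priors])
    return [p for p, keep in zip(priors, m >= alpha * m.max()) if keep]

def gamma_maximin_select(priors, theta_grid, x_lab, y_lab, x_cand, y_cand, alpha):
    # Score each pseudo-labeled candidate by its worst case over the
    # retained priors, then pick the candidate with the best worst case.
    retained = alpha_cut(priors, theta_grid, x_lab, y_lab, alpha)
    scores = []
    for xc, yc in zip(x_cand, y_cand):
        x_aug, y_aug = np.append(x_lab, xc), np.append(y_lab, yc)
        scores.append(min(marginal_lik(p, theta_grid, x_aug, y_aug) for p in retained))
    return int(np.argmax(scores)), scores, len(retained)

# Hypothetical toy data: three discretized Gaussian priors form the credal set.
theta_grid = np.linspace(-4.0, 4.0, 81)
priors = []
for mean in (-2.0, 0.0, 2.0):
    w = np.exp(-0.5 * (theta_grid - mean) ** 2)
    priors.append(w / w.sum())

x_lab = np.array([-2.0, -1.0, 1.0, 2.0])
y_lab = np.array([0, 0, 1, 1])
x_cand = np.array([0.1, 3.0])   # unlabeled points
y_cand = np.array([1, 1])       # their predicted pseudo-labels

best, scores, n_retained = gamma_maximin_select(
    priors, theta_grid, x_lab, y_lab, x_cand, y_cand, alpha=0.5
)
```

In this sketch, alpha = 1 keeps only the prior with the largest marginal likelihood, while alpha = 0 keeps the whole credal set, so alpha controls how conservative the selection is.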

Paper Structure (8 sections, 8 equations, 1 figure)

Introduction
Semi-Supervised Learning (SSL)
Bayesian Analysis
Uncertainty in BPLS
Related work
Generalized Bayesian PLS under Soft Revision
Experiments
Discussion

Figures (1)

Figure 1: Experiments on real-world data (left) and simulated data (right).

Theorems & Definitions (1)

Definition
