Exploring Probabilistic Models for Semi-supervised Learning
Jianfeng Wang
TL;DR
The thesis tackles the challenge of semi-supervised learning (SSL) under uncertainty by developing probabilistic SSL methods that provide reliable uncertainty estimates while maintaining competitive accuracy. It introduces a fully Bayesian Generative Bayesian Deep Learning (GBDL) framework for semi-supervised volumetric medical image segmentation, and two Neural Process–based SSL mechanisms: NP-Match for large-scale image classification and NP-SemiSeg for semi-supervised semantic segmentation. A key innovation is the uncertainty-guided skew-geometric Jensen-Shannon divergence $JS^{G_{\boldsymbol{\alpha_u}}}$, which improves robustness to noisy pseudo-labels and enhances uncertainty calibration. Together, these methods yield faster and more reliable uncertainty quantification, enabling safer deployment in safety-critical domains such as healthcare and autonomous systems, and demonstrate strong empirical performance on standard SSL benchmarks and medical imaging datasets. The work lays a foundation for broader adoption of probabilistic approaches in SSL and points to future avenues in open-set SSL, translation-equivariant neural processes, and scalable Bayesian SSL architectures.
Abstract
This thesis studies advanced probabilistic models, including both their theoretical foundations and practical applications, for different semi-supervised learning (SSL) tasks. The proposed probabilistic methods improve the safety of AI systems in real applications by providing reliable uncertainty estimates quickly, while achieving competitive performance compared to their deterministic counterparts. The experimental results indicate that the methods proposed in the thesis are valuable in safety-critical areas, such as autonomous driving and medical image analysis, and pave the way for the future discovery of highly effective and efficient probabilistic approaches to SSL.
