Adaptive Conformal Guidance for Learning under Uncertainty
Rui Liu, Peng Gao, Yu Shen, Ming Lin, Pratap Tokekar
TL;DR
AdaConG is a simple, general framework that embeds split conformal prediction (CP) into the training loop to adaptively weight guidance signals according to their uncertainty. By converting a CP-derived uncertainty $u(x)$ into a guidance weight $w(x)$, the method combines guidance with task-specific losses across supervised, semi-supervised, and imitation-guided RL settings. Empirical results on knowledge distillation (KD), semi-supervised learning (SSL), gridworld navigation, and autonomous driving show substantial gains under imperfect guidance, including up to $+10.89\%$ in KD and over $6\times$ higher rewards in gridworld. The approach is model- and domain-agnostic, offering a practical tool for robust learning under distribution shift and uncertain supervision.
Abstract
Learning with guidance has proven effective across a wide range of machine learning systems. Guidance may, for example, come from annotated datasets in supervised learning, pseudo-labels in semi-supervised learning, or expert demonstration policies in reinforcement learning. However, guidance signals can be noisy because of domain shift or limited data, and they may not generalize well. Blindly trusting such signals when they are noisy, incomplete, or misaligned with the target domain can degrade performance. To address these challenges, we propose Adaptive Conformal Guidance (AdaConG), a simple yet effective approach that dynamically modulates the influence of guidance signals based on their associated uncertainty, quantified via split conformal prediction (CP). By adapting to guidance uncertainty, AdaConG enables models to rely less on potentially misleading signals and thereby learn more effectively. We validate AdaConG across diverse tasks, including knowledge distillation, semi-supervised image classification, gridworld navigation, and autonomous driving. Experimental results demonstrate that AdaConG improves performance and robustness under imperfect guidance. For example, in gridworld navigation it accelerates convergence and achieves over $6\times$ higher rewards than the best-performing baseline. These results highlight AdaConG as a broadly applicable solution for learning under uncertainty.
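
To make the idea of uncertainty-weighted guidance concrete, the following is a minimal sketch for a classification setting. It is an illustration under stated assumptions rather than the paper's implementation: the nonconformity score (one minus the probability of the true class), the use of normalized prediction-set size as the uncertainty $u(x)$, the linear mapping $w(x) = 1 - u(x)$, and all function names are hypothetical choices made here for clarity.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split CP calibration: return the score threshold qhat.

    Assumed nonconformity score: 1 - probability assigned to the true class.
    """
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected rank; clipped to n as a simplification
    # (strictly, the threshold is unbounded when the rank exceeds n).
    k = min(int(np.ceil((n + 1) * (1 - alpha))), n)
    return np.sort(scores)[k - 1]

def guidance_weights(guide_probs, qhat):
    """Map prediction-set size to a weight in [0, 1] (hypothetical mapping)."""
    num_classes = guide_probs.shape[1]
    # Prediction set: all classes whose score is within the threshold.
    sizes = np.sum(1.0 - guide_probs <= qhat, axis=1)
    # Assumption: an empty set is treated as maximally uncertain.
    sizes = np.where(sizes == 0, num_classes, sizes)
    u = (sizes - 1) / (num_classes - 1)   # u(x): 0 = confident, 1 = uninformative
    return 1.0 - u                        # w(x): down-weights uncertain guidance

def combined_loss(task_loss, guide_loss, weights):
    """Per-sample task loss plus uncertainty-weighted guidance loss, averaged."""
    return float(np.mean(task_loss + weights * guide_loss))
```

In this sketch, `guide_probs` would be the class probabilities produced by the guidance source (for example, a teacher model), and `qhat` would be computed once on a held-out calibration split. A guidance signal whose prediction set contains a single class keeps full weight, while one whose set spans every class contributes nothing to the loss.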
