Understanding Uncertainty-based Active Learning Under Model Mismatch
Amir Hossein Rahmati, Mingzhou Fan, Ruida Zhou, Nathan M. Urban, Byung-Jun Yoon, Xiaoning Qian
TL;DR
The paper examines how the efficacy of Uncertainty-based Active Learning (UAL) in regression depends on model capacity. Using Bayesian learning, BAL, and a bias-variance framework, supported by Bernstein-von Mises theory, it shows that variance-based acquisition can reflect the true mean squared error (MSE) only when the model class can cover the ground-truth function; otherwise, UAL can underperform random sampling. To mitigate the drawbacks of model mismatch, it proposes remedies that either estimate the true objective, the MSE, directly or bound it, and demonstrates them in synthetic experiments with Bayesian Polynomial Regression and Gaussian Process Regression as well as on real datasets. The findings offer practical guidance for designing robust UAL strategies when the predictive model class cannot capture the underlying target, and they highlight the value of objective-aligned acquisition functions. The work lays groundwork for error-aware acquisition design and suggests Kriging-like estimators and MSE upper-bound approaches as promising directions.
Abstract
Instead of acquiring training data points at random, Uncertainty-based Active Learning (UAL) queries the label(s) of pivotal samples, selected from an unlabeled pool according to prediction uncertainty, with the aim of minimizing the labeling cost of model training. The efficacy of UAL depends critically on the model capacity as well as on the adopted uncertainty-based acquisition function. In this study, we focus on how the capacity of the machine learning model affects UAL efficacy. Through theoretical analysis, comprehensive simulations, and empirical studies, we demonstrate that UAL can perform worse than random sampling when the machine learning model class has low capacity and cannot cover the underlying ground truth. In such situations, adopting acquisition functions that directly target estimating the prediction performance may improve the performance of UAL.
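The gap between predictive variance and actual error under model mismatch can be illustrated with a small numerical sketch. The example below is not the authors' experimental setup: the cubic ground truth, the noise level, the prior precision `alpha`, the noise precision `beta`, and the helper names `features`, `fit_blr`, and `evaluate` are all assumptions chosen for illustration. It fits a conjugate Bayesian polynomial regression of degree 1 (too low to cover the truth) and of degree 3 (able to cover it) to the same data, then compares the mean epistemic variance over a pool with the mean squared error to the noise-free truth.

```python
import numpy as np

def features(x, degree):
    """Polynomial design matrix with columns [1, x, ..., x^degree]."""
    return np.vander(x, degree + 1, increasing=True)

def fit_blr(Phi, y, alpha, beta):
    """Weight posterior for prior N(0, I/alpha) and known noise precision beta."""
    S = np.linalg.inv(alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi)
    m = beta * S @ Phi.T @ y
    return m, S

def evaluate(degree, x_train, y_train, x_pool, f_pool, alpha=1e-2, beta=400.0):
    """Return (mean epistemic variance, mean squared error to the truth) on the pool."""
    m, S = fit_blr(features(x_train, degree), y_train, alpha, beta)
    Phi_pool = features(x_pool, degree)
    # phi(x)^T S phi(x) for every pool point; note that S never sees the labels.
    epistemic_var = np.einsum("ij,jk,ik->i", Phi_pool, S, Phi_pool)
    sq_err = (Phi_pool @ m - f_pool) ** 2
    return epistemic_var.mean(), sq_err.mean()

def truth(x):
    # Hypothetical ground truth: a cubic, outside the degree-1 model class.
    return x**3 - 0.5 * x

rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=20)
y_train = truth(x_train) + rng.normal(0.0, 0.05, size=20)  # noise std 0.05 <-> beta = 400
x_pool = np.linspace(-1.0, 1.0, 201)
f_pool = truth(x_pool)

var_low, mse_low = evaluate(1, x_train, y_train, x_pool, f_pool)    # mismatched model
var_high, mse_high = evaluate(3, x_train, y_train, x_pool, f_pool)  # model covers truth

print(f"degree 1: mean variance = {var_low:.5f}, MSE = {mse_low:.5f}")
print(f"degree 3: mean variance = {var_high:.5f}, MSE = {mse_high:.5f}")
```

In this conjugate model the posterior covariance `S` depends only on the inputs, not on the labels, so the variance cannot register the bias term that dominates the error of the degree-1 fit. The sketch only contrasts magnitudes on a fixed training set; it does not run an active learning loop or compare against random sampling.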
