Deep Active Learning with Noise Stability
Xingjian Li, Pengkun Yang, Yangcheng Gu, Xueying Zhan, Tianyang Wang, Min Xu, Chengzhong Xu
TL;DR
The paper tackles uncertainty estimation in deep active learning under label scarcity by introducing NoiseStability, a simple, model-internal criterion that measures output deviations when model parameters are perturbed by small noise. It shows that the expected deviation magnitude aligns with the Jacobian norm and that the collected perturbation directions act as randomly projected gradients, preserving key geometric relationships and enabling diverse batch selection via a k-center approach. The method requires no auxiliary models or data-dependent designs and demonstrates competitive or superior performance across image classification, regression, semantic segmentation, and NLP tasks, with supporting theory on gradient equivalence, diversity, and variance reduction. The approach is practical, scalable, and broadly applicable, offering a robust alternative to existing gradient-based and Bayesian Active Learning methods.
Abstract
Uncertainty estimation for unlabeled data is crucial to active learning. With a deep neural network employed as the backbone model, the data selection process is highly challenging due to the potential over-confidence of the model inference. Existing methods resort to special learning fashions (e.g. adversarial) or auxiliary models to address this challenge. This tends to result in complex and inefficient pipelines, which would render the methods impractical. In this work, we propose a novel algorithm that leverages noise stability to estimate data uncertainty. The key idea is to measure the output derivation from the original observation when the model parameters are randomly perturbed by noise. We provide theoretical analyses by leveraging the small Gaussian noise theory and demonstrate that our method favors a subset with large and diverse gradients. Our method is generally applicable in various tasks, including computer vision, natural language processing, and structural data analysis. It achieves competitive performance compared against state-of-the-art active learning baselines.
