Direct Acquisition Optimization for Low-Budget Active Learning
Zhuokai Zhao, Yibo Jiang, Yuxin Chen
TL;DR
This work tackles the problem of active learning under ultra-low labeling budgets by introducing Direct Acquisition Optimization (DAO), which prioritizes samples based on expected true loss reduction while avoiding costly retraining and large validation sets. DAO combines influence-function-based model parameter updates, surrogate-based label approximation, and bias-corrected loss estimation (LURE) to efficiently estimate the impact of acquiring new data. Across seven benchmarks, DAO consistently outperforms state-of-the-art AL methods, with particularly strong gains in extreme low-budget scenarios such as SVHN with $B=10$. The approach offers a practical pathway to data-efficient learning in domains where labeling is expensive, and it opens avenues for integration with unsupervised and semi-supervised techniques to further reduce labeling needs.
Abstract
Active Learning (AL) has gained prominence in integrating data-intensive machine learning (ML) models into domains with limited labeled data. However, its effectiveness diminishes significantly when the labeling budget is low. In this paper, we first empirically observe the performance degradation of existing AL algorithms in low-budget settings, and then introduce Direct Acquisition Optimization (DAO), a novel AL algorithm that optimizes sample selection based on expected true loss reduction. Specifically, DAO utilizes influence functions to update model parameters and incorporates an additional acquisition strategy to mitigate bias in loss estimation. This approach facilitates a more accurate estimation of the overall error reduction, without extensive computation or reliance on labeled data. Experiments demonstrate DAO's effectiveness in low-budget settings, where it outperforms state-of-the-art approaches across seven benchmarks.
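To make the influence-function idea concrete, the following is a minimal sketch of how a first-order parameter update can stand in for retraining when one candidate sample is added. It is not the authors' implementation. It assumes an L2-regularized logistic regression model, an exact Hessian solve, and a candidate label `y_new` that would in practice come from a surrogate, since the true label is unknown before acquisition. All function names (`fit`, `hessian`, `influence_update`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def grad_point(theta, x, y):
    # Gradient of the log loss for a single example, with y in {0, 1}.
    return (sigmoid(x @ theta) - y) * x

def hessian(theta, X, lam):
    # Hessian of the mean log loss plus (lam / 2) * ||theta||^2.
    p = sigmoid(X @ theta)
    w = p * (1.0 - p)
    return (X.T * w) @ X / X.shape[0] + lam * np.eye(X.shape[1])

def fit(X, y, lam, steps=50):
    # Plain Newton iterations on the regularized objective (no line search).
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        g = X.T @ (sigmoid(X @ theta) - y) / X.shape[0] + lam * theta
        theta = theta - np.linalg.solve(hessian(theta, X, lam), g)
    return theta

def influence_update(theta, X, x_new, y_new, lam):
    # First-order estimate of the parameters after adding (x_new, y_new)
    # to the objective with weight 1/n, without retraining:
    #   theta_new ~= theta - (1/n) * H^{-1} * grad_loss(x_new, y_new; theta)
    # y_new is an assumed (e.g. surrogate-predicted) label.
    n = X.shape[0]
    H = hessian(theta, X, lam)
    return theta - np.linalg.solve(H, grad_point(theta, x_new, y_new)) / n
```

In a DAO-style loop, one would presumably call `influence_update` once per candidate and score the candidate by the estimated loss at the returned parameters. That scoring step and the bias correction for the loss estimate are omitted here.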
