Deep Active Learning: A Reality Check
Edrina Gashi, Jiankang Deng, Ismail Elezi
TL;DR
This paper conducts a thorough, fair empirical evaluation of state-of-the-art deep active learning methods under uniform settings. It shows that entropy-based sampling typically matches or outperforms recent deep AL methods, and that some methods underperform random sampling in general settings. It reveals how factors such as the starting budget, per-cycle budget, and pretraining materially affect results, and extends the analysis to semi-supervised learning integration and object detection. The findings yield concrete recommendations and highlight the need for rigorous evaluation practices to guide real-world annotation budgeting.
Abstract
We conduct a comprehensive evaluation of state-of-the-art deep active learning methods. Surprisingly, under general settings, no single-model method decisively outperforms entropy-based active learning, and some even fall short of random sampling. We delve into overlooked aspects such as the starting budget, the budget step, and the impact of pretraining, revealing their significance in achieving superior results. Additionally, we extend our evaluation to other tasks, exploring the effectiveness of active learning in combination with semi-supervised learning and in object detection. Our experiments provide valuable insights and concrete recommendations for future active learning studies. By uncovering the limitations of current methods and understanding the impact of different experimental settings, we aim to inspire more efficient training of deep learning models in real-world scenarios with limited annotation budgets. This work contributes to advancing the efficacy of active learning in deep learning and empowers researchers to make informed decisions when applying active learning to their tasks.
