Image Classification with Deep Reinforcement Active Learning
Mingyuan Jiu, Xuguang Song, Hichem Sahbi, Shupan Li, Yan Chen, Wei Guo, Lihua Guo, Mingliang Xu
TL;DR
This work tackles label-efficient image classification by introducing DRAL, a deep reinforcement learning based active learning framework that casts sample selection as a Markov decision process. It combines a margin-based pre-ranking with a learned actor–critic policy trained via Deep Deterministic Policy Gradient to adaptively select samples for labeling according to oracle feedback. DRAL demonstrates superior performance over traditional handcrafted active learning strategies on CIFAR-10, SVHN, and Fashion-MNIST, using ResNet-18 as the downstream learner and a budget-aware labeling process. The approach offers a practical and scalable path to maintain high accuracy with limited labeled data in dynamic learning environments.
Abstract
Deep learning is currently reaching outstanding performances on different tasks, including image classification, especially when using large neural networks. The success of these models is tributary to the availability of large collections of labeled training data. In many real-world scenarios, labeled data are scarce, and their hand-labeling is time, effort and cost demanding. Active learning is an alternative paradigm that mitigates the effort in hand-labeling data, where only a small fraction is iteratively selected from a large pool of unlabeled data, and annotated by an expert (a.k.a oracle), and eventually used to update the learning models. However, existing active learning solutions are dependent on handcrafted strategies that may fail in highly variable learning environments (datasets, scenarios, etc). In this work, we devise an adaptive active learning method based on Markov Decision Process (MDP). Our framework leverages deep reinforcement learning and active learning together with a Deep Deterministic Policy Gradient (DDPG) in order to dynamically adapt sample selection strategies to the oracle's feedback and the learning environment. Extensive experiments conducted on three different image classification benchmarks show superior performances against several existing active learning strategies.
