Image Classification with Deep Reinforcement Active Learning

Mingyuan Jiu, Xuguang Song, Hichem Sahbi, Shupan Li, Yan Chen, Wei Guo, Lihua Guo, Mingliang Xu

TL;DR

This work tackles label-efficient image classification by introducing DRAL, an active learning framework based on deep reinforcement learning that casts sample selection as a Markov decision process. It combines a margin-based pre-ranking with a learned actor–critic policy, trained via Deep Deterministic Policy Gradient, to adaptively select samples for labeling according to oracle feedback. DRAL outperforms traditional handcrafted active learning strategies on CIFAR-10, SVHN, and Fashion-MNIST, using ResNet-18 as the downstream learner and a budget-aware labeling process. The approach aims to maintain high accuracy with limited labeled data in dynamic learning environments.
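To make the margin-based pre-ranking step concrete, here is a minimal Python sketch. It assumes the common definition of margin (top-1 minus top-2 predicted class probability, with a small margin meaning an ambiguous sample). The exact formulation used in DRAL is not given in this summary, and the names `margin_scores`, `pre_rank`, and `pool_probs` are hypothetical.

```python
from typing import List, Sequence


def margin_scores(probs: Sequence[Sequence[float]]) -> List[float]:
    """Return top-1 minus top-2 class probability for each sample."""
    scores = []
    for p in probs:
        top1, top2 = sorted(p, reverse=True)[:2]
        scores.append(top1 - top2)
    return scores


def pre_rank(probs: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return indices of the k smallest-margin samples, most ambiguous first."""
    scores = margin_scores(probs)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:k]


# Hypothetical softmax outputs for four unlabeled images over three classes
# (illustrative values only, not results from the paper).
pool_probs = [
    [0.90, 0.05, 0.05],  # confident prediction, large margin
    [0.40, 0.35, 0.25],  # ambiguous
    [0.50, 0.49, 0.01],  # most ambiguous, smallest margin
    [0.70, 0.20, 0.10],
]
candidates = pre_rank(pool_probs, k=2)
```

In a pipeline like the one summarized above, such a shortlist would presumably be passed to the learned policy for the final selection rather than being labeled directly.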

Abstract

Deep learning currently achieves outstanding performance on a range of tasks, including image classification, especially when using large neural networks. The success of these models depends on the availability of large collections of labeled training data. In many real-world scenarios, labeled data are scarce, and hand-labeling them is demanding in time, effort, and cost. Active learning is an alternative paradigm that reduces the hand-labeling effort: only a small fraction of a large pool of unlabeled data is iteratively selected, annotated by an expert (a.k.a. the oracle), and then used to update the learning models. However, existing active learning solutions depend on handcrafted strategies that may fail in highly variable learning environments (datasets, scenarios, etc.). In this work, we devise an adaptive active learning method based on a Markov Decision Process (MDP). Our framework combines deep reinforcement learning and active learning, using Deep Deterministic Policy Gradient (DDPG) to dynamically adapt sample selection strategies to the oracle's feedback and the learning environment. Extensive experiments on three image classification benchmarks show superior performance against several existing active learning strategies.
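The iterative select, annotate, and update cycle described in the abstract can be sketched as a generic pool-based loop. This is an illustration under stated assumptions, not the authors' implementation: the selection function here is a random placeholder standing in for any strategy (handcrafted or learned), model retraining is left as a comment, and all names (`active_learning_loop`, `random_select`, `oracle`) are hypothetical.

```python
import random
from typing import Callable, Dict, List

_rng = random.Random(0)  # fixed seed so the sketch is reproducible


def random_select(candidates: List[int], k: int) -> List[int]:
    """Placeholder strategy: choose k unlabeled indices uniformly at random."""
    return _rng.sample(candidates, min(k, len(candidates)))


def active_learning_loop(
    pool: List[int],
    oracle: Callable[[int], int],
    select: Callable[[List[int], int], List[int]],
    rounds: int,
    batch_size: int,
) -> Dict[int, int]:
    """Run a pool-based loop and return the labels acquired from the oracle."""
    labeled: Dict[int, int] = {}
    unlabeled = set(pool)
    for _ in range(rounds):
        if not unlabeled:
            break  # nothing left to label
        chosen = select(sorted(unlabeled), batch_size)
        for idx in chosen:
            labeled[idx] = oracle(idx)  # query the expert for this sample
            unlabeled.discard(idx)
        # A real system would retrain the classifier on `labeled` here.
    return labeled


# Toy usage: ten sample indices, with an oracle that labels by parity.
labels = active_learning_loop(
    pool=list(range(10)),
    oracle=lambda i: i % 2,
    select=random_select,
    rounds=3,
    batch_size=2,
)
```

The labeling budget in this sketch is simply `rounds * batch_size`; swapping `random_select` for another function with the same signature changes the strategy without touching the loop.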

Paper Structure (15 sections, 7 equations, 2 figures, 3 tables)

Figures (2)

  • Figure 1: The proposed Deep Reinforcement Active Learning framework.
  • Figure 2: t-SNE visualization of the two-dimensional distribution of the samples selected by different methods on the CIFAR-10 dataset. Each row shows the results of one method, and each column shows the samples selected in one iteration. Within each iteration, points of the same color share a label, and black points mark the selected samples.