Adaptive Active Learning for Regression via Reinforcement Learning

Simon D. Nguyen; Troy Russo; Kentaro Hoffman; Tyler H. McCormick

TL;DR

Experiments show Weighted improved Greedy Sampling (WiGS) outperforms iGS and other baseline methods in both accuracy and labeling efficiency, particularly in domains with irregular data density where the baseline's multiplicative rule ignores high-error samples in dense regions.

Abstract

Active learning for regression reduces labeling costs by selecting the most informative samples. Improved Greedy Sampling (iGS) is a prominent method that balances feature-space diversity and output-space uncertainty using a static, multiplicative rule. We propose Weighted improved Greedy Sampling (WiGS), which replaces this framework with a dynamic, additive criterion. We formulate weight selection as a reinforcement learning problem, enabling an agent to adapt the exploration-investigation balance throughout learning. Experiments on 18 benchmark datasets and a synthetic environment show that WiGS outperforms iGS and other baseline methods in both accuracy and labeling efficiency, particularly in domains with irregular data density, where the baseline's multiplicative rule ignores high-error samples in dense regions.
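To make the contrast between the two selection rules concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the multiplicative rule scores each candidate by the minimum, over labeled points, of the product of its feature-space and output-space distances, and that the additive rule has the form `w_x * s_x + (1 - w_x) * s_y` on min-max normalized distances. The function names, the normalization, the weight `w_x`, and the toy data are all illustrative assumptions.

```python
import numpy as np

def min_max(v):
    """Scale a vector to [0, 1]; a constant vector maps to all zeros."""
    span = v.max() - v.min()
    return (v - v.min()) / span if span > 0 else np.zeros_like(v, dtype=float)

def pairwise_distances(X_pool, y_pool_pred, X_lab, y_lab):
    # d_x[n, m]: feature-space distance between candidate n and labeled point m.
    d_x = np.linalg.norm(X_pool[:, None, :] - X_lab[None, :, :], axis=2)
    # d_y[n, m]: gap between the predicted label of n and the known label of m.
    d_y = np.abs(y_pool_pred[:, None] - y_lab[None, :])
    return d_x, d_y

def select_multiplicative(d_x, d_y):
    # Assumed iGS-style rule: maximize the smallest product of the two distances.
    return int(np.argmax((d_x * d_y).min(axis=1)))

def select_additive(d_x, d_y, w_x):
    # Assumed WiGS-style rule: weighted sum of normalized minimum distances.
    s_x = min_max(d_x.min(axis=1))
    s_y = min_max(d_y.min(axis=1))
    return int(np.argmax(w_x * s_x + (1.0 - w_x) * s_y))

# Toy pool: candidate 0 sits in a dense region (tiny feature distance) but has
# a large output gap; candidate 1 is moderately far in both spaces.
X_lab = np.array([[0.0]])
y_lab = np.array([0.0])
X_pool = np.array([[0.01], [0.5]])
y_pool_pred = np.array([5.0, 0.5])

d_x, d_y = pairwise_distances(X_pool, y_pool_pred, X_lab, y_lab)
pick_product = select_multiplicative(d_x, d_y)
pick_additive = select_additive(d_x, d_y, w_x=0.3)
print("multiplicative rule picks candidate", pick_product)
print("additive rule (w_x=0.3) picks candidate", pick_additive)
```

In this toy pool the product of the two distances is 0.05 for candidate 0 and 0.25 for candidate 1, so the multiplicative rule skips the high-gap candidate in the dense region, while the additive rule with a small `w_x` selects it. A larger `w_x` shifts the additive rule back toward the more distant candidate, which is the balance the abstract describes an agent adapting over time.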

Paper Structure (41 sections, 1 theorem, 14 equations, 21 figures, 5 tables, 1 algorithm)

Introduction
Greedy Sampling
Exploration (GSx) and Investigation (GSy)
Improved Greedy Sampling (iGS)
Weighted Improved Greedy Sampling (WiGS)
The WiGS Framework and Score
Theoretical Analysis: Density Veto
Weighting Strategies: Static and Time-Decay
Adaptive Weighting via Reinforcement Learning
Discretized Adaptation (WiGS-MAB)
Continuous Adaptation (WiGS-SAC)
The MDP Formulation
Optimization
Experimental Setup and Synthetic Data
Data Generating Process (DGP)
...and 26 more sections

Key Result

Proposition 3.1

Consider a candidate pool containing a "target" $x^*$ with high uncertainty $u^*$ and low diversity $d^*$ (high density), and a "distractor" $x'$ with lower uncertainty $u' < u^*$ but moderate diversity $d' > d^*$. As the feature density around $x^*$ increases ($d^* \to 0$), there exists a threshold
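The statement above is cut off in this text, so the threshold it refers to is not reproduced here. As a hedged numeric illustration of the setup only, the sketch below assumes a toy multiplicative score `d * u` and a toy additive score `w * d + (1 - w) * u`; these forms, the weight `w`, and all numeric values are assumptions chosen for illustration, not values from the paper.

```python
def product_score(d, u):
    # Toy multiplicative rule (assumed form): diversity times uncertainty.
    return d * u

def additive_score(d, u, w):
    # Toy additive rule (assumed form): weight w on diversity, 1 - w on uncertainty.
    return w * d + (1.0 - w) * u

def winner(target_score, distractor_score):
    return "target" if target_score > distractor_score else "distractor"

# Hypothetical values: the target has high uncertainty, the distractor has
# lower uncertainty but moderate diversity.
u_star, u_prime, d_prime = 0.9, 0.3, 0.5
w = 0.2

product_winners = {}
additive_winners = {}
for d_star in (0.4, 0.2, 0.1, 0.01):  # shrinking d_star mimics rising density
    product_winners[d_star] = winner(
        product_score(d_star, u_star), product_score(d_prime, u_prime)
    )
    additive_winners[d_star] = winner(
        additive_score(d_star, u_star, w), additive_score(d_prime, u_prime, w)
    )
    print(d_star, product_winners[d_star], additive_winners[d_star])
```

With these numbers the product score stops selecting the target once `d_star` becomes small enough, even though its uncertainty is three times that of the distractor, whereas the additive score with this particular `w` keeps selecting the target across the whole sweep.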

Figures (21)

Figure 1: Visualization of the synthetic dataset ($N=1000$). The high-noise region (trap) at $x \approx 0.85$ coincides with high data density.
Figure 2: Performance difference relative to iGS (red line). Values below zero indicate superior performance. The adaptive WiGS-SAC agent and exploration-focused static variants (bottom cluster) consistently outperform the baseline, reducing absolute error by up to 0.05.
Figure 3: Global Performance Heatmap. Values represent the ratio of the Area Under the RMSE Curve (AUC) for each method relative to the iGS baseline. Blue cells ($<1.0$) indicate superior performance (lower cumulative error), while red cells ($>1.0$) indicate inferior performance. Notably, the WiGS methods (bottom rows) demonstrate consistent robustness across diverse domains.
Figure 4: Relative Label Efficiency ($N_{rel}$) aggregated across 20 datasets. The distribution represents the labeling budget required to achieve $70\%$ (blue) and $80\%$ (green) of the total possible performance gain. Values to the left ($<1.0$) indicate superior efficiency. The adaptive WiGS agents (SAC, MAB) not only require fewer labels to reach these milestones than the iGS baseline, but also exhibit narrower variance, highlighting their robust generalization across different datasets.
Figure 5: Visualization of the synthetic datasets. Shows $N=1000$ sample data points (dots) and the true underlying function $f(x)$ (solid line). Shaded regions indicate areas of high noise that overlap with dense data regions from the GMM, creating strategic conflicts between exploration and investigation.
...and 16 more figures

Theorems & Definitions (2)

Proposition 3.1: The Density Veto
proof
