Unsupervised Active Learning via Natural Feature Progressive Framework

Yuxi Liu; Catherine Lalman; Yimin Yang

TL;DR

The paper tackles the data labeling bottleneck in deep learning by proposing the Natural Feature Progressive Framework (NFPF), an Unsupervised Active Learning (UAL) method. NFPF combines a Reconstruction Difference–based seed initialization with a lightweight Specific Feature Learning Machine (SFLM) that measures sample learnability through inter-model discrepancy, enabling progressive, one-shot subset selection without backpropagation across rounds. Empirical results across nine datasets show that NFPF outperforms existing UAL methods and approaches supervised AL on vision tasks, with substantial reductions in labeling and training steps (notably 7x–20x fewer steps on CIFAR-100). Ablation studies and visualizations support its robustness, distribution coverage, and informative sampling, pointing to its practical value for cost-efficient large-scale learning.

Abstract

The effectiveness of modern deep learning models is predicated on the availability of large-scale, human-annotated datasets, a process that is notoriously expensive and time-consuming. While Active Learning (AL) offers a strategic solution by labeling only the most informative and representative data, its iterative nature still necessitates significant human involvement. Unsupervised Active Learning (UAL) presents an alternative by shifting the annotation burden to a single, post-selection step. Unfortunately, prevailing UAL methods struggle to achieve state-of-the-art performance. These approaches typically rely on local, gradient-based scoring for sample importance estimation, which not only makes them vulnerable to ambiguous and noisy data but also hinders their capacity to select samples that adequately represent the full data distribution. Moreover, their use of shallow, one-shot linear selection falls short of a true UAL paradigm. In this paper, we propose the Natural Feature Progressive Framework (NFPF), a UAL method that revolutionizes how sample importance is measured. At its core, NFPF employs a Specific Feature Learning Machine (SFLM) to effectively quantify each sample's contribution to model performance. We further utilize the SFLM to define a powerful Reconstruction Difference metric for initial sample selection. Our comprehensive experiments show that NFPF significantly outperforms all established UAL methods and achieves performance on par with supervised AL methods on vision datasets. Detailed ablation studies and qualitative visualizations provide compelling evidence for NFPF's superior performance, enhanced robustness, and improved data distribution coverage.
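
As a rough illustration of the "inter-model discrepancy" idea mentioned in the TL;DR, the sketch below scores each unlabeled sample by how far apart two models' outputs are and picks a fixed budget of samples for annotation in a single pass. This is not the paper's SFLM or its actual scoring rule, which this summary does not specify. The function names, the use of Euclidean distance, and the choice to select the highest-discrepancy samples are all assumptions made for illustration only.

```python
import math


def discrepancy_scores(outputs_a, outputs_b):
    """Per-sample Euclidean distance between two models' outputs.

    Hypothetical scoring rule: the distance measure is an assumption,
    not the metric used by NFPF.
    """
    if len(outputs_a) != len(outputs_b):
        raise ValueError("both models must score the same number of samples")
    return [math.dist(a, b) for a, b in zip(outputs_a, outputs_b)]


def select_by_discrepancy(outputs_a, outputs_b, budget):
    """Return indices of the `budget` samples with the largest discrepancy.

    Assumption: a larger discrepancy marks a sample as more worth labeling.
    Ties keep their original order because Python's sort is stable.
    """
    scores = discrepancy_scores(outputs_a, outputs_b)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Toy outputs from two hypothetical models on three unlabeled samples.
model_a = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
model_b = [[0.0, 0.0], [1.0, 4.0], [2.0, 3.0]]
chosen = select_by_discrepancy(model_a, model_b, budget=2)
```

Here the selected indices would be sent for annotation once, after selection, which matches the single post-selection labeling step that UAL aims for; no labels are consulted while scoring.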
