Querying Easily Flip-flopped Samples for Deep Active Learning

Seong Jin Cho; Gwangsu Kim; Junghyun Lee; Jinwoo Shin; Chang D. Yoo

TL;DR

This work introduces the Least Disagree Metric (LDM), a theoretically grounded, perturbation-based measure of a sample's proximity to the decision boundary in multiclass deep models, along with an asymptotically consistent estimator $L_{N,M}$. Building on LDM, the authors propose LDM-S, an active learning method that combines small-LDM sampling with diversity via LDM-Seeding (a k-means++-style seeding using last-layer cosine distance). Empirical evaluations on OpenML datasets and image datasets (MNIST, CIFAR10, SVHN, CIFAR100, Tiny ImageNet, and FOOD101) show that LDM-S achieves state-of-the-art performance with competitive runtime, and analyses highlight the importance of batch diversity for robust performance. The work points to two future directions: rigorous sample complexity guarantees and scalable posterior-based sampling frameworks.
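The seeding idea summarized above can be sketched in a few lines. This is a minimal illustration of a k-means++-style selection with cosine distance, not the authors' exact procedure: the function names, the exponential weighting of LDM values, and the choice of the first seed are all assumptions made for the example.

```python
import numpy as np

def cosine_distance(F, f):
    """1 - cosine similarity between each row of F and the vector f."""
    F_norm = np.linalg.norm(F, axis=1)
    f_norm = np.linalg.norm(f)
    return 1.0 - (F @ f) / (F_norm * f_norm + 1e-12)

def ldm_seeding(features, ldm, batch_size, seed=0):
    """Pick a diverse batch that favors small-LDM samples (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Hypothetical weighting: a smaller LDM gives a larger sampling weight.
    weights = np.exp(-ldm / (np.median(ldm) + 1e-12))
    # Assumption: start from the sample with the smallest LDM.
    first = int(np.argmin(ldm))
    selected = [first]
    min_dist = cosine_distance(features, features[first])
    for _ in range(batch_size - 1):
        # k-means++-style score: weight times squared distance to the nearest pick.
        scores = weights * min_dist ** 2
        scores[selected] = 0.0  # never pick the same sample twice
        total = scores.sum()
        if total == 0.0:
            break  # no remaining candidate is distinguishable from the picks
        nxt = int(rng.choice(len(features), p=scores / total))
        selected.append(nxt)
        min_dist = np.minimum(min_dist, cosine_distance(features, features[nxt]))
    return selected
```

Here `features` stands for last-layer representations of the unlabeled pool and `ldm` for per-sample LDM estimates. Multiplying the weight by the squared distance is what trades off uncertainty against diversity in this sketch.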

Abstract

Active learning is a machine learning paradigm that aims to improve the performance of a model by strategically selecting and querying unlabeled data. One effective selection strategy is to base it on the model's predictive uncertainty, which can be interpreted as a measure of how informative a sample is. The sample's distance to the decision boundary is a natural measure of predictive uncertainty, but it is often intractable to compute, especially for complex decision boundaries formed in multiclass classification tasks. To address this issue, this paper proposes the least disagree metric (LDM), defined as the smallest probability of disagreement of the predicted label, and an estimator for LDM proven to be asymptotically consistent under mild assumptions. The estimator is computationally efficient and can be easily implemented for deep learning models using parameter perturbation. LDM-based active learning is performed by querying unlabeled data with the smallest LDM. Experimental results show that our LDM-based active learning algorithm obtains state-of-the-art overall performance on all considered datasets and deep architectures.
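The estimator described in the abstract can be illustrated with a small Monte Carlo sketch. It assumes a linear classifier whose weight matrix is perturbed with Gaussian noise at a few scales. The names `estimate_ldm`, `sigmas`, and `n_per_sigma` are hypothetical, and the noise schedule is a choice made for this example rather than the paper's setting.

```python
import numpy as np

def predict(W, X):
    """Predicted labels of a linear classifier with weights W (d x C)."""
    return np.argmax(X @ W, axis=1)

def estimate_ldm(W, X_query, X_pool, sigmas=(0.01, 0.03, 0.1, 0.3, 1.0),
                 n_per_sigma=50, seed=0):
    """Estimate LDM for each query sample by perturbing the parameters.

    For every perturbed hypothesis h, rho is the fraction of pool samples
    on which h and the base model g disagree. The estimate for a query x is
    the smallest rho among the hypotheses that flip the label of x.
    """
    rng = np.random.default_rng(seed)
    g_query = predict(W, X_query)
    g_pool = predict(W, X_pool)
    ldm = np.ones(len(X_query))  # 1.0 means no flipping hypothesis was found
    for sigma in sigmas:
        for _ in range(n_per_sigma):
            W_h = W + sigma * rng.standard_normal(W.shape)  # perturbed hypothesis
            rho = np.mean(predict(W_h, X_pool) != g_pool)   # empirical disagreement
            flipped = predict(W_h, X_query) != g_query
            ldm[flipped] = np.minimum(ldm[flipped], rho)
    return ldm
```

Samples near the decision boundary flip under small perturbations, which also produce small disagreement rates, so they receive small estimates. Querying the samples with the smallest values then matches the selection rule stated in the abstract.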

Paper Structure (47 sections, 7 theorems, 40 equations, 18 figures, 4 tables, 2 algorithms)

Introduction
Least Disagree Metric (LDM)
Definition of LDM
An Asymptotically Consistent Estimator of LDM
Empirical Evaluation of LDM
Motivation.
Algorithm Details.
Small $s$ is Sufficient.
LDM-based Active Learning
LDM-Seeding
LDM-S: Active Learning with LDM-Seeding
Experiments
Effectiveness of LDM in Selecting (Batched) Uncertain Samples
Necessity of Pursuing Diversity in LDM-S
Comparing LDM-S to Baseline Algorithms
...and 32 more sections

Key Result

Theorem 1

Let $g \in {\mathcal{H}}$, ${\bm{x}}_0 \in {\mathcal{X}}$, and $\delta > 0$ be arbitrary. Under Assumptions H, Lipschitz, and coverage, with $M > \frac{8}{\delta^2}\log(C N)$, we have that for any $\varepsilon \in (0, 1)$, [...]. Furthermore, as $\min(M, N) \rightarrow \infty$ with [...]. (Footnote: for the asymptotic analyses, we write $f(n) = \omega(g(n))$ if $\lim_{n \rightarrow \infty}$ [...].)

Figures (18)

Figure 1: An example of LDM of ${\bm{x}}_0$ for given $g$ in binary classification with the linear classifier. Here ${\bm{x}}$ is uniformly distributed on $\mathcal{X} \subset \mathbb{R}^2$. The classifier $h_{\theta}$ disagrees with $g$ on ${\bm{x}}_0$ when $\theta < -\pi + \theta_0$ or $\theta_0 < \theta$, thus $L(g, {\bm{x}}_0) = \inf_{h_{\theta} \in \mathcal{H}^{g, {\bm{x}}_0}} \rho(h_{\theta}, g) = \frac{|\theta_0|}{\pi}$.
Figure 2: The comparison of selecting sample(s). The black crosses and circles are labeled, and the gray dots are unlabeled samples. (a) Selected samples by LDM-based, entropy-based, and random sampling in binary classification with the linear classifier. (b) The test accuracy with respect to the number of labeled samples. (c) The t-SNE plot of selected batch samples in 3-class classification with a deep network on MNIST dataset.
Figure 3: The improved test accuracy by labeling the $k$th batch of size $q$ from pool data sorted in ascending order of LDM when the number of labeled samples is $100$ (a) or $300$ (d), and t-SNE plots of the first and eighth batches for each case (b-c, e-f) on MNIST.
Figure 4: The performance comparison across datasets. (a) Dolan-Moré plot among the algorithms across all experiments. AUC is the area under the curve. (b) The pairwise penalty matrix over all experiments. Element $P_{i, j}$ corresponds roughly to the number of times algorithm $i$ outperforms algorithm $j$. Column-wise averages at the bottom show overall performance (lower is better).
Figure 5: Examples of negative Spearman's rank correlation between LDM order and uncertainty order on MNIST (a), CIFAR10 (b), SVHN (c), CIFAR100 (d), Tiny ImageNet (e), and FOOD101 (f).
...and 13 more figures
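
The closed form in the Figure 1 caption can be checked with a short derivation. This is a sketch under assumptions the caption does not state explicitly: $\mathcal{X}$ is a disc centered at the origin, every $h_{\theta}$ is a line through the origin rotated by $\theta \in [-\pi, \pi]$ from $g = h_0$, and $0 < \theta_0 \le \pi/2$.

```latex
% Disagreement region of h_\theta and g = h_0: two opposite wedges of angle |\theta|
\rho(h_\theta, g)
  = \Pr_{x \sim \mathrm{Unif}(\mathcal{X})}\left[ h_\theta(x) \neq g(x) \right]
  = \frac{2|\theta|}{2\pi} = \frac{|\theta|}{\pi}, \qquad |\theta| \le \pi.

% Hypotheses that flip the label of x_0 (taken from the caption)
\mathcal{H}^{g, x_0} = \{\, h_\theta : \theta > \theta_0 \ \text{or}\ \theta < -\pi + \theta_0 \,\}.

% Infimum over the two branches
L(g, x_0)
  = \min\!\left( \inf_{\theta > \theta_0} \frac{\theta}{\pi},\;
                 \inf_{\theta < -\pi + \theta_0} \frac{|\theta|}{\pi} \right)
  = \min\!\left( \frac{\theta_0}{\pi},\; \frac{\pi - \theta_0}{\pi} \right)
  = \frac{\theta_0}{\pi}.
```

For instance, $\theta_0 = \pi/6$ gives $L(g, x_0) = 1/6$: the closest hypothesis that flips the label of $x_0$ disagrees with $g$ on one sixth of the disc.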

Theorems & Definitions (14)

Definition 1
Theorem 1
Corollary 1
Remark 1
Remark 2
Lemma 1: Theorem 4.2 of Wainwright (2019)
Lemma 2
proof
Proposition 1
proof: Proof of Proposition 1
...and 4 more
