One-shot Active Learning Based on Lewis Weight Sampling for Multiple Deep Models

Sheng-Jun Huang, Yi Li, Yiming Sun, Ying-Peng Tang

TL;DR

This paper proposes a one-shot AL method that performs all label queries without repeated model training and achieves performance competitive with state-of-the-art AL methods for multiple target models.

Abstract

Active learning (AL) for multiple target models aims to reduce the number of label queries while effectively training multiple models concurrently. Existing AL algorithms often rely on iterative model training, which can be computationally expensive, particularly for deep models. In this paper, we propose a one-shot AL method to address this challenge, which performs all label queries without repeated model training. Specifically, we extract different representations of the same dataset using distinct network backbones, and actively learn the linear prediction layer on each representation via an $\ell_p$-regression formulation. The regression problems are solved approximately by sampling and reweighting the unlabeled instances based on their maximum Lewis weights across the representations. We provide an upper bound on the number of samples needed, with a rigorous analysis for $p\in [1, +\infty)$. Experimental results on 11 benchmarks show that our one-shot approach achieves performance competitive with state-of-the-art AL methods for multiple target models.
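The sampling step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`lewis_weights`, `one_shot_sample`, `fit_heads_l2`), the `budget` parameter, and the choice of sampling probabilities $\min(1, \text{budget}\cdot w_i^{\max}/T)$ are assumptions made for the example. The Lewis weights are approximated with the fixed-point iteration of Cohen and Peng, which is known to converge for $p<4$; the final fit is shown only for $p=2$, where the reweighted problem is ordinary least squares.

```python
import numpy as np

def lewis_weights(A, p, n_iter=50):
    """Approximate l_p Lewis weights of the rows of A (n x d, full column rank).

    Fixed-point iteration w_i <- (a_i^T (A^T W^{1-2/p} A)^{-1} a_i)^{p/2},
    a contraction for p < 4; other p need a different solver.
    """
    n, d = A.shape
    w = np.full(n, d / n)
    for _ in range(n_iter):
        B = A * (w ** (0.5 - 1.0 / p))[:, None]      # rows scaled by w^{1/2 - 1/p}
        G_inv = np.linalg.inv(B.T @ B)               # (A^T W^{1-2/p} A)^{-1}
        q = np.einsum("ij,jk,ik->i", A, G_inv, A)    # a_i^T G^{-1} a_i
        w = np.maximum(q, 1e-12) ** (p / 2.0)        # clamp avoids 0 ** negative
    return w

def one_shot_sample(reps, p, budget, rng):
    """Pick instances to label once, using the max Lewis weight over representations.

    reps: list of k arrays, each n x d (one per backbone; hypothetical input).
    Returns the chosen indices, their reweighting factors, and T.
    """
    W = np.stack([lewis_weights(A, p) for A in reps])  # shape (k, n)
    w_max = W.max(axis=0)
    T = w_max.sum()
    probs = np.minimum(1.0, budget * w_max / T)        # assumed sampling rule
    idx = np.flatnonzero(rng.random(len(probs)) < probs)
    scale = probs[idx] ** (-1.0 / p)                   # row i reweighted by p_i^{-1/p}
    return idx, scale, T

def fit_heads_l2(reps, y_queried, idx, scale):
    """p = 2 only: solve one reweighted least-squares problem per representation."""
    return [
        np.linalg.lstsq(A[idx] * scale[:, None], y_queried * scale, rcond=None)[0]
        for A in reps
    ]
```

All labels in `y_queried` are requested in a single round, after `one_shot_sample` returns; no model is retrained between queries, and the same queried set is shared by every representation.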

Paper Structure

This paper contains 26 sections, 11 theorems, 53 equations, 4 figures, 6 tables, 1 algorithm.

Key Result

Theorem 1.1

Let $w_1(A^j),\dots,w_n(A^j)$ denote the Lewis weights of $A^j$ and $T = \sum_{i=1}^{n} \max_{j\in [k]} w_i(A^j)$. Suppose that $T = \operatorname{poly}(d)$. There exists a randomized algorithm which samples unlabeled instances and outputs solutions $\tilde{\boldsymbol{\theta}}^1, \dots, \tilde{\boldsymbol{\theta}}^k \in \mathbb{R}^d$ such that \eqref{eq:probobj} holds for $p\geq 1$ and all $j\in[k]$ with p
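
For intuition about the size of $T$, a short worked bound may help. It assumes each $A^j \in \mathbb{R}^{n\times d}$ has full column rank (an assumption; the excerpt above does not state the dimensions), in which case the nonnegative Lewis weights of each matrix sum to $d$:

$$
d \;=\; \sum_{i=1}^{n} w_i(A^1) \;\le\; T \;=\; \sum_{i=1}^{n} \max_{j\in[k]} w_i(A^j) \;\le\; \sum_{i=1}^{n} \sum_{j=1}^{k} w_i(A^j) \;=\; kd .
$$

So $T$ equals $d$ when all $k$ representations assign identical weights to every instance, and approaches $kd$ only when the representations concentrate their weight on disjoint instances.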

Figures (4)

  • Figure 1: The trends of the sum of the maximum Lewis weights with $p=2$ among multiple representations as the number of deep models increases.
  • Figure 2: Performance comparison on classification datasets. The error bars indicate the standard deviation of the performance across the multiple models.
  • Figure 3: Performance comparison on regression datasets with different query budgets.
  • Figure 4: The mean percentage of shared data between instances having the highest maximum leverage score and those having the highest leverage score under the representation of a specific deep model.

Theorems & Definitions (19)

  • Theorem 1.1: Informal version of Corollary \ref{crl:main}
  • Definition 3.1: $\ell_p$ Lewis Weights
  • Definition 3.2: $\ell_p$ Subspace Embedding
  • Definition 3.3: Sampling Matrix
  • Lemma 3.4: Constant-factor Subspace Embedding, CP2015
  • Lemma 3.5: Lemmata 5.3 and 5.4 of CP2015
  • Theorem 3.6
  • Corollary 3.7
  • proof
  • Lemma 3.9
  • ...and 9 more
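
For reference, the standard forms of the first two definitions listed above are sketched here, assuming CP2015 denotes Cohen and Peng's 2015 work on Lewis weight row sampling; the paper's own Definitions 3.1 and 3.2 may differ in notation. With $a_i$ the $i$-th row of $A \in \mathbb{R}^{n\times d}$, $W = \operatorname{diag}(w_1,\dots,w_n)$, and $\tau_i(\cdot)$ the $i$-th leverage score, the $\ell_p$ Lewis weights are the unique weights satisfying

$$
w_i \;=\; \tau_i\!\left(W^{1/2-1/p} A\right)
\quad\Longleftrightarrow\quad
w_i^{2/p} \;=\; a_i^{\top}\left(A^{\top} W^{1-2/p} A\right)^{-1} a_i , \qquad i = 1,\dots,n .
$$

A matrix $S$ is an $\ell_p$ subspace embedding for $A$ with distortion $\varepsilon$ if

$$
(1-\varepsilon)\,\lVert A x\rVert_p \;\le\; \lVert S A x\rVert_p \;\le\; (1+\varepsilon)\,\lVert A x\rVert_p \qquad \text{for all } x \in \mathbb{R}^d .
$$

For $p=2$ the first condition reduces to $w_i = a_i^{\top}(A^{\top}A)^{-1}a_i$, the ordinary leverage scores.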