Fast Fishing: Approximating BAIT for Efficient and Scalable Deep Active Image Classification

Denis Huseljic; Paul Hahn; Marek Herde; Lukas Rauch; Bernhard Sick

TL;DR

The paper addresses BAIT's high time and memory costs, which stem from Fisher Information Matrix computations in deep active learning. It introduces two approximations: BAIT Exp, which truncates the expectation to the top-$c$ predicted classes, and BAIT Binary, which recasts the problem as binary classification via a Bernoulli-like likelihood using $\hat{p}=\max_y p_{\boldsymbol{\theta}}(y|x)$. The reported complexities are $O(c (K D)^2)$ time and $O(M D c K)$ space for Exp, and $O(D^2)$ time and $O(M D)$ space for Binary. Experiments across nine image datasets, including ImageNet, show that these methods closely match or surpass the original BAIT and outperform other state-of-the-art AL strategies, with BAIT Binary delivering the strongest gains on image tasks. An open-source toolbox accompanies the work to support adoption and replication.

Abstract

Deep active learning (AL) seeks to minimize the annotation costs for training deep neural networks. BAIT, a recently proposed AL strategy based on the Fisher Information, has demonstrated impressive performance across various datasets. However, BAIT's high computational and memory requirements hinder its applicability to large-scale classification tasks, and as a result current research neglects BAIT in its evaluations. This paper introduces two methods to enhance BAIT's computational efficiency and scalability. Notably, we significantly reduce its time complexity by approximating the Fisher Information. In particular, we adapt the original formulation by i) taking the expectation over the most probable classes, and ii) constructing a binary classification task, leading to an alternative likelihood for gradient computations. This allows the efficient use of BAIT on large-scale datasets, including ImageNet. Our unified and comprehensive evaluation across a variety of datasets demonstrates that our approximations achieve strong performance with considerably reduced time complexity. Furthermore, we provide an extensive open-source toolbox that implements recent state-of-the-art AL strategies, available at https://github.com/dhuseljic/dal-toolbox.
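To make the two approximations concrete, the following is a minimal NumPy sketch of a per-sample Fisher Information estimate. It is an illustration under stated assumptions, not the authors' implementation and not the dal-toolbox API. It assumes the Fisher Information is taken only over the last linear layer of a softmax classifier (weights of shape $K \times D$ applied to a feature vector `h`), that the truncated expectation keeps the original class probabilities without renormalizing, and that the binary variant uses a logistic parameterization with $\hat{p}$ as the success probability. The function names `fisher_top_c` and `fisher_binary` are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def fisher_top_c(h, p, c):
    """Last-layer Fisher Information with the expectation over labels
    truncated to the c most probable classes.

    h: feature vector, shape (D,)
    p: predicted class probabilities, shape (K,)
    Returns a (K*D, K*D) matrix. With c == K this is the exact
    last-layer Fisher Information of the softmax model.
    Assumption: probabilities are NOT renormalized after truncation.
    """
    K, D = p.shape[0], h.shape[0]
    top = np.argsort(p)[::-1][:c]  # indices of the c largest probabilities
    fisher = np.zeros((K * D, K * D))
    for y in top:
        e_y = np.zeros(K)
        e_y[y] = 1.0
        # Gradient of log p(y|x) w.r.t. the row-major flattened weights.
        g = np.kron(e_y - p, h)
        fisher += p[y] * np.outer(g, g)
    return fisher


def fisher_binary(h, p):
    """Binary surrogate: treat p_hat = max_y p(y|x) as a Bernoulli
    success probability.

    Assumption: logistic parameterization with a single weight vector
    of size D, which gives p_hat * (1 - p_hat) * h h^T, shape (D, D).
    """
    p_hat = p.max()
    return p_hat * (1.0 - p_hat) * np.outer(h, h)


# Illustrative usage with K = 4 classes and D = 3 features.
h = np.array([1.0, -2.0, 0.5])
p = softmax(np.array([2.0, 0.5, -1.0, 0.0]))
print(fisher_top_c(h, p, c=2).shape)  # (12, 12)
print(fisher_binary(h, p).shape)      # (3, 3)
```

The sketch shows where the savings come from: the top-$c$ variant sums $c$ outer products instead of $K$, and the binary variant replaces the $KD \times KD$ matrix with a $D \times D$ one.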

Paper Structure (17 sections, 7 equations, 6 figures, 6 tables)

Introduction
Related Work
Notation
Time and Space Complexity of BAIT
Approximations
Expectation
Gradient
Experimental Results
Setup
Assessment of Approximations
Benchmark Experiments
Conclusion
Experimental Setup
Dataset Description
Diagonal Approximation of the Fisher Information Matrix
...and 2 more sections

Figures (6)

Figure 1: Comparison of different AL strategies on CIFAR-10.
Figure 2: Accuracy improvement curves of BAIT and its approximations.
Figure 3: Accuracy improvement curves of state-of-the-art strategies.
Figure 4: Accuracy improvement curves of state-of-the-art strategies.
Figure 5: Accuracy improvement curves of state-of-the-art strategies.
...and 1 more figure
