Fast Fishing: Approximating BAIT for Efficient and Scalable Deep Active Image Classification
Denis Huseljic, Paul Hahn, Marek Herde, Lukas Rauch, Bernhard Sick
TL;DR
The paper tackles the high time and memory cost that BAIT incurs when computing the Fisher Information Matrix in deep active learning. It introduces two approximations: BAIT Exp, which truncates the expectation to the top-$c$ predicted classes, and BAIT Binary, which recasts the problem as binary classification via a Bernoulli-like likelihood using $\hat{p}=\max_y p_{\boldsymbol{\theta}}(y|x)$. The reported complexities are $O(c (K D)^2)$ time and $O(M D c K)$ space for Exp, and $O(D^2)$ time and $O(M D)$ space for Binary. Experiments across nine image datasets, including ImageNet, show that these methods closely match or surpass the original BAIT and outperform other state-of-the-art AL strategies, with BAIT Binary delivering the strongest gains on image tasks. An open-source toolbox accompanies the work to support adoption and replication.
Abstract
Deep active learning (AL) seeks to minimize the annotation costs for training deep neural networks. BAIT, a recently proposed AL strategy based on the Fisher Information, has demonstrated impressive performance across various datasets. However, BAIT's high computational and memory requirements hinder its application to large-scale classification tasks, so that current research neglects BAIT in its evaluations. This paper introduces two methods to enhance BAIT's computational efficiency and scalability. Notably, we significantly reduce its time complexity by approximating the Fisher Information. In particular, we adapt the original formulation by i) taking the expectation over the most probable classes only, and ii) constructing a binary classification task, which leads to an alternative likelihood for gradient computations. This allows the efficient use of BAIT on large-scale datasets, including ImageNet. Our unified and comprehensive evaluation across a variety of datasets demonstrates that our approximations achieve strong performance with considerably reduced time complexity. Furthermore, we provide an extensive open-source toolbox that implements recent state-of-the-art AL strategies, available at https://github.com/dhuseljic/dal-toolbox.
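To make the two approximations concrete, the following is a minimal NumPy sketch of a per-sample Fisher Information computation. It is not the authors' implementation and not the toolbox API. It assumes, as is common for BAIT-style methods, that gradients are taken only with respect to the weights of a final linear softmax layer with $K$ classes and $D$ features. The binary variant is assumed here to take the logistic-regression form $\hat{p}(1-\hat{p})\,hh^\top$, which is one plausible reading of the Bernoulli-like likelihood. All function names are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    shifted = logits - logits.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


def class_gradient(probs, feats, y):
    """Gradient of log p(y|x) w.r.t. the last-layer weights (K x D), flattened.

    Assumes a linear softmax layer, so the gradient is (e_y - p) outer h.
    """
    onehot = np.zeros_like(probs)
    onehot[y] = 1.0
    return np.outer(onehot - probs, feats).ravel()


def fisher_full(probs, feats):
    """Full expectation over all K classes: sum_y p_y g_y g_y^T, shape (KD, KD)."""
    num_classes = probs.shape[0]
    grads = np.stack([class_gradient(probs, feats, y) for y in range(num_classes)])
    return (grads * probs[:, None]).T @ grads


def fisher_top_c(probs, feats, c):
    """Truncated expectation over the c most probable classes only (unnormalized)."""
    top = np.argsort(probs)[::-1][:c]
    grads = np.stack([class_gradient(probs, feats, y) for y in top])
    return (grads * probs[top][:, None]).T @ grads


def fisher_binary(probs, feats):
    """Binary variant (assumed logistic form) with p_hat = max_y p(y|x), shape (D, D)."""
    p_hat = probs.max()
    return p_hat * (1.0 - p_hat) * np.outer(feats, feats)


# Illustrative usage with random features and logits (hypothetical sizes).
rng = np.random.default_rng(0)
K, D = 5, 3
feats = rng.normal(size=D)
probs = softmax(rng.normal(size=K))

full = fisher_full(probs, feats)            # (K*D, K*D)
approx = fisher_top_c(probs, feats, c=2)    # (K*D, K*D), built from 2 gradients
binary = fisher_binary(probs, feats)        # (D, D)
```

In this sketch the truncated variant builds the matrix from `c` gradient outer products instead of `K`, while the binary variant works with a `D x D` matrix rather than a `KD x KD` one, which is where the savings in the stated complexities would come from.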
