Active Learning Using Aggregated Acquisition Functions: Accuracy and Sustainability Analysis

Cédric Jung; Shirin Salehi; Anke Schmeink

TL;DR

This work tackles data and energy efficiency in active learning by evaluating state-of-the-art acquisition functions and introducing six aggregation structures to address the exploration-exploitation trade-off. The series, parallel, hybrid, adaptive feedback, annealing exploration, and random exploration structures are evaluated empirically on CIFAR and PTB-XL, showing that carefully designed aggregations can reduce labeling and computation while maintaining or improving accuracy. Notably, running $K$-Centers followed by BALD in series achieves up to a 12% reduction in labeled samples and nearly a 50% decrease in acquisition cost, while alternating strategies like BALD-BADGE offer robust gains across domains. The findings highlight the potential of energy-aware AL and provide practical guidelines for combining acquisition functions to balance accuracy and sustainability in AI.

Abstract

Active learning (AL) is a machine learning (ML) approach that strategically selects the most informative samples for annotation during training, aiming to minimize annotation costs. This strategy not only reduces labeling expenses but also saves energy during neural network training, thereby enhancing both data and energy efficiency. In this paper, we implement and evaluate various state-of-the-art acquisition functions, analyzing their accuracy and computational costs and discussing the advantages and disadvantages of each method. Our findings reveal that representativity-based acquisition functions effectively explore the dataset but do not prioritize boundary decisions, whereas uncertainty-based acquisition functions focus on refining boundary decisions already identified by the neural network. This trade-off is known as the exploration-exploitation dilemma. To address this dilemma, we introduce six aggregation structures: series, parallel, hybrid, adaptive feedback, random exploration, and annealing exploration. Our aggregated acquisition functions alleviate common AL pathologies such as batch mode inefficiency and the cold start problem. Additionally, we focus on balancing accuracy and energy consumption, contributing to the development of more sustainable, energy-aware artificial intelligence (AI). We evaluate our proposed structures on various models and datasets. Our results demonstrate the potential of these structures to reduce computational costs while maintaining or even improving accuracy. Aggregation approaches that alternate between acquisition functions, such as BALD and BADGE, have shown robust results. Running functions sequentially, such as $K$-Centers followed by BALD, achieves the same performance goals with up to 12% fewer samples while reducing the acquisition cost by almost half.
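As a rough illustration of the series idea mentioned in the abstract, the sketch below chains a greedy $K$-Centers pre-selection with BALD scoring in NumPy. It is not the authors' implementation: the function names, the `prefilter_factor` parameter, and the choice to let $K$-Centers shortlist candidates before BALD picks the final batch are all assumptions made for this example.

```python
import numpy as np


def greedy_k_centers(features, k, seed=0):
    """Greedy farthest-point selection; returns k row indices of `features`."""
    rng = np.random.default_rng(seed)
    first = int(rng.integers(features.shape[0]))
    selected = [first]
    # Distance from every point to its nearest already-selected center.
    min_dist = np.linalg.norm(features - features[first], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(features - features[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return np.array(selected)


def bald_scores(probs):
    """BALD mutual information from stochastic forward passes.

    probs has shape (T, N, C): T passes (e.g. MC dropout), N samples, C classes.
    Score = entropy of the mean prediction minus the mean per-pass entropy.
    """
    eps = 1e-12
    mean_p = probs.mean(axis=0)
    entropy_of_mean = -(mean_p * np.log(mean_p + eps)).sum(axis=1)
    mean_entropy = -(probs * np.log(probs + eps)).sum(axis=2).mean(axis=0)
    return entropy_of_mean - mean_entropy


def series_select(features, probs, budget, prefilter_factor=4):
    """Hypothetical series structure: K-Centers shortlist, then BALD top-`budget`."""
    n_candidates = min(features.shape[0], prefilter_factor * budget)
    candidates = greedy_k_centers(features, n_candidates)
    # Assumption: in a real pipeline the stochastic passes would be run only
    # on the shortlisted candidates; here precomputed outputs are sliced.
    scores = bald_scores(probs[:, candidates, :])
    top = np.argsort(scores)[::-1][:budget]
    return candidates[top]
```

With `features` of shape `(N, D)` and `probs` of shape `(T, N, C)`, calling `series_select(features, probs, budget=10)` returns 10 indices into the unlabeled pool. Under this reading of the series structure, the representativity step handles exploration and the uncertainty step refines the choice within the shortlist.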

Paper Structure (39 sections, 25 equations, 13 figures, 2 tables, 2 algorithms)

Introduction
Active Learning
Problem settings
Batch mode active learning (BMAL)
Acquisition functions: an overview
Uncertainty sampling
BALD
Greedy $K$-Centers
BADGE
Submodularity-based acquisition function
Facility location
Disparity min
Related works
Proposed structures for aggregated acquisition functions
Parallel and parallel-ranked structures
...and 24 more sections

Figures (13)

Figure 1: Illustration of the selected samples and the current decision boundary on the grid toy dataset at acquisition step 20 with a budget $b = 10$, using a simple fully connected network (FCN) with the following structure: dense-relu-dropout-dense-relu-dense-softmax.
Figure 2: Schema of parallel-ranked combination of two acquisition functions.
Figure 3: Schema of the parallel combination of two acquisition functions. By "Concat" we refer to the concatenation of the elements of the selected batches into a single batch.
Figure 4: Annealing exploration structure.
Figure 5: Heatmap of the pairwise winning rates of the baseline acquisition functions on CIFAR10 with VGG. The last column shows the row average (higher is better).
...and 8 more figures
