Batch Selection and Communication for Active Learning with Edge Labeling

Victor Croisfelt; Shashi Raj Pandey; Osvaldo Simeone; Petar Popovski

TL;DR

This work addresses how to efficiently acquire informative labels from a teacher over a constrained channel in a learner–teacher setup. It integrates Bayesian active learning with a linear mix-up batch-encoding mechanism to form CC-BAKD, a compression-aware active distillation protocol that selects batches based on epistemic uncertainty and transmits compressed batches with soft labels. The approach combines BatchBALD-based batch selection with variational inference for soft-label updates, and uses compression-aware acquisition and two loss schemes to mitigate distortion from transmission, achieving fewer communication rounds with competitive accuracy. Empirical results on MNIST show significant reductions in required communication while maintaining high test accuracy, and demonstrate robustness to quantization noise, highlighting practical benefits for edge learning with limited bandwidth.
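The uncertainty-driven selection described above can be illustrated with a minimal sketch. It scores each unlabeled input by the mutual information between its label and the model parameters (the BALD score), estimated from class probabilities predicted by several models sampled from an approximate posterior, and then keeps the highest-scoring inputs. This is a simplified top-k variant, not BatchBALD itself, which scores candidate batches jointly. The function names and the nested-list data layout are assumptions made for illustration and are not taken from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def bald_score(sample_probs):
    """Estimate epistemic uncertainty for one input.

    sample_probs[s][c] is the probability of class c predicted by the
    s-th model sampled from the approximate posterior (assumed layout).
    """
    num_samples = len(sample_probs)
    num_classes = len(sample_probs[0])
    mean_probs = [
        sum(p[c] for p in sample_probs) / num_samples for c in range(num_classes)
    ]
    total_uncertainty = entropy(mean_probs)
    expected_entropy = sum(entropy(p) for p in sample_probs) / num_samples
    # Mutual information = total uncertainty minus average per-model entropy.
    return total_uncertainty - expected_entropy


def select_batch(pool_probs, batch_size):
    """Return indices of the batch_size inputs with the highest BALD score."""
    scores = [bald_score(p) for p in pool_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]


# Toy pool with two classes and two posterior samples per input.
pool = [
    [[0.99, 0.01], [0.99, 0.01]],  # models agree and are confident
    [[0.90, 0.10], [0.10, 0.90]],  # models disagree: high epistemic uncertainty
    [[0.50, 0.50], [0.50, 0.50]],  # models agree that the label is ambiguous
]
chosen = select_batch(pool, batch_size=1)
```

In this toy pool only the second input is selected: the third input is just as hard to predict, but the sampled models agree on it, so querying the teacher would tell the learner little about its own parameters.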

Abstract

Conventional retransmission (ARQ) protocols are designed with the goal of ensuring the correct reception of all the individual transmitter's packets at the receiver. When the transmitter is a learner communicating with a teacher, this goal is at odds with the actual aim of the learner, which is that of eliciting the most relevant label information from the teacher. Taking an active learning perspective, this paper addresses the following key protocol design questions: (i) Active batch selection: Which batch of inputs should be sent to the teacher to acquire the most useful information and thus reduce the number of required communication rounds? (ii) Batch encoding: Can batches of data points be combined to reduce the communication resources required at each communication round? Specifically, this work introduces Communication-Constrained Bayesian Active Knowledge Distillation (CC-BAKD), a novel protocol that integrates Bayesian active learning with compression via a linear mix-up mechanism. Comparisons with existing active learning protocols demonstrate the advantages of the proposed approach.
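To make the batch-encoding question concrete, the following sketch shows one way a batch could be linearly combined before transmission and approximately recovered at the receiver. The abstract only states that a linear mix-up mechanism is used; the specific mixing matrix here (pairwise averaging of inputs, with orthonormal rows so that the transpose serves as the decoder) is a hypothetical choice for illustration, as are all function names.

```python
import math


def flatten(batch):
    """Stack a batch of inputs into a single vector."""
    return [value for x in batch for value in x]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def pair_mixing_matrix(batch_size, dim):
    """Hypothetical mixing matrix that blends inputs 2k and 2k+1 entrywise.

    Rows are orthonormal, and the number of transmitted symbols is half the
    size of the uncompressed batch.
    """
    assert batch_size % 2 == 0, "this toy matrix mixes inputs in pairs"
    scale = 1.0 / math.sqrt(2.0)
    rows = []
    for k in range(batch_size // 2):
        for d in range(dim):
            row = [0.0] * (batch_size * dim)
            row[(2 * k) * dim + d] = scale
            row[(2 * k + 1) * dim + d] = scale
            rows.append(row)
    return rows


def encode_batch(batch, mixing):
    """Learner side: compress the stacked batch into fewer symbols."""
    return matvec(mixing, flatten(batch))


def decode_batch(symbols, mixing, batch_size):
    """Teacher side: approximate reconstruction via the transpose."""
    flat = matvec(transpose(mixing), symbols)
    dim = len(flat) // batch_size
    return [flat[i * dim:(i + 1) * dim] for i in range(batch_size)]


batch = [[1.0, 2.0], [3.0, 4.0]]
mixing = pair_mixing_matrix(batch_size=2, dim=2)
symbols = encode_batch(batch, mixing)         # 2 symbols instead of 4
decoded = decode_batch(symbols, mixing, 2)    # both inputs decode to their average
```

With this toy matrix, both decoded inputs equal the average of the original pair, which illustrates the kind of distortion that a compression-aware selection rule and loss would need to account for.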

Paper Structure (19 sections, 19 equations, 4 figures, 2 algorithms)

Introduction
Active Batch Selection and Batch Encoding
Main Contributions
Setting
Learning Model
Communication Model
Bayesian Active Knowledge Distillation
Bayesian Active Knowledge Distillation
Bayesian Learning with Soft Labels
Communication-Constrained Bayesian Active Distillation
Batch Encoding
Batch Decoding and Teacher's Feedback
Learner's Model Update
Compression-Aware Active Batch Selection
Experiments
...and 4 more sections

Figures (4)

Figure 1: A learner communicates with a teacher over a constrained communication channel to obtain soft labels for batches of unlabeled inputs. This work aims to devise active batch selection strategies that use the available communication resources as efficiently as possible while reducing the communication cost of a batch through a batch encoding method.
Figure 2: Evolution of the learner's performance as a function of the number of communicated symbols $N$. The batch size is $B=4$ for all schemes. The red lines indicate values of the number of symbols $N$ required to transmit a single uncompressed input and a batch of $B=4$ uncompressed inputs, respectively.
Figure 3: Learner's final test accuracy as a function of the compression ratio, $R$, for a total number of symbols of $N=784$ and a batch size of $B=4$. For CC-BAKD, the top horizontal axis shows the corresponding number of communication rounds, $C$, from \ref{eq:new-al-steps}. For the two protocols that do not apply compression, the number of communication rounds is fixed to one with a batch size of one.
Figure 4: Learner's final performance as a function of the noise power for a batch size of $B=4$. For CC-BAKD with $R=0.99$, the number of transmitted symbols is $N=7840$, while for the other two protocols, the number of transmitted symbols is $N=78400$.
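The symbol counts quoted in these captions can be related to the number of communication rounds with simple arithmetic. The sketch below assumes that one uncompressed MNIST input costs 784 symbols (28 x 28 pixels, one symbol per pixel) and that a batch compressed with ratio $R$ costs a fraction $(1-R)$ of the uncompressed batch, rounded up. This cost model and the function names are assumptions for illustration and may differ from the paper's own expression for $C$.

```python
import math


def symbols_per_round(batch_size, input_dim, compression_ratio=0.0):
    """Symbols needed to send one batch under the assumed cost model."""
    if not 0.0 <= compression_ratio < 1.0:
        raise ValueError("compression_ratio must lie in [0, 1)")
    return math.ceil((1.0 - compression_ratio) * batch_size * input_dim)


def max_rounds(total_symbols, batch_size, input_dim, compression_ratio=0.0):
    """Number of full communication rounds that fit in the symbol budget."""
    cost = symbols_per_round(batch_size, input_dim, compression_ratio)
    return total_symbols // cost


MNIST_DIM = 28 * 28  # 784 pixels per image

single_input = symbols_per_round(1, MNIST_DIM)       # one uncompressed input
full_batch = symbols_per_round(4, MNIST_DIM)         # B = 4 uncompressed inputs
rounds_no_compression = max_rounds(784, 1, MNIST_DIM)
rounds_with_compression = max_rounds(3136, 4, MNIST_DIM, compression_ratio=0.5)
```

Under this assumed model, a budget of $N=784$ symbols allows exactly one round with a single uncompressed input, which matches the fixed single round mentioned in the caption of Figure 3.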
