Batch Selection and Communication for Active Learning with Edge Labeling
Victor Croisfelt, Shashi Raj Pandey, Osvaldo Simeone, Petar Popovski
TL;DR
This work addresses how a learner can efficiently acquire informative labels from a teacher over a constrained channel. It integrates Bayesian active learning with a linear mix-up batch-encoding mechanism to form Communication-Constrained Bayesian Active Knowledge Distillation (CC-BAKD), a compression-aware active distillation protocol that selects batches of inputs based on epistemic uncertainty and transmits them in compressed form, receiving soft labels from the teacher in return. The approach combines BatchBALD-based batch selection with variational inference for soft-label updates, and uses compression-aware acquisition and two loss schemes to mitigate the distortion introduced by compression. Empirical results on MNIST show that the protocol needs fewer communication rounds than existing active learning protocols while maintaining competitive test accuracy, and that it is robust to quantization noise, highlighting practical benefits for edge learning with limited bandwidth.
Abstract
Conventional retransmission (ARQ) protocols are designed to ensure that each of the transmitter's packets is received correctly at the receiver. When the transmitter is a learner communicating with a teacher, this goal is at odds with the learner's actual aim, which is to elicit the most relevant label information from the teacher. Taking an active learning perspective, this paper addresses the following key protocol design questions: (i) Active batch selection: Which batch of inputs should be sent to the teacher to acquire the most useful information and thus reduce the number of required communication rounds? (ii) Batch encoding: Can batches of data points be combined to reduce the communication resources required at each communication round? Specifically, this work introduces Communication-Constrained Bayesian Active Knowledge Distillation (CC-BAKD), a novel protocol that integrates Bayesian active learning with compression via a linear mix-up mechanism. Comparisons with existing active learning protocols demonstrate the advantages of the proposed approach.
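
The two design questions above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every function name in it (`bald_scores`, `select_batch`, `mixup_encode`) is hypothetical. For batch selection it ranks inputs independently by a BALD-style mutual-information score, which is a simplification of BatchBALD, since BatchBALD scores a batch jointly. For batch encoding it assumes that the linear mix-up takes random convex combinations of the selected inputs; the paper's actual mixing scheme may differ.

```python
import math
import random


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def bald_scores(probs):
    """Epistemic-uncertainty score per input.

    probs[k][n] is the predicted class distribution for input n under the
    k-th sample of the model posterior. The score is the mutual information
    H[mean prediction] - mean[H[prediction]].
    """
    num_samples = len(probs)
    num_inputs = len(probs[0])
    scores = []
    for n in range(num_inputs):
        num_classes = len(probs[0][n])
        mean_p = [
            sum(probs[k][n][c] for k in range(num_samples)) / num_samples
            for c in range(num_classes)
        ]
        expected_h = sum(entropy(probs[k][n]) for k in range(num_samples)) / num_samples
        scores.append(entropy(mean_p) - expected_h)
    return scores


def select_batch(scores, batch_size):
    """Indices of the batch_size highest-scoring inputs (independent top-B,
    an assumption standing in for joint BatchBALD selection)."""
    order = sorted(range(len(scores)), key=lambda n: scores[n], reverse=True)
    return order[:batch_size]


def mixup_encode(inputs, num_mixed, rng):
    """Compress len(inputs) vectors into num_mixed convex combinations.

    Returns the mixed vectors and the mixing weights (assumed random here).
    """
    batch_size = len(inputs)
    dim = len(inputs[0])
    weights = []
    for _ in range(num_mixed):
        raw = [rng.random() + 1e-9 for _ in range(batch_size)]
        total = sum(raw)
        weights.append([w / total for w in raw])
    mixed = [
        [sum(weights[r][b] * inputs[b][d] for b in range(batch_size)) for d in range(dim)]
        for r in range(num_mixed)
    ]
    return mixed, weights
```

In this sketch, an input on which the posterior samples disagree receives a high score even if its averaged prediction looks the same as that of an input that is merely noisy, which is the sense in which selection targets epistemic rather than total uncertainty. Sending `num_mixed` vectors instead of the full batch reduces the per-round payload in proportion to `num_mixed / batch_size`, at the cost of distortion that the learner must account for.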
