AdaBet: Gradient-free Layer Selection for Efficient Training of Deep Neural Networks

Irene Tenison; Soumyajit Chatterjee; Fahim Kawsar; Mohammad Malekzadeh

TL;DR

AdaBet makes on-device retraining more practical by selecting which layers to retrain without backpropagation, labels, or server-side meta-training. It computes the first Betti number $b_1$ of each layer's activations using forward passes alone, normalizes it as $\hat{b}_1^i=b_1^i/|a^i|$ to estimate the layer's learning capacity, and selects a fraction $\rho$ of layers for retraining. Across 16 dataset–model pairs, AdaBet achieves an average accuracy gain of about 5% over gradient-based baselines while reducing peak memory by about 40%, with substantial per-epoch speedups. The approach offers a privacy-preserving, hardware-friendly route to efficient on-device adaptation.
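The normalization and selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-layer $b_1$ values have already been computed elsewhere, treats $|a^i|$ as the number of elements in layer $i$'s activation, ranks layers by descending $\hat{b}_1^i$, and rounds $\rho L$ up to a whole number of layers. The function name `select_layers` and the example numbers are hypothetical.

```python
import math


def select_layers(betti_1, activation_sizes, rho):
    """Rank layers by normalized first Betti number and keep a fraction rho.

    Assumptions (not taken from the paper): b_1 is precomputed per layer,
    |a^i| is the activation element count, higher scores rank first, and
    the number of selected layers is ceil(rho * L), with at least one layer.
    """
    if len(betti_1) != len(activation_sizes):
        raise ValueError("need one Betti number and one size per layer")
    if not 0 < rho <= 1:
        raise ValueError("rho must be in (0, 1]")

    # Normalized score: b_1^i / |a^i|
    scores = [b / n for b, n in zip(betti_1, activation_sizes)]

    # Number of layers to retrain
    k = max(1, math.ceil(rho * len(scores)))

    # Layer indices sorted by descending score; keep the top k
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k]), scores


# Hypothetical values for a four-layer network
betti_1 = [12, 40, 9, 30]
activation_sizes = [4096, 2048, 1024, 512]
selected, scores = select_layers(betti_1, activation_sizes, rho=0.5)
print(selected)  # indices of the layers chosen for retraining
```

In this toy example the two layers with the largest normalized scores are kept and the rest would stay frozen. Because only forward-pass activations feed the scores, no labels or gradients are needed at selection time.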

Abstract

To use pre-trained neural networks on edge and mobile devices, we often need to adapt them efficiently to user-specific runtime data distributions while operating under limited compute and memory resources. On-device retraining with a target dataset can facilitate such adaptation; however, it remains impractical because of the increasing depth of modern neural networks and the computational overhead of gradient-based optimization across all layers. Current approaches reduce training cost by selecting a subset of layers for retraining, but they rely on labeled data, at least one full-model backpropagation, or server-side meta-training, which limits their suitability for constrained devices. We introduce AdaBet, a gradient-free layer selection approach that ranks important layers by analyzing topological features of their activation spaces through Betti Numbers, using forward passes alone. AdaBet selects layers with high learning capacity, which are important for retraining and adaptation, without requiring labels or gradients. Evaluations on sixteen pairs of benchmark models and datasets show that AdaBet achieves an average gain of 5% in classification accuracy over gradient-based baselines while reducing average peak memory consumption by 40%.
