LeanTTA: A Backpropagation-Free and Stateless Approach to Quantized Test-Time Adaptation on Edge Devices
Cynthia Dong, Hong Jia, Young D. Kwon, Georgios Rizos, Cecilia Mascolo
TL;DR
LeanTTA targets robust test-time adaptation on resource-constrained edge devices with a backpropagation-free, stateless framework that dynamically updates quantized normalization statistics for each data point. It combines per-sample statistic stabilization, Mahalanobis-distance-based balancing, and a partial fusion strategy, enabling fast, memory-efficient adaptation that needs no historical data and is compatible with 8-bit models. Key contributions include a four-step statistic processing pipeline and a per-sample reset mechanism that prevents model collapse. The method shows gains on CIFAR10/100-C and real-world audio datasets under edge hardware constraints, achieving up to $15.7\%$ error reduction and peak memory as low as $11.2$ MB. These results lower the barrier to reliable on-device adaptation under abrupt and gradual domain shifts, with practical relevance for real-time, low-power applications in diverse sensor environments.
Abstract
While there are many advantages to deploying machine learning models on edge devices, the resource constraints of mobile platforms, the dynamic nature of the environment, and differences between the distributions of training and in-the-wild data make such deployments challenging. Current test-time adaptation methods are often memory-intensive and not designed to be quantization-compatible or deployed on low-resource devices. To address these challenges, we present LeanTTA, a novel backpropagation-free and stateless framework for quantized test-time adaptation tailored to edge devices. Our approach minimizes computational costs by dynamically updating normalization statistics without backpropagation, which frees LeanTTA from the common pitfall of relying on large batches and historical data, making our method robust to realistic deployment scenarios. Our approach is the first to enable further computational gains by combining partial adaptation with quantized module fusion. We validate our framework across sensor modalities, demonstrating significant improvements over state-of-the-art TTA methods, including a 15.7% error reduction, peak memory usage of only 11.2 MB for ResNet18, and fast adaptation within an order of magnitude of normal inference speeds on-device. LeanTTA provides a robust solution for achieving the right trade-offs between accuracy and system efficiency in edge deployments, addressing the unique challenges posed by limited data and varied operational conditions.
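
The abstract describes updating normalization statistics per data point, without backpropagation and without carrying state between inputs. The following is a minimal illustrative sketch of that general idea, not the authors' pipeline: it omits quantization and module fusion entirely. The function name `adapt_and_normalize`, the `scale` parameter, the diagonal-covariance Mahalanobis distance, and the rule mapping distance to a blend weight are all assumptions made for illustration.

```python
import numpy as np

def adapt_and_normalize(x, src_mean, src_var, scale=1.0, eps=1e-5):
    """Normalize one feature map x of shape (C, H, W).

    Hypothetical sketch: blends stored source statistics with the
    statistics of the current sample. No gradients are computed, and
    src_mean / src_var are never modified, so each input starts from
    the same source statistics (stateless).
    """
    # Per-channel statistics of this single sample.
    inst_mean = x.mean(axis=(1, 2))
    inst_var = x.var(axis=(1, 2))

    # Mahalanobis distance from the sample mean to the source
    # distribution, assuming a diagonal covariance (an assumption).
    d = np.sqrt(np.sum((inst_mean - src_mean) ** 2 / (src_var + eps)))

    # Assumed balancing rule: a larger distance gives the sample
    # statistics more weight. alpha lies in [0, 1).
    alpha = d / (d + scale)

    mean = (1.0 - alpha) * src_mean + alpha * inst_mean
    var = (1.0 - alpha) * src_var + alpha * inst_var

    x_hat = (x - mean[:, None, None]) / np.sqrt(var[:, None, None] + eps)
    return x_hat, alpha
```

Under these assumptions, an input whose channel means sit close to the source statistics yields a small `alpha` and is normalized mostly with the stored values, while a strongly shifted input leans on its own statistics. Because nothing is written back, a corrupted sample cannot influence how the next one is processed.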
