SNAP: Low-Latency Test-Time Adaptation with Sparse Updates
Hyeongheon Cha, Dong Min Kim, Hye Won Chung, Taesik Gong, Sung-Ju Lee
TL;DR
SNAP tackles the latency bottleneck of test-time adaptation by introducing sparse, data-efficient updates for edge deployment. It combines Class and Domain Representative Memory (CnDRM) with Inference-only Batch-aware Memory Normalization (IoBMN) to enable effective adaptation from a small data subset with minimal backpropagation. Across CIFAR and ImageNet corruptions, SNAP reduces latency by up to ~93% while keeping accuracy losses under ~3.3%, and it is compatible with multiple state-of-the-art (SOTA) TTA methods and models, including ViT with layer normalization (LN). The approach remains robust under continuous and single-sample adaptation scenarios and scales to memory-constrained devices, making practical edge deployment feasible. The work also provides ablations, a memory analysis, and a demonstration of compatibility with memory-efficient TTA (e.g., MECTA), underscoring its potential for real-world, latency-sensitive applications.
Abstract
Test-Time Adaptation (TTA) adjusts models using unlabeled test data to handle dynamic distribution shifts. However, existing methods adapt frequently and incur high computational cost, making them unsuitable for resource-constrained edge environments. To address this, we propose SNAP, a sparse TTA framework that reduces adaptation frequency and data usage while preserving accuracy. SNAP maintains competitive accuracy even when adapting on only 1% of the incoming data stream, demonstrating its robustness under infrequent updates. Our method introduces two key components: (i) Class and Domain Representative Memory (CnDRM), which identifies and stores a small set of samples that are representative of both class and domain characteristics to support efficient adaptation with limited data; and (ii) Inference-only Batch-aware Memory Normalization (IoBMN), which dynamically adjusts normalization statistics at inference time by leveraging these representative samples, enabling efficient alignment to shifting target domains. Integrated with five state-of-the-art TTA algorithms, SNAP reduces latency by up to 93.12% while keeping the accuracy drop below 3.3% across adaptation rates ranging from 1% to 50%. This demonstrates its strong potential for practical use on edge devices serving latency-sensitive applications. The source code is available at https://github.com/chahh9808/SNAP.
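To make the notion of an adaptation rate concrete, the following is a minimal sketch of a sparse update schedule. It is not taken from the SNAP code base: the function name `sparse_adaptation_schedule`, the fixed-interval policy, and the batch counts are assumptions made for illustration only, and the sketch does not model CnDRM or IoBMN.

```python
from typing import List


def sparse_adaptation_schedule(num_batches: int, adaptation_rate: float) -> List[bool]:
    """Return, for each incoming batch, whether a backpropagation update runs.

    Hypothetical helper: updates are spaced at a fixed interval of
    round(1 / adaptation_rate) batches. All other batches are inference-only.
    """
    if not 0.0 < adaptation_rate <= 1.0:
        raise ValueError("adaptation_rate must be in (0, 1]")
    interval = round(1.0 / adaptation_rate)
    # Batch i (0-indexed) triggers an update when it closes an interval.
    return [(i + 1) % interval == 0 for i in range(num_batches)]


# Illustrative stream of 200 batches at the two ends of the 1%-50% range.
schedule_1pct = sparse_adaptation_schedule(num_batches=200, adaptation_rate=0.01)
schedule_50pct = sparse_adaptation_schedule(num_batches=200, adaptation_rate=0.5)

updates_1pct = sum(schedule_1pct)    # number of batches that run backpropagation
updates_50pct = sum(schedule_50pct)
```

Under this assumed policy, a 1% rate runs backpropagation on 2 of the 200 batches and a 50% rate on 100 of them; every remaining batch is served by inference alone, which is where the latency savings of sparse adaptation would come from.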
