SNAP: Low-Latency Test-Time Adaptation with Sparse Updates

Hyeongheon Cha, Dong Min Kim, Hye Won Chung, Taesik Gong, Sung-Ju Lee

TL;DR

SNAP tackles the latency bottleneck of test-time adaptation by introducing sparse, data-efficient updates for edge deployment. It combines Class and Domain Representative Memory (CnDRM) with Inference-only Batch-aware Memory Normalization (IoBMN) to enable effective adaptation with a small data subset and minimal backpropagation. Across CIFAR and ImageNet corruptions, SNAP achieves large latency reductions (up to ~93%) while keeping accuracy losses under ~3.3% and demonstrates compatibility with multiple SOTA TTA methods and models, including ViT with LN. The approach shows robust performance under continuous and single-sample adaptation scenarios and scales to memory-constrained devices, making practical edge deployment feasible. The work also provides ablations, memory analysis, and compatibility with memory-efficient TTA (e.g., MECTA), underscoring its potential for real-world, latency-sensitive applications.

Abstract

Test-Time Adaptation (TTA) adjusts models using unlabeled test data to handle dynamic distribution shifts. However, existing methods rely on frequent adaptation and incur high computational cost, making them unsuitable for resource-constrained edge environments. To address this, we propose SNAP, a sparse TTA framework that reduces adaptation frequency and data usage while preserving accuracy. SNAP maintains competitive accuracy even when adapting on only 1% of the incoming data stream, demonstrating its robustness under infrequent updates. Our method introduces two key components: (i) Class and Domain Representative Memory (CnDRM), which identifies and stores a small set of samples that are representative of both class and domain characteristics to support efficient adaptation with limited data; and (ii) Inference-only Batch-aware Memory Normalization (IoBMN), which dynamically adjusts normalization statistics at inference time by leveraging these representative samples, enabling efficient alignment to shifting target domains. Integrated with five state-of-the-art TTA algorithms, SNAP reduces latency by up to 93.12% while keeping the accuracy drop below 3.3% across adaptation rates ranging from 1% to 50%. This demonstrates its strong potential for practical use on edge devices serving latency-sensitive applications. The source code is available at https://github.com/chahh9808/SNAP.

Paper Structure

This paper contains 80 sections, 15 equations, 9 figures, 29 tables, 1 algorithm.

Figures (9)

  • Figure 1: Comparison of average latency per batch and accuracy between the Original and Naïve Sparse TTA approaches on edge devices processing an online data stream. With an adaptation rate of 0.33, adaptation occurs once every three batches, reducing latency in proportion to the rate but causing a significant accuracy drop compared with the fully adapting Original TTA.
  • Figure 2: Component-wise latency and overall accuracy comparison between full SOTA TTA and SNAP (sparse update with frequency 0.1) on CIFAR100-C, measured on Raspberry Pi 4. SNAP matches accuracy with significantly lower cost.
  • Figure 3: Design overview of SNAP. The framework consists of two primary components: (a) Class and Domain Representative Memory (CnDRM), which efficiently selects representative samples to minimize adaptation overhead, and (b) Inference-only Batch-aware Memory Normalization (IoBMN), which corrects feature distribution shifts during inference. Together, these components implement the Sparse TTA (STTA) strategy, reducing latency while maintaining model accuracy.
  • Figure 4: Sampling visualization and accuracy comparison between the closest 20% and farthest 20% samples from the domain centroid on ImageNet-C Gaussian noise.
  • Figure 5: Latency on Raspberry Pi 4 and CIFAR10-C accuracy across adaptation rates. Due to SNAP’s negligible overhead, solid and dotted lines overlap in the latency plot. Marker size indicates standard deviation.
  • ...and 4 more figures
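
The adaptation rate described in the Figure 1 caption (a rate of 0.33 means one adaptation step every three batches) can be illustrated with a small scheduling sketch. This is only an illustration of a naive sparse schedule, not the SNAP implementation: the function names `adapts_on` and `mean_latency_ms`, the integer `period` parameter (the reciprocal of the adaptation rate), and the latency figures in the usage note are all hypothetical and are not taken from the paper.

```python
def adapts_on(batch_index: int, period: int) -> bool:
    """Hypothetical helper: True when this batch triggers an adaptation step.

    `period` is the reciprocal of the adaptation rate, so period=3
    corresponds to a rate of roughly 0.33 (one update every three batches).
    """
    if period < 1:
        raise ValueError("period must be a positive integer")
    return batch_index % period == 0


def mean_latency_ms(infer_ms: float, adapt_ms: float,
                    period: int, num_batches: int) -> float:
    """Average per-batch latency under the naive sparse schedule.

    Every batch pays the inference cost; only the scheduled batches
    additionally pay the adaptation (backpropagation) cost.
    """
    total = 0.0
    for i in range(num_batches):
        total += infer_ms
        if adapts_on(i, period):
            total += adapt_ms
    return total / num_batches
```

With made-up costs of 10 ms for inference and 90 ms for adaptation over nine batches, `mean_latency_ms(10.0, 90.0, 1, 9)` gives 100 ms for full adaptation, while `mean_latency_ms(10.0, 90.0, 3, 9)` gives 40 ms. Only the adaptation share of the latency shrinks with the rate; the inference share is paid on every batch. The sketch says nothing about accuracy, which is the part of the trade-off that CnDRM and IoBMN are designed to address.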