Towards Robust and Efficient Cloud-Edge Elastic Model Adaptation via Selective Entropy Distillation

Yaofo Chen; Shuaicheng Niu; Yaowei Wang; Shoukai Xu; Hengjie Song; Mingkui Tan

TL;DR

The paper tackles distribution shifts in cloud-edge deployments by introducing Cloud-Edge Elastic Model Adaptation (CEMA), which moves the heavy adaptation workload to the cloud while the edge performs only forward propagation. It uses dynamic entropy-based sample filtration to reduce data transmission, and replay-based knowledge distillation from a foundation model to guide the edge model, updating only the affine parameters of normalization layers for efficiency. Empirically, CEMA achieves state-of-the-art robustness on ImageNet-C and ImageNet-R with significantly lower upload requirements (e.g., around 60% lower than state-of-the-art methods on average) and strong performance across CNN and transformer backbones. The approach offers a practical, scalable solution for real-time adaptation in bandwidth-constrained cloud-edge systems and generalizes to diverse foundation models and tasks, including detection.

Abstract

The conventional deep learning paradigm often involves training a deep model on a server and then deploying the model, or a distilled version of it, to resource-limited edge devices. The models typically remain fixed once deployed (at least for some period) because model adaptation is potentially costly for both the server and the edge. However, in many real-world scenarios the test environment changes dynamically (known as distribution shift), which often degrades performance. The edge models therefore have to be adapted promptly to maintain good performance. Moreover, although more and more data is collected at the edge, this paradigm does not use it to further adapt the cloud model. Addressing these issues raises two primary challenges: 1) the edge device has limited computation power and may only support forward propagation; 2) the data transmission budget between cloud and edge devices is limited in latency-sensitive scenarios. In this paper, we establish a Cloud-Edge Elastic Model Adaptation (CEMA) paradigm in which the edge models only need to perform forward propagation yet can still be adapted online. To reduce the communication burden in CEMA, we devise two criteria for excluding unnecessary samples from upload to the cloud, i.e., dynamic unreliable sample exclusion and low-informative sample exclusion. Based on the uploaded samples, we update and distribute the affine parameters of the normalization layers by distilling from the stronger foundation model to the edge model with a sample replay strategy. Extensive experimental results on ImageNet-C and ImageNet-R verify the effectiveness of CEMA.
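The two exclusion criteria can be illustrated with a minimal sketch, assuming both are expressed as thresholds on the prediction entropy of the edge model. The function names and the fixed thresholds `e_min` and `e_max` are hypothetical; the paper describes a dynamically adjusted upper threshold, which is not reproduced here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def should_upload(logits, e_min, e_max):
    """Decide on the edge whether a sample is worth uploading.

    Assumption: entropy above e_max marks an unreliable sample and
    entropy at or below e_min marks a low-informative one; both are
    excluded. The thresholds are fixed here for illustration only.
    """
    e = entropy(softmax(logits))
    return e_min < e <= e_max


# Illustrative logits from a hypothetical 3-class edge model.
confident = [10.0, 0.0, 0.0]   # very low entropy: low-informative
uncertain = [0.0, 0.0, 0.0]    # maximal entropy (ln 3): unreliable
moderate = [2.0, 1.0, 0.0]     # in between: uploaded
decisions = [should_upload(x, e_min=0.1, e_max=1.0)
             for x in (confident, uncertain, moderate)]
```

Because the decision needs only the logits of a forward pass, it fits an edge device that cannot run backpropagation.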

Paper Structure (36 sections, 6 equations, 12 figures, 20 tables, 2 algorithms)

Introduction
Cloud-Edge Communication-Efficient Model Adaptation
Efficient Adaptation for Robustness and Communication Enhancement
Sample Filtration for Communication Cost Reduction in Edge Side
Replay-based Knowledge Distillation for Adaptation in Cloud Side
Experiments
Performance Comparisons on ImageNet-C
Performance Comparisons on ImageNet-R
Further Experiments
Conclusion
Related Work
More Discussions on CEMA
Transmission efficiency of CEMA
Adaptation throughput and required upload bandwidth
Availability in variable bandwidth scenarios
...and 21 more sections

Figures (12)

Figure 1: Comparison between conventional Test-time Adaptation (TTA) (left) and our Cloud-Edge Elastic Model Adaptation (right). Conventional TTA performs adaptation locally on the resource-limited edge device. In contrast, CEMA adapts the edge model more efficiently by offloading the heavy adaptation workload to the cloud, which has far greater resources.
Figure 2: An overview of our proposed CEMA. In edge: after inference, each edge asynchronously uploads samples to the cloud by excluding unreliable ones (based on $S^{high}({\bf x})$ in Eqn. (\ref{eq:remove-high-ent})) and low-informative ones (based on $S^{low}({\bf x})$ in Eqn. (\ref{eq:remove-low-ent})). In cloud: 1) our CEMA improves the foundation model $f_{\theta}(\cdot)$ with the uploaded samples via entropy minimization (Eqn. (\ref{eq:entropy-minimize-teacher})) and meanwhile stores uploaded samples into a replay buffer. 2) With both the uploaded samples and the samples randomly sampled from the replay buffer, CEMA adapts the edge model $g_w(\cdot)$ with the guidance from the foundation model $f_{\theta}(\cdot)$ via the knowledge distillation loss (Eqn. (\ref{eq:distillation})).
Figure 3: Proportions of test samples with $E({\bf x}; w) > E_{\text{max}}$ (red) and $E({\bf x}; w) \leq E_{\text{max}}$ (blue) during adaptation via entropy minimization on ImageNet-C.
Figure 4: Comparisons of the average number of uploaded test samples on ImageNet-C with the severity levels 3 and 5.
Figure 5: Effect of $\lambda$ in Eqn. (\ref{eq:dynamic-E-max}) with ResNet18 as edge model on ImageNet-C (Gaussian noise, severity level 3).
...and 7 more figures
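
The cloud-side step in the Figure 2 caption (mixing newly uploaded samples with samples replayed from a buffer, then distilling from the foundation model to the edge model) can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the distillation term is written as a plain KL divergence between teacher and student class probabilities, and `ReplayBuffer`, `build_distillation_batch`, the buffer capacity, and the choice to sample before storing are all hypothetical.

```python
import math
import random
from collections import deque


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(teacher_probs, student_probs, eps=1e-12):
    """KL(teacher || student), used here as a generic distillation loss."""
    return sum(t * math.log((t + eps) / (s + eps))
               for t, s in zip(teacher_probs, student_probs) if t > 0.0)


class ReplayBuffer:
    """Fixed-capacity FIFO store of previously uploaded samples."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)

    def add(self, samples):
        self._items.extend(samples)

    def sample(self, k, rng=random):
        # Draw at most k stored samples without replacement.
        k = min(k, len(self._items))
        return rng.sample(list(self._items), k)

    def __len__(self):
        return len(self._items)


def build_distillation_batch(uploaded, buffer, n_replay, rng=random):
    """Combine new uploads with replayed samples, then store the uploads.

    Assumption: replayed samples are drawn before the new uploads are
    stored, so a sample never appears twice in the same batch.
    """
    replayed = buffer.sample(n_replay, rng)
    buffer.add(uploaded)
    return list(uploaded) + replayed
```

In a full system, the loss on each batch would be backpropagated in the cloud through a copy of the edge model, with only the affine parameters of the normalization layers updated and sent back to the edge.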
