ECCENTRIC: Edge-Cloud Collaboration Framework for Distributed Inference Using Knowledge Adaptation
Mohammad Mahdi Kamani, Zhongwei Cheng, Lin Chen
TL;DR
This paper introduces Eccentric, an edge-cloud collaboration framework that learns Pareto-optimal models for distributed inference by transferring knowledge from edge to cloud. It presents three architectures—Independent ECC, Adaptive ECC, and Dynamic ECC—along with training strategies including knowledge distillation, knowledge adaptation, and recall-rate boosting, all aimed at reducing cloud offload while preserving performance. New evaluation criteria are defined to quantify communication, computation, and performance trade-offs, and the approach is validated on CIFAR-10 classification and COCO/YOLOv5-based object detection, demonstrating near cloud-level performance with substantial resource savings. The work offers a flexible, compression-like mechanism for edge-cloud inference and points to potential extensions to broader tasks and generative adaptation methods.
Abstract
The rapid growth of edge AI has made machine learning models ubiquitous across many domains. Although edge systems are efficient in computation and communication, the limited compute resources of edge devices make it unavoidable, in most cases, to rely on more computationally capable systems in the cloud. Cloud inference systems can achieve the best performance, but their computation and communication costs grow dramatically as the number of edge devices relying on them increases. Hence, there is a trade-off among the computation, communication, and performance of these systems. In this paper, we propose a novel framework, dubbed Eccentric, that learns models with different levels of trade-off among these conflicting objectives. This framework, based on an adaptation of knowledge from the edge model to the cloud one, reduces the computation and communication costs of the system during inference while achieving the best performance possible. The Eccentric framework can be considered a new form of compression method suited to edge-cloud inference systems, reducing both computation and communication costs. Empirical studies on classification and object detection tasks corroborate the efficacy of this framework.
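To make the trade-off between cloud offload and performance concrete, the following is a minimal Python sketch of a generic confidence-gated edge-cloud pipeline. It is not the Eccentric architecture or its training procedure: the routing rule (offload when the edge model's top probability falls below a threshold), the function name `route_and_score`, and the toy data are all assumptions made for illustration.

```python
from typing import Sequence, Tuple

# Toy data (hypothetical): edge-model class probabilities, cloud-model
# predictions, and ground-truth labels for four samples.
EDGE_PROBS = [[0.9, 0.1], [0.6, 0.4], [0.55, 0.45], [0.2, 0.8]]
CLOUD_PREDS = [0, 1, 1, 1]
LABELS = [0, 1, 1, 1]


def route_and_score(
    edge_probs: Sequence[Sequence[float]],
    cloud_preds: Sequence[int],
    labels: Sequence[int],
    threshold: float,
) -> Tuple[float, float]:
    """Return (offload_rate, accuracy) for a confidence-gated system.

    Assumed rule: keep the edge prediction when its top probability is at
    least `threshold`; otherwise send the sample to the cloud model.
    """
    offloaded = 0
    correct = 0
    for probs, cloud_pred, label in zip(edge_probs, cloud_preds, labels):
        edge_conf = max(probs)
        edge_pred = max(range(len(probs)), key=probs.__getitem__)
        if edge_conf >= threshold:
            pred = edge_pred  # answered on the edge, no communication
        else:
            pred = cloud_pred  # offloaded: costs communication and cloud compute
            offloaded += 1
        correct += int(pred == label)
    n = len(labels)
    return offloaded / n, correct / n


if __name__ == "__main__":
    # Sweep the threshold to trace out offload-rate vs. accuracy points.
    for t in (0.0, 0.7, 1.01):
        rate, acc = route_and_score(EDGE_PROBS, CLOUD_PREDS, LABELS, t)
        print(f"threshold={t:.2f} offload_rate={rate:.2f} accuracy={acc:.2f}")
```

Sweeping the threshold moves the system between an edge-only operating point (no offload, lower accuracy on this toy data) and a cloud-only one (full offload), which is the kind of trade-off curve the abstract refers to.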
