An Edge-Cloud Collaboration Framework for Generative AI Service Provision with Synergetic Big Cloud Model and Small Edge Models

Yuqing Tian, Zhaoyang Zhang, Yuzhi Yang, Zirui Chen, Zhaohui Yang, Richeng Jin, Tony Q. S. Quek, Kai-Kit Wong

TL;DR

This paper tackles the prohibitive compute and communication costs of centralized big AI model (BAIM) deployment by proposing a bottom-up edge-cloud framework that fuses a synergetic big cloud model with lightweight edge models. It introduces a hierarchical, mixture-of-experts–style BAIM architecture with gating and linear projection connections, enabling distributed training across edge nodes and task-oriented deployment of compact models at the edge. A cloud-centric training process aggregates diverse edge models into a unified BAIM, which is subsequently partitioned into task-specific lightweight models for edge inference, with edge personalization through fine-tuning. A case study on image generation using heterogeneous VAEs demonstrates that the fine-tuning strategy yields the best image quality, validating the approach and highlighting edge-enabled privacy, timeliness, and personalization in 6G networks.

Abstract

Generative artificial intelligence (GenAI) offers various services to users through content creation and is believed to be one of the most important components of future networks. However, training and deploying big artificial intelligence models (BAIMs) introduces substantial computational and communication overhead. This poses a critical challenge to centralized approaches, due to the need for high-performance computing infrastructure and the reliability, secrecy, and timeliness issues in long-distance access to cloud services. Therefore, there is an urgent need to decentralize the services, partly moving them from the cloud to the edge and establishing native GenAI services to enable private, timely, and personalized experiences. In this paper, we propose a brand-new bottom-up BAIM architecture with a synergetic big cloud model and small edge models, and design a distributed training framework and a task-oriented deployment scheme for efficient provision of native GenAI services. The proposed framework can facilitate collaborative intelligence, enhance adaptability, gather edge knowledge, and alleviate the edge-cloud burden. The effectiveness of the proposed framework is demonstrated through an image generation use case. Finally, we outline fundamental research directions to fully exploit the collaborative potential of edge and cloud for native GenAI and BAIM applications.

Paper Structure

This paper contains 27 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: Three frameworks for training and deploying AI models with cloud-edge collaboration. Model compression follows a centralized approach, resulting in smaller models specialized with personal datasets through methods like knowledge distillation (KD). Model aggregation merges edge-trained models in an iterative process within the cloud. Model partitioning involves the joint training and deployment of models by splitting them across different nodes.
  • Figure 2: KPI radar chart of four distributed frameworks over edge-cloud networks.
  • Figure 3: The workflow of our proposed framework with BAIM training and native GenAI service procedures.
  • Figure 4: The bottom-up BAIM architecture and task-oriented partitioned models in the toolkit, involving three tasks. The second task is currently selected by TSGate. Dark modules are executed, including the top-$k$ ($k=2$) learners chosen by LSGate and the modules with linear projection connections to these learners, while light modules are inactive in the current round. In the third task, gray dashed lines denote the initial potential linear projections (connection height $h=2$) originating from the first learner. During training, pruning filters these connections and retains a proportion of them, depicted as gray solid lines.
  • Figure 5: The case study on image generation service provision.
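
The top-$k$ learner selection mentioned in the Figure 4 caption can be illustrated with a minimal sketch. This is not the paper's LSGate: the function names `top_k_gate` and `mix_outputs`, the softmax renormalization over the selected scores, and the weighted-sum combination are all assumptions borrowed from common mixture-of-experts practice, and the gate scores are supplied by hand rather than learned.

```python
import math


def top_k_gate(scores, k=2):
    """Return {learner_index: weight} for the k highest-scoring learners.

    Hypothetical helper: weights are a softmax over the selected scores
    only, so they sum to 1. The paper's LSGate may weight learners differently.
    """
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and the number of learners")
    # Indices of the k largest scores (ties go to the lower index).
    chosen = sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]
    # Numerically stable softmax restricted to the chosen learners.
    peak = max(scores[i] for i in chosen)
    exps = {i: math.exp(scores[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def mix_outputs(learner_outputs, weights):
    """Weighted sum of the selected learners' output vectors.

    Assumption for illustration: all outputs are precomputed lists of equal
    length. A real system would run only the selected learners.
    """
    dim = len(learner_outputs[0])
    mixed = [0.0] * dim
    for i, w in weights.items():
        for d in range(dim):
            mixed[d] += w * learner_outputs[i][d]
    return mixed


# Four learners with hand-picked gate scores; k=2 as in the Figure 4 caption.
weights = top_k_gate([0.1, 2.0, -1.0, 2.0], k=2)
mixed = mix_outputs([[1.0, 0.0], [2.0, 2.0], [0.0, 0.0], [4.0, 0.0]], weights)
```

Learners outside the top $k$ receive no weight at all, which corresponds to the "light" inactive modules in the caption. Only the selected learners would need to run in a given round.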