
Enabling Distributed Generative Artificial Intelligence in 6G: Mobile Edge Generation

Ruikang Zhong, Xidong Mu, Mona Jaber, Yuanwei Liu

TL;DR

This paper tackles the bandwidth and latency challenges of generative AI services by proposing Mobile Edge Generation (MEG), a distributed framework that partitions text-to-image generation between edge servers and user equipment. Rather than transmitting full images, MEG transmits a compressed latent seed generated by a latent diffusion model (LDM), so that most computation takes place in machine-perceivable space and the data traffic is reduced, with $|\mathbf{x}_{I^g}|<f_d^{-2}|\mathbf{I}^g|$ for $f_d>1$. A deep learning (DL)-based compression encoder and decoder are trained jointly to minimize the latent reconstruction error, and a deep reinforcement learning (DRL) agent using proximal policy optimization (PPO) dynamically allocates transmit power across fading blocks to optimize image quality, measured by the Fréchet inception distance (FID) and the peak signal-to-noise ratio (PSNR). Numerical results show that MEG reduces transmission overhead and improves image quality in low-SNR scenarios, while the DRL power control further enhances perceptual quality, highlighting MEG’s potential for scalable edge-enabled GAI in 6G networks.
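As a rough illustration of the size bound $|\mathbf{x}_{I^g}|<f_d^{-2}|\mathbf{I}^g|$, the sketch below counts array elements for an image and a seed. It is not taken from the paper: reading $|\cdot|$ as an element count, and the 512x512x3 image, the factor $f_d=8$, the 64x64x4 latent, and the 2x compression ratio are all assumptions chosen only to make the arithmetic concrete.

```python
def element_count(height: int, width: int, channels: int) -> int:
    """Number of scalar elements in a height x width x channels array."""
    return height * width * channels


def seed_size_bound(image_elements: int, f_d: float) -> float:
    """Upper bound f_d^-2 * |I| on the seed size; only defined for f_d > 1."""
    if f_d <= 1:
        raise ValueError("the bound is stated for f_d > 1")
    return image_elements / f_d**2


def satisfies_bound(seed_elements: int, image_elements: int, f_d: float) -> bool:
    """True when the seed is strictly smaller than f_d^-2 * |I|."""
    return seed_elements < seed_size_bound(image_elements, f_d)


# Assumed example values, not figures reported by the paper.
image_elements = element_count(512, 512, 3)   # 786432 elements
bound = seed_size_bound(image_elements, 8)    # 12288.0 elements
latent_elements = element_count(64, 64, 4)    # 16384 elements, uncompressed latent
seed_elements = latent_elements // 2          # 8192 elements after an assumed 2x compression

print(bound, satisfies_bound(latent_elements, image_elements, 8),
      satisfies_bound(seed_elements, image_elements, 8))
```

Under these assumed numbers, the uncompressed four-channel latent exceeds the bound while the compressed seed falls below it, which suggests why a compression step on the latent matters for the inequality.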

Abstract

Mobile edge generation (MEG) is an emerging technology that allows the network to meet the challenging traffic load expectations posed by the rise of generative artificial intelligence (GAI). A novel MEG model is proposed for deploying GAI models on edge servers (ES) and user equipment (UE) to jointly complete text-to-image generation tasks. In the generation task, the ES and UE cooperatively generate the image according to the text prompt given by the user. To enable MEG, a pre-trained latent diffusion model (LDM) is invoked to generate the latent feature, and an edge-inferencing MEG protocol is employed for data exchange between the ES and the UE. A compression coding technique is proposed for compressing the latent features to produce seeds. Based on this seed-enabled MEG model, an image quality optimization problem with a transmit power constraint is formulated. The transmit power of the seed is dynamically optimized by a deep reinforcement learning agent over the fading channel. The proposed MEG-enabled text-to-image generation system is evaluated in terms of image quality and transmission overhead. The numerical results indicate that, compared to the conventional centralized generation-and-downloading scheme, the number of symbols transmitted by MEG is materially reduced. In addition, the proposed compression coding approach can improve the quality of generated images under low signal-to-noise ratio (SNR) conditions.

Paper Structure

This paper contains 34 sections, 39 equations, 10 figures, 3 tables, and 3 algorithms.

Figures (10)

  • Figure 1: The macroscopic impact of generative AI information sources on communication systems.
  • Figure 2: System model of MEG.
  • Figure 3: MEG model of LDM-enabled text-to-image generation.
  • Figure 4: Compression encoder for feature transmission.
  • Figure 5: Visualization examples for different generation schemes.
  • ...and 5 more figures

Theorems & Definitions (2)

  • Definition 1
  • Remark 1