Enabling Distributed Generative Artificial Intelligence in 6G: Mobile Edge Generation
Ruikang Zhong, Xidong Mu, Mona Jaber, Yuanwei Liu
TL;DR
This paper tackles the bandwidth and latency challenges of generative AI (GAI) services by proposing Mobile Edge Generation (MEG), a distributed framework that partitions text-to-image generation between edge servers and user equipment. Rather than transmitting full images, MEG transmits a compressed latent seed produced by a latent diffusion model (LDM), so most computation stays in machine-perceivable space and data traffic falls, with $|\mathbf{x}_{I^g}|<f_d^{-2}|\mathbf{I}^g|$ for $f_d>1$. A deep-learning-based compression encoder and decoder are trained jointly to minimize latent reconstruction error, and a deep reinforcement learning (DRL) agent using PPO dynamically allocates transmit power across fading blocks to optimize image quality, measured by FID and PSNR. Numerical results show that MEG reduces transmission overhead and improves image quality in low-SNR scenarios, while DRL power control further enhances perceptual quality, highlighting MEG’s potential for scalable edge-enabled GAI in 6G networks.
Abstract
Mobile edge generation (MEG) is an emerging technology that allows the network to meet the challenging traffic load expectations posed by the rise of generative artificial intelligence (GAI). A novel MEG model is proposed for deploying GAI models on edge servers (ES) and user equipment (UE) to jointly complete text-to-image generation tasks. In the generation task, the ES and UE cooperatively generate the image according to the text prompt given by the user. To enable MEG, a pre-trained latent diffusion model (LDM) is invoked to generate the latent feature, and an edge-inferencing MEG protocol is employed for data exchange between the ES and the UE. A compression coding technique is proposed for compressing the latent features to produce seeds. Based on this seed-enabled MEG model, an image quality optimization problem with a transmit power constraint is formulated. The transmit power of the seed is dynamically optimized by a deep reinforcement learning agent over the fading channel. The proposed MEG-enabled text-to-image generation system is evaluated in terms of image quality and transmission overhead. The numerical results indicate that, compared to the conventional centralized generation-and-downloading scheme, the number of symbols transmitted by MEG is substantially reduced. In addition, the proposed compression coding approach can improve the quality of generated images under low signal-to-noise ratio (SNR) conditions.
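To give a feel for the size bound $|\mathbf{x}_{I^g}|<f_d^{-2}|\mathbf{I}^g|$ quoted in the TL;DR, the short sketch below evaluates its right-hand side for a few downsampling factors. It is only an arithmetic illustration under stated assumptions: $|\cdot|$ is read as an element count, the $512\times512$ RGB image size is a hypothetical example, and the helper name `seed_size_bound` does not come from the paper.

```python
def seed_size_bound(height: int, width: int, channels: int, f_d: float) -> float:
    """Return f_d**-2 * |I|, the upper bound on the seed size.

    Assumption: |I| is the number of elements in the image tensor.
    """
    if f_d <= 1:
        raise ValueError("the bound is stated for f_d > 1")
    image_elements = height * width * channels
    return image_elements / f_d**2


if __name__ == "__main__":
    # Hypothetical 512x512 RGB image; the paper's actual sizes may differ.
    for f_d in (2, 4, 8):
        bound = seed_size_bound(512, 512, 3, f_d)
        print(f"f_d={f_d}: seed must have fewer than {bound:.0f} elements")
```

Under these assumptions, a factor of $f_d=8$ caps the seed at fewer than 12288 elements, against 786432 for the full image.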
