Reinforcement Learning With LLMs Interaction For Distributed Diffusion Model Services
Hongyang Du, Ruichen Zhang, Dusit Niyato, Jiawen Kang, Zehui Xiong, Shuguang Cui, Xuemin Shen, Dong In Kim
TL;DR
This work tackles energy-efficient, user-centric QoE optimization for distributed diffusion-model-based AIGC. It introduces a distributed Generative Diffusion Model (GDM) framework in which users with semantically similar prompts share part of the denoising chain to save energy, and couples this with Reinforcement Learning With LLMs Interaction (RLLI), which uses LLM-empowered generative agents to provide real-time, subjective QoE feedback as the reward signal. The authors develop a GDM-based DDPG algorithm (G-DDPG) that allocates communication and computing resources while accounting for user personalities and wireless dynamics; in simulation it improves total QoE by 15% over the standard DDPG algorithm. The study demonstrates the viability of edge-enabled, personalized AIGC with efficient resource management in future networks, and points to promising avenues such as caching and computation reuse to further enhance performance.
Abstract
Distributed Artificial Intelligence-Generated Content (AIGC) has attracted significant attention, but two key challenges remain: maximizing subjective Quality of Experience (QoE) and improving energy efficiency. Both are particularly pronounced in widely adopted Generative Diffusion Model (GDM)-based image generation services. In this paper, we propose a novel user-centric Interactive AI (IAI) approach for service management, with a distributed GDM-based AIGC framework that emphasizes efficient and cooperative deployment. The proposed method restructures the GDM inference process by allowing users with semantically similar prompts to share parts of the denoising chain. Furthermore, to maximize the users' subjective QoE, we propose an IAI approach, i.e., Reinforcement Learning With Large Language Models Interaction (RLLI), which utilizes Large Language Model (LLM)-empowered generative agents to replicate user interaction, providing real-time and subjective QoE feedback aligned with diverse user personalities. Lastly, we present the GDM-based Deep Deterministic Policy Gradient (G-DDPG) algorithm, adapted to the proposed RLLI framework, to allocate communication and computing resources effectively while accounting for subjective user traits and dynamic wireless conditions. Simulation results demonstrate that G-DDPG improves total QoE by 15% compared with the standard DDPG algorithm.
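The energy argument behind sharing the denoising chain can be illustrated with a simple step count. The sketch below is not the paper's energy model: it assumes, purely for illustration, that every denoising step costs the same, that a group of users with similar prompts shares one contiguous segment of the chain, and that each user then completes the remaining steps individually. The function name and its parameters are hypothetical.

```python
def denoising_step_counts(num_users: int, total_steps: int, shared_steps: int) -> tuple[int, int]:
    """Return (steps without sharing, steps with sharing) for one user group.

    Illustrative assumptions (not taken from the paper):
      - each of the `num_users` users needs `total_steps` denoising steps;
      - `shared_steps` of those steps are computed once for the whole group;
      - every step has identical cost, so step count is a proxy for energy.
    """
    if num_users < 1:
        raise ValueError("num_users must be at least 1")
    if not 0 <= shared_steps <= total_steps:
        raise ValueError("shared_steps must lie in [0, total_steps]")

    # Baseline: every user runs the full chain independently.
    independent = num_users * total_steps
    # Sharing: the common segment runs once, the rest runs once per user.
    shared = shared_steps + num_users * (total_steps - shared_steps)
    return independent, shared


independent, shared = denoising_step_counts(num_users=4, total_steps=50, shared_steps=20)
saved = independent - shared  # equals (num_users - 1) * shared_steps
```

Under these assumptions the saving grows linearly with both the group size and the length of the shared segment, which is why grouping users by prompt similarity matters: the more similar the prompts, the longer the segment that can plausibly be shared without hurting each user's result.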
