Accelerating AIGC Services with Latent Action Diffusion Scheduling in Edge Networks

Changfu Xu; Jianxiong Guo; Wanyu Lin; Haodong Zou; Wentao Fan; Tian Wang; Xiaowen Chu; Jiannong Cao

TL;DR

This work tackles the latency of AIGC services in distributed edge networks by formulating task offloading as an online integer nonlinear programming (INLP) problem and proving that its offline version is NP-hard. It introduces LAD-TS, a diffusion-guided scheduling framework that combines latent action diffusion networks with soft actor-critic reinforcement learning to produce near-optimal offloading decisions online. A latent action diffusion strategy uses historical action probabilities to guide decision generation and speed up convergence. The resulting online distributed algorithm has linear per-slot complexity, reduces service delay, and trains efficiently. The DEdgeAI prototype shows practical gains on real edge hardware, including substantial memory savings via reSD3-m. Together, these methods improve QoE for AIGC at the edge and offer a scalable blueprint for deploying diffusion-assisted DRL in resource-constrained environments.

Abstract

Artificial Intelligence Generated Content (AIGC) has gained significant popularity for creating diverse content. Current AIGC models primarily focus on content quality within a centralized framework, resulting in high service delay and a poor user experience. However, the workload of an AIGC task depends on the complexity of the AIGC model rather than on the amount of data, and the large model with its multi-layer encoder structure demands substantial computational and memory resources. These characteristics pose new challenges for modeling, deployment, and scheduling in edge networks. We therefore model an offloading problem among edge servers for providing real AIGC services and propose LAD-TS, a novel Latent Action Diffusion-based Task Scheduling method that orchestrates multiple edge servers to expedite AIGC services. LAD-TS generates near-optimal offloading decisions by combining the conditional generation capability of diffusion models with the environment interaction ability of reinforcement learning, thereby minimizing service delay under multiple resource constraints. In addition, a latent action diffusion strategy guides decision generation using historical action probabilities, so that near-optimal decisions are reached quickly. Furthermore, we develop DEdgeAI, a prototype edge system with a refined AIGC model deployment, to implement and evaluate LAD-TS. DEdgeAI provides a real AIGC service for users and achieves up to 29.18% shorter service delays than five representative existing AIGC platforms. We release our open-source code at https://github.com/ChangfuXu/DEdgeAI/.
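
To make the latent action diffusion idea more concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes one plausible reading of the abstract: the reverse (denoising) process starts from a stored historical action probability vector instead of pure noise, and the result is normalized into offloading probabilities over edge servers. The function names, the fixed-step update rule, and `toy_denoiser` (a hand-written stand-in for a trained, state-conditioned network) are all hypothetical.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into a probability vector."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def toy_denoiser(latent, est_delays):
    """HYPOTHETICAL stand-in for a trained noise-prediction network.

    It treats the negated estimated delay of each edge server as the
    "clean" score and returns the gap between the current latent and
    that score. A real model would learn this mapping from experience.
    """
    return [x - (-d) for x, d in zip(latent, est_delays)]


def generate_action_probs(latent_init, est_delays, steps=5, step_size=0.5):
    """Simplified reverse process: repeatedly remove predicted noise.

    latent_init is the starting latent action. Here it is the previous
    action probability vector (assumption), not a Gaussian sample.
    The fixed step_size replaces a proper diffusion noise schedule.
    """
    latent = list(latent_init)
    for _ in range(steps):
        predicted_noise = toy_denoiser(latent, est_delays)
        latent = [x - step_size * n for x, n in zip(latent, predicted_noise)]
    return softmax(latent)


# Illustrative values only: estimated delay per edge server (the state)
# and the action probabilities recorded for a previous task.
est_delays = [3.0, 1.0, 2.0]
history_probs = [0.2, 0.5, 0.3]

probs = generate_action_probs(history_probs, est_delays)
target = max(range(len(probs)), key=probs.__getitem__)
print("offloading probabilities:", probs)
print("selected edge server index:", target)
```

In this toy setting the selected server is the one with the lowest estimated delay. In an actual system the denoiser would be trained jointly with the reinforcement learning critic, and the delay estimates would come from observed queue and resource states.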
