Joint Model Caching and Resource Allocation in Generative AI-Enabled Wireless Edge Networks

Zhang Liu, Hongyang Du, Lianfen Huang, Zhibin Gao, Dusit Niyato

TL;DR

A deep deterministic policy gradient (DDPG)-based reinforcement learning approach efficiently determines model caching and resource allocation decisions for AIGC services in response to user mobility and time-varying channel conditions. It achieves a higher model hit ratio and provides higher-quality, lower-latency AIGC services than the benchmark solutions.

Abstract

With the rapid advancement of artificial intelligence (AI), generative AI (GenAI) has emerged as a transformative tool, enabling customized and personalized AI-generated content (AIGC) services. However, GenAI models with billions of parameters require substantial memory capacity and computational power for deployment and execution, presenting significant challenges to resource-limited edge networks. In this paper, we address the joint model caching and resource allocation problem in GenAI-enabled wireless edge networks. Our objective is to balance the trade-off between delivering high-quality AIGC and minimizing the delay in AIGC service provisioning. To tackle this problem, we employ a deep deterministic policy gradient (DDPG)-based reinforcement learning approach, capable of efficiently determining optimal model caching and resource allocation decisions for AIGC services in response to user mobility and time-varying channel conditions. Numerical results demonstrate that DDPG achieves a higher model hit ratio and provides superior-quality, lower-latency AIGC services compared to other benchmark solutions.
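
To make the two quantities named in the abstract concrete, the following is a minimal Python sketch, not the paper's formulation. It assumes that the model hit ratio is the fraction of requests whose GenAI model is already cached at the edge, and that the quality-delay trade-off can be expressed as a weighted sum. The `Request` fields, the `weight` parameter, and both function names are hypothetical and are introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One AIGC service request (all fields are illustrative assumptions)."""
    model_id: int    # GenAI model the user asks for
    quality: float   # hypothetical quality score of the generated content
    delay_s: float   # hypothetical end-to-end service delay in seconds


def model_hit_ratio(requests, cached_models):
    """Fraction of requests whose model is already cached at the edge."""
    if not requests:
        return 0.0
    hits = sum(1 for r in requests if r.model_id in cached_models)
    return hits / len(requests)


def tradeoff_cost(requests, weight=0.5):
    """Assumed weighted-sum cost: lower delay and higher quality reduce it.

    `weight` in [0, 1] sets how much delay matters relative to quality.
    """
    if not requests:
        return 0.0
    mean_delay = sum(r.delay_s for r in requests) / len(requests)
    mean_quality = sum(r.quality for r in requests) / len(requests)
    return weight * mean_delay - (1.0 - weight) * mean_quality


# Four requests; models 0 and 1 are cached at the edge server.
requests = [
    Request(model_id=0, quality=1.0, delay_s=1.0),
    Request(model_id=1, quality=0.5, delay_s=2.0),
    Request(model_id=2, quality=0.5, delay_s=3.0),
    Request(model_id=0, quality=0.0, delay_s=2.0),
]
hit_ratio = model_hit_ratio(requests, cached_models={0, 1})
cost = tradeoff_cost(requests, weight=0.5)
print(hit_ratio, cost)
```

In a reinforcement learning setting such as the one described, a quantity like the negative of `tradeoff_cost` could serve as a per-step reward, although the paper's actual reward design may differ.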

Paper Structure

This paper contains 29 sections, 18 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: A schematic of the user-edge-cloud orchestrated architecture for provisioning image-generating AIGC services.
  • Figure 2: The architecture of the DDPG algorithm.
  • Figure 3: Learning rate impact.
  • Figure 4: User number impact.
  • Figure 5: User number impact.