Attention-Enhanced Prioritized Proximal Policy Optimization for Adaptive Edge Caching

Farnaz Niknia, Ping Wang, Zixu Wang, Aakash Agarwal, Adib S. Rezaei

TL;DR

This work models the caching problem as a Semi-Markov Decision Process (SMDP) to better capture the continuous-time nature of real-world applications, enabling caching decisions to be triggered by random file requests. It also introduces a Proximal Policy Optimization (PPO)-based caching strategy that fully considers file attributes such as lifetime, size, and priority.

Abstract

This paper tackles the growing issue of excessive data transmission in networks. As traffic increases, backhaul links and core networks come under significant load, which motivates caching at edge routers. Many existing studies model caching problems with Markov Decision Processes (MDPs) and often assume decision points at fixed intervals; real-world environments, however, are characterized by random request arrivals. Additionally, critical file attributes such as lifetime, size, and priority significantly affect the effectiveness of caching policies, yet existing research fails to integrate all of these attributes into policy design. In this work, we model the caching problem as a Semi-Markov Decision Process (SMDP) to better capture the continuous-time nature of real-world applications, enabling caching decisions to be triggered by random file requests. We then introduce a Proximal Policy Optimization (PPO)-based caching strategy that fully considers file attributes such as lifetime, size, and priority. Simulations show that our method outperforms a recent Deep Reinforcement Learning-based technique. To further advance our research, we improve the convergence rate of PPO by prioritizing transitions within the replay buffer through an attention mechanism. This mechanism evaluates the similarity between the current state and all stored transitions, assigning higher priorities to transitions that exhibit greater similarity.
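
The similarity-based prioritization described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring function, so this example assumes scaled dot-product attention between the current state (query) and each stored state (key), followed by a softmax; the names `attention_priorities` and `sample_transitions`, the dictionary layout of a transition, and the toy feature vectors are all hypothetical and are not taken from the paper.

```python
import math
import random


def attention_priorities(current_state, stored_states):
    """Return softmax-normalized similarity scores, one per stored transition.

    Assumption: similarity is a scaled dot product between the current state
    and each stored state. The paper's exact scoring function may differ.
    """
    scale = math.sqrt(len(current_state))
    scores = [
        sum(q * k for q, k in zip(current_state, state)) / scale
        for state in stored_states
    ]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def sample_transitions(buffer, current_state, batch_size, rng=random):
    """Sample transitions (with replacement) in proportion to their priority."""
    states = [transition["state"] for transition in buffer]
    priorities = attention_priorities(current_state, states)
    return rng.choices(buffer, weights=priorities, k=batch_size)


# Hypothetical toy buffer: each state is a small feature vector.
buffer = [
    {"state": [1.0, 0.0, 0.0], "action": 0, "reward": 0.2},
    {"state": [0.9, 0.1, 0.0], "action": 1, "reward": 0.5},
    {"state": [0.0, 0.0, 1.0], "action": 0, "reward": -0.1},
]
current_state = [1.0, 0.0, 0.0]
priorities = attention_priorities(current_state, [t["state"] for t in buffer])
batch = sample_transitions(buffer, current_state, batch_size=2)
```

In this toy buffer the first stored state matches the current state exactly, so it receives the highest priority and is the most likely to be drawn, while the dissimilar third state is sampled least often. Sampling with replacement is a simplification chosen for brevity, not a claim about the authors' implementation.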

Paper Structure

This paper contains 33 sections, 15 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Caching system topology
  • Figure 2: Total hit counts for different values for $\eta$
  • Figure 3: Total utility for a) different values for $\eta$
  • Figure 4: Total hit count for different values for $\lambda$
  • Figure 5: Total utility for different values for $\lambda$
  • ...and 3 more figures