Age-Based Scheduling for Mobile Edge Computing: A Deep Reinforcement Learning Approach

Xingqiu He, Chaoqun You, Tony Q. S. Quek

TL;DR

This work introduces a refined AoI concept for MEC where status updates are event-driven and require processing, and casts online AoI minimization as a CMDP with energy and bandwidth constraints. To accelerate learning, it leverages Post-Decision States (PDS) to separate known and unknown dynamics and then enhances scalability with a Deep Deterministic Policy Gradient (DDPG) framework tailored to average-reward MDPs. The resulting Deep PDS Learning (DPDS) approach, including cost redesign, input normalization, and specialized neural architectures, significantly outperforms baselines in AoI reduction while respecting energy budgets. The proposed method offers practical benefits for real-time MEC applications by enabling faster convergence, robustness to high-dimensional state/action spaces, and effective joint optimization of computation and communication resources.
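The idea of separating known from unknown dynamics can be illustrated on a toy problem. The sketch below is not the paper's DPDS algorithm: it is a tabular, discounted-cost example on a hypothetical single-queue model, whereas the paper targets an average-reward CMDP with a DDPG-based learner. All names (`pds_value_learning`, `cost`, `post_decision`) and parameter values are assumptions made for illustration. The cost function and the effect of the action on the backlog are treated as known, the random arrivals are treated as unknown, and a value function is learned only over post-decision states.

```python
import random


def pds_value_learning(buffer_size=4, max_serve=2, arrival_prob=0.5,
                       energy_cost=0.5, gamma=0.9, alpha=0.05,
                       steps=50000, seed=0):
    """Tabular PDS value learning on a toy single-queue model (illustrative only)."""
    rng = random.Random(seed)
    # V[p]: estimated discounted future cost from post-decision backlog p.
    V = [0.0] * (buffer_size + 1)

    def cost(backlog, served):
        # Known part of the model: holding cost plus energy cost.
        return backlog + energy_cost * served

    def post_decision(backlog, served):
        # Known part of the model: backlog right after serving.
        return backlog - served

    def greedy(backlog):
        # Best action under the current estimate; needs only known terms and V.
        actions = range(min(backlog, max_serve) + 1)
        return min(actions,
                   key=lambda a: cost(backlog, a) + V[post_decision(backlog, a)])

    backlog = 0
    for _ in range(steps):
        # Random behaviour action so that every post-decision state is visited.
        served = rng.randrange(min(backlog, max_serve) + 1)
        pds = post_decision(backlog, served)
        # Unknown part of the model: a random arrival that is observed, not modelled.
        arrival = 1 if rng.random() < arrival_prob else 0
        next_backlog = min(pds + arrival, buffer_size)
        # Sampled target: gamma * min_a [known cost + V(next post-decision state)].
        best = greedy(next_backlog)
        target = gamma * (cost(next_backlog, best)
                          + V[post_decision(next_backlog, best)])
        V[pds] += alpha * (target - V[pds])
        backlog = next_backlog
    return V


values = pds_value_learning()
print([round(v, 2) for v in values])
```

Because the update target does not depend on which action led to the post-decision state, a single observed arrival improves the estimate for every state-action pair that maps to that post-decision state. This sharing is the usual intuition for why PDS-based learning can converge faster than learning a full state-action value table.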

Abstract

With the rapid development of Mobile Edge Computing (MEC), various real-time applications have been deployed to benefit people's daily lives. The performance of these applications relies heavily on the freshness of collected environmental information, which can be quantified by its Age of Information (AoI). In the traditional definition of AoI, it is assumed that the status information can be actively sampled and directly used. However, for many MEC-enabled applications, the desired status information is updated in an event-driven manner and necessitates data processing. To better serve these applications, we propose a new definition of AoI and, based on the redefined AoI, we formulate an online AoI minimization problem for MEC systems. Notably, the problem can be interpreted as a Markov Decision Process (MDP), thus enabling its solution through Reinforcement Learning (RL) algorithms. Nevertheless, traditional RL algorithms are designed for MDPs with completely unknown system dynamics and hence usually suffer from long convergence times. To accelerate the learning process, we introduce Post-Decision States (PDSs) to exploit the partial knowledge of the system's dynamics. We also combine PDSs with deep RL to further improve the algorithm's applicability, scalability, and robustness. Numerical results demonstrate that our algorithm outperforms the benchmarks under various scenarios.
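As a point of reference for the traditional definition mentioned above, the following minimal sketch computes the classical discrete-time AoI: the current slot minus the generation slot of the freshest update delivered so far. It illustrates only the conventional notion, not the redefined AoI proposed in the paper. The function name `aoi_trace` and the `deliveries` mapping are hypothetical, and the sketch assumes that a fresh update is available at slot 0.

```python
def aoi_trace(horizon, deliveries):
    """Classical AoI per slot.

    deliveries maps a delivery slot to the generation slot of the update
    that arrives at the destination in that slot (hypothetical input format).
    """
    latest_generation = 0  # assumption: a fresh update exists at slot 0
    trace = []
    for t in range(horizon):
        if t in deliveries:
            # Keep only the freshest update received so far.
            latest_generation = max(latest_generation, deliveries[t])
        # Age = time elapsed since the freshest received update was generated.
        trace.append(t - latest_generation)
    return trace


# Updates generated at slots 2 and 5 are delivered at slots 3 and 6.
print(aoi_trace(8, {3: 2, 6: 5}))  # [0, 1, 2, 1, 2, 3, 1, 2]
```

The age grows by one per slot and drops only when a fresher update is delivered, which gives the familiar sawtooth shape. In this classical form the age drops as soon as an update arrives, with no allowance for event-driven generation or for processing the data before it becomes usable.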

Paper Structure

This paper contains 29 sections, 2 theorems, 42 equations, 10 figures, 1 table, and 1 algorithm.

Key Result

Theorem 1

There exist non-negative Lagrangian multipliers $\lambda^* = (\lambda^*_1, \dots, \lambda^*_N) \geq 0$ such that the optimal value of the CMDP with initial state $s$ can be computed as where

Figures (10)

  • Figure 1: A simple example of the considered system model.
  • Figure 2: Evolution trace of $a_i(t)$.
  • Figure 3: Algorithm structure of DDPG-based deep PDS learning.
  • Figure 4: Neural network architecture used in our algorithm.
  • Figure 5: Locations of WDs and the BS.
  • ...and 5 more figures

Theorems & Definitions (2)

  • Theorem 1
  • Theorem 2