Structure-Enhanced DRL for Optimal Transmission Scheduling

Jiazheng Chen; Wanchun Liu; Daniel E. Quevedo; Saeed R. Khosravirad; Yonghui Li; Branka Vucetic

TL;DR

The paper develops a structure-enhanced deep reinforcement learning (DRL) framework that schedules the transmissions of a remote estimation system to minimize the overall estimation mean-square error (MSE). It also proposes a structure-enhanced action selection method, which tends to select actions that obey the structure of the optimal policy.

Abstract

Remote state estimation of large-scale distributed dynamic processes plays an important role in Industry 4.0 applications. In this paper, we focus on the transmission scheduling problem of a remote estimation system. First, we derive some structural properties of the optimal sensor scheduling policy over fading channels. Then, building on these theoretical guidelines, we develop a structure-enhanced deep reinforcement learning (DRL) framework for optimal scheduling of the system to achieve the minimum overall estimation mean-square error (MSE). In particular, we propose a structure-enhanced action selection method, which tends to select actions that obey the policy structure. This explores the action space more effectively and enhances the learning efficiency of DRL agents. Furthermore, we introduce a structure-enhanced loss function that penalizes actions that do not follow the policy structure. The new loss function guides the DRL agent to converge to the optimal policy structure quickly. Our numerical experiments illustrate that the proposed structure-enhanced DRL algorithms can reduce the training time by 50% and the remote estimation MSE by 10% to 25% when compared to benchmark DRL algorithms. In addition, we show that the derived structural properties exist in a wide range of dynamic scheduling problems that go beyond remote state estimation.
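The idea of structure-enhanced action selection can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a single channel, a state made of one age-of-information (AoI) value per sensor, an action equal to the index of the scheduled sensor, and a hypothetical threshold rule under which a sensor that is scheduled at a smaller AoI should also be scheduled at a larger one. The names `structure_enhanced_action`, `q`, `epsilon`, and `xi` are illustrative only.

```python
import random
from typing import Callable, Sequence, Tuple

# Assumption: the state holds one AoI value (>= 1) per sensor.
State = Tuple[int, ...]
QFunction = Callable[[State], Sequence[float]]


def greedy_action(q: QFunction, state: State) -> int:
    """Return the index of the action with the largest Q-value."""
    values = q(state)
    return max(range(len(values)), key=lambda a: values[a])


def structure_enhanced_action(
    q: QFunction,
    state: State,
    epsilon: float,
    xi: float,
    rng: random.Random,
) -> int:
    """Epsilon-greedy selection biased toward a threshold structure.

    With probability epsilon a random action is explored. Otherwise, with
    probability xi, the agent checks the neighbouring states in which one
    sensor's AoI is one step smaller; if the greedy action there schedules
    that sensor, the same sensor is scheduled here (hypothetical rule).
    """
    n_actions = len(q(state))
    if rng.random() < epsilon:
        return rng.randrange(n_actions)

    if rng.random() < xi:
        for n in range(len(state)):
            if state[n] > 1:
                lower = state[:n] + (state[n] - 1,) + state[n + 1:]
                if greedy_action(q, lower) == n:
                    return n  # action consistent with the assumed structure

    return greedy_action(q, state)  # fall back to the plain greedy action


# Toy Q-table (made-up numbers) in which the greedy action at (3, 1)
# violates the assumed structure implied by the greedy action at (2, 1).
toy_table = {(3, 1): [0.1, 0.9], (2, 1): [0.8, 0.2]}


def toy_q(state: State) -> Sequence[float]:
    return toy_table.get(state, [0.0, 0.0])


rng = random.Random(0)
plain = structure_enhanced_action(toy_q, (3, 1), epsilon=0.0, xi=0.0, rng=rng)
enhanced = structure_enhanced_action(toy_q, (3, 1), epsilon=0.0, xi=1.0, rng=rng)
```

In this toy table the plain greedy choice at state `(3, 1)` is sensor 1, while the structure-aware choice is sensor 0, because sensor 0 is already preferred at the smaller AoI `(2, 1)`. The parameter `xi` controls how often the structural hint overrides the greedy action.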

Paper Structure (29 sections, 9 theorems, 77 equations, 8 figures, 2 tables, 2 algorithms)

Introduction
System Model
Dynamic Process Model and Local State Estimation
Wireless Communications and Remote State Estimation
Problem Formulation and Threshold structure
MDP Formulation
Threshold Structure of the Optimal MDP Solution
Threshold Structure of Optimal Policies
Channel-State Threshold Property
AoI-State Threshold Property
Two-sensor-single-channel systems
Multi-sensor-single-channel systems
Multi-sensor-multi-channel systems
Structure-Enhanced DRL
Structure-Enhanced DQN
...and 14 more sections

Key Result

Lemma 1

If the optimal policy exists, then the operator $\mathsf{B}$ has a unique fixed point $V^{*} \in \mathcal{V}$ and for all $V^{0} \in \mathcal{V}$, the sequence $\{V^{\tilde{t}}\}$ defined by $V^{\tilde{t}+1} = \mathsf{B} [V^{\tilde{t}}]$ converges in norm to $V^{*}$, i.e.
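The fixed-point iteration $V^{\tilde{t}+1} = \mathsf{B}[V^{\tilde{t}}]$ can be illustrated on a toy problem. The sketch below is not the paper's operator: it assumes a finite MDP with a discounted, cost-minimizing Bellman operator, and the transition matrices `P`, costs `c`, and discount factor `gamma` are made-up values chosen only to show that different starting points reach the same fixed point.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def bellman_operator(
    V: Sequence[float], P: List[Matrix], c: Matrix, gamma: float
) -> List[float]:
    """Apply one discounted Bellman update (assumed cost-minimizing form).

    P[a][s][j] is the probability of moving from state s to state j under
    action a, and c[s][a] is the one-step cost of action a in state s.
    """
    n_states = len(V)
    n_actions = len(P)
    return [
        min(
            c[s][a] + gamma * sum(P[a][s][j] * V[j] for j in range(n_states))
            for a in range(n_actions)
        )
        for s in range(n_states)
    ]


def value_iteration(
    P: List[Matrix],
    c: Matrix,
    v0: Sequence[float],
    gamma: float = 0.9,
    tol: float = 1e-10,
    max_iter: int = 10_000,
) -> List[float]:
    """Iterate V <- B[V] until successive iterates differ by less than tol."""
    V = list(v0)
    for _ in range(max_iter):
        V_next = bellman_operator(V, P, c, gamma)
        if max(abs(x - y) for x, y in zip(V_next, V)) < tol:
            return V_next
        V = V_next
    return V


# Toy MDP with 2 states and 2 actions (illustrative numbers only).
P = [
    [[0.9, 0.1], [0.5, 0.5]],  # transitions under action 0
    [[0.2, 0.8], [0.1, 0.9]],  # transitions under action 1
]
c = [[1.0, 2.0], [0.5, 3.0]]

V_a = value_iteration(P, c, v0=[0.0, 0.0])
V_b = value_iteration(P, c, v0=[100.0, -50.0])
gap = max(abs(x - y) for x, y in zip(V_a, V_b))
```

Because the assumed operator is a contraction for `gamma < 1`, both runs end at essentially the same vector, so `gap` is negligible; this mirrors the uniqueness and convergence statement of the lemma in a much simpler setting.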

Figures (8)

Figure 1: Remote state estimation system with $N$ processes and $M$ channels.
Figure 2: Structure of the optimal scheduling policy with $N=2$ and $M=1$, where $\bullet$ and $\times$ represent the schedule of sensor 1 and 2, respectively.
Figure 3: The optimal policy of a two-sensor-single-channel scheduling problem with the multiplicative reward function, where $\bullet$ and $\times$ represent the schedule of sensor 1 and 2, respectively.
Figure 4: Average sum MSE of all processes during training with $N=6, M=3$.
Figure 5: Average sum MSE of all processes during training with $N\!=10, M\!=5$.
...and 3 more figures

Theorems & Definitions (23)

Definition 1: Channel-State Threshold Policy
Definition 2: AoI-State Threshold Policy
Lemma 1: [puterman1990markov, hernandez2012further]
Lemma 2: Monotonicity
proof
Theorem 1
proof
Theorem 2
Remark 1: Analytical Challenges
proof
...and 13 more
