Delay-Optimal Transmission Scheduling Policies for Time-Correlated Fading Channels

Manali Dutta, Gourav Saha, Rahul Singh, Ness B. Shroff

TL;DR

This work designs dynamic scheduling policies that minimize end-to-end packet delays while keeping packet transmission costs low. To the best of the authors' knowledge, it is the first POMDP formulation for a mmWave network with partial channel state information that considers delay minimization.

Abstract

Millimeter-wave (mmWave) networks have the potential to support the high-throughput and low-latency requirements of 5G-and-beyond communication standards. However, transmissions in this band are highly vulnerable to attenuation and blockages from humans, buildings, and foliage, which increase end-to-end packet delays. This work designs dynamic scheduling policies that minimize end-to-end packet delays while keeping packet transmission costs low. Specifically, we consider a mmWave network in which a transmitter sends data packets over an unreliable communication channel modeled as a Gilbert-Elliott channel. The transmitter operates under an ACK/NACK feedback model and does not observe the channel state unless it attempts a transmission. The objective is to minimize a weighted average cost consisting of end-to-end packet delays and packet transmission costs. We pose this dynamic optimization problem as a partially observable Markov decision process (POMDP). To the best of our knowledge, this is the first POMDP formulation for a mmWave network with partial channel state information that considers delay minimization. We show that the POMDP admits a solution that has a threshold structure, i.e., for each queue length, the range of the belief (the conditional probability that the channel is in a good state) is partitioned into intervals, and the transmitter sends j packets when the belief lies in the j-th interval. We then consider the case in which system parameters such as the packet arrival rate and the channel transition probabilities are not known, and leverage these structural results to use the actor-critic algorithm to efficiently search for a locally optimal policy.
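The belief mentioned in the abstract can be illustrated with a minimal sketch of a one-step update for a two-state channel. This is not the paper's exact formulation: it assumes that $p_{01}$ and $p_{11}$ (the symbols used in the figure captions) denote the probabilities of moving to the good state from the bad and good states respectively, and that an ACK occurs exactly when the channel is good. The function name `next_belief` is hypothetical.

```python
def next_belief(b, transmitted, ack, p01, p11):
    """One-step belief update for a two-state (Gilbert-Elliott) channel.

    b   : current probability that the channel is in the good state
    p01 : P(good next | bad now)   (assumed meaning of the symbol)
    p11 : P(good next | good now)  (assumed meaning of the symbol)

    Assumption: a transmission reveals the current state through the
    feedback, with ACK meaning "good" and NACK meaning "bad".
    """
    if transmitted:
        # The feedback pins down the current state, so the next belief
        # is the corresponding transition probability.
        return p11 if ack else p01
    # No transmission, no observation: propagate the belief through
    # the Markov chain.
    return b * p11 + (1.0 - b) * p01


# Idle slot, then a successful transmission.
b = next_belief(0.5, transmitted=False, ack=False, p01=0.2, p11=0.9)
b = next_belief(b, transmitted=True, ack=True, p01=0.2, p11=0.9)
```

Under these assumptions the belief resets to $p_{11}$ or $p_{01}$ after every transmission attempt and drifts toward the stationary probability of the good state while the transmitter stays idle.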

Paper Structure

This paper contains 17 sections, 18 theorems, 49 equations, 13 figures, 1 algorithm.

Key Result

Theorem 1

Consider the $\beta$-discounted POMDP (def:discountedpomdp). There exists a threshold policy that is optimal.

Figures (13)

  • Figure 1: Our system model with the arrival process, $A\left(t\right)$, queue length $Q\left(t\right)$, channel state, $s\left(t\right)$, and the decision variable, $u\left(t\right)$.
  • Figure 2: Pictorial depiction of the threshold structure of a policy that is optimal for the $\beta$-discounted problem for $M_{d}=2$: (a) Diagram showing that the sets $\mathcal{S}_{q}^{(\beta)}\left(u\right)$ are contiguous intervals for $u\geq1$. The blue arrows in "incorrect structure" show the set $\mathcal{S}_{q}^{(\beta)}\left(1\right)$ consisting of two disjoint intervals. This is not possible. (b) Diagram showing the ordering of the sets $\mathcal{S}_{q}^{(\beta)}\left(u\right)$ for $u\geq1$. The blue arrows in "incorrect structure" show the elements of the set $\mathcal{S}_{q}^{(\beta)}\left(2\right)$ being smaller than those of $\mathcal{S}_{q}^{(\beta)}\left(1\right)$. This is not possible. (c) Diagram showing that the set $\mathcal{S}_{q}^{(\beta)}\left(0\right)$ is a contiguous interval. The blue arrows in "incorrect structure" show the set $\mathcal{S}_{q}^{(\beta)}\left(0\right)$ consisting of two disjoint intervals. This is not possible.
  • Figure 3: Comparison of the optimal policy obtained using RVI with two sub-optimal policies: (a) the weight $\kappa$ in (def:pomdp) is varied with fixed $p_{01} = 0.2$, $p_{11} = 0.9$, $p_1 = 0.9$; (b) the arrival probability $p_{1}$ is varied with fixed $\kappa =1$, $p_{01} = 0.2$, $p_{11} = 0.9$; and (c) the Markovian blockage parameters are varied through $\delta_p = |p_{11} - p_{01}|$, with $p_{11}$ fixed at $0.9$ and $\kappa =1$.
  • Figure 4: Decision regions for $M_d = 2$. For the state $(q,b)$, if $b \in [0, \tau^{(1)}(q))$, then it is optimal to transmit $0$ packets; if $b \in [\tau^{(1)}(q), \tau^{(2)}(q))$, then transmitting $1$ packet is optimal; and if $b \in [\tau^{(2)}(q), 1]$, then it is optimal to transmit $2$ packets. In general, for the state $(q,b)$, if $b \in [\tau^{(j)}(q), \tau^{(j+1)}(q))$ for some $j = 0, 1, \ldots, M_d$, with $\tau^{(0)}(q) = 0$ and $\tau^{(M_d + 1)}(q) = 1$, then it is optimal to transmit $j$ packets.
  • Figure 5: (a) Threshold-type policy for $M_d = 1$ with $\tau^{(1)}(q) = 0.9 - 0.08q$, and (b), (c), (d) are its approximations for different values of $\theta_3$ for the parameterized policy $\pi_{\theta}(1|q,b) = \frac{1}{1 + \exp(- {\theta_3} (b - \tau^{(1)}(q)))}$.
  • ...and 8 more figures
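
The decision rule in the Figure 4 caption and the smooth approximation in the Figure 5 caption can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the thresholds passed to `threshold_action` are made-up example values, and only $\tau^{(1)}(q) = 0.9 - 0.08q$ and the sigmoid form of $\pi_{\theta}(1|q,b)$ are taken from the captions.

```python
import bisect
import math


def threshold_action(b, thresholds):
    """Return j such that tau^(j) <= b < tau^(j+1).

    thresholds = [tau^(1)(q), ..., tau^(M_d)(q)] for a fixed queue
    length q, assumed sorted in nondecreasing order. The conventions
    tau^(0) = 0 and tau^(M_d+1) = 1 are implicit.
    """
    # bisect_right counts how many thresholds are <= b, which is the
    # index of the interval containing b.
    return bisect.bisect_right(thresholds, b)


def tau1(q):
    """Example threshold from the Figure 5 caption (M_d = 1)."""
    return 0.9 - 0.08 * q


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def soft_policy_prob(q, b, theta3):
    """pi_theta(1 | q, b): probability of sending one packet."""
    return sigmoid(theta3 * (b - tau1(q)))


# Hard thresholds for M_d = 2 (example values, not from the paper).
actions = [threshold_action(b, [0.3, 0.7]) for b in (0.2, 0.3, 0.7, 1.0)]

# A larger theta3 makes the soft policy closer to the hard threshold.
p_above = soft_policy_prob(q=0, b=0.95, theta3=200.0)
p_below = soft_policy_prob(q=0, b=0.80, theta3=200.0)
```

In this sketch $\theta_3$ acts as a sharpness parameter: the probability is one half at $b = \tau^{(1)}(q)$, and as $\theta_3$ grows the sigmoid approaches the step function of the threshold policy. A smooth parameterization of this kind is differentiable in its parameters, which is what gradient-based methods such as actor-critic require.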

Theorems & Definitions (19)

  • Definition 1
  • Theorem 1
  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Proposition 4
  • Theorem 2
  • Proposition 5
  • Theorem 3
  • Theorem 4
  • ...and 9 more