Beam Scheduling in Millimeter Wave Networks Using the Whittle Index

Mandar R. Nalavade, Ravindra S. Tomar, Gaurav S. Kasbekar

TL;DR

This work addresses downlink beam scheduling in a single mmWave cell with limited beams by formulating it as a restless multi-armed bandit problem. Using Whittle index theory, the authors prove indexability and derive a per-user MDP whose optimal policy is a threshold in the queue length; a Whittle index is computed for each user and the scheduling decision in each slot selects the B users with the smallest indices. Empirical results show that the Whittle-index policy outperforms Longest-Queue-First, Max-Weight, Weighted Fair Queuing, and Random scheduling in terms of long-run cost, delay, and energy efficiency. This approach offers a scalable, principled method to manage beam allocation in mmWave networks and can extend to more complex traffic and multi-packet settings.

Abstract

We address the problem of beam scheduling for downlink transmissions in a single-cell millimeter wave (mmWave) network. The cell contains a mmWave base station (mBS) and its associated users. At the end of each time slot, a packet arrives into the queue of a user at the mBS with a certain probability. A holding cost is incurred for the packets stored in a user's queue at the mBS in every time slot. The number of simultaneous beams that the mBS can form to different users is less than the number of associated users. Also, a cost is incurred whenever a beam is formed from the mBS to a user. In a given time slot, a packet transmitted from the mBS to a user that has been assigned a beam is successfully received (respectively, not received) if the channel quality between the mBS and the user is good (respectively, bad). In every time slot, the mBS needs to assign the available beams to a subset of the users, in order to minimize the long-run expected average cost. This problem can be modeled as a restless multi-armed bandit problem, which is provably hard to solve. We prove the Whittle indexability of the above beam scheduling problem and propose a strategy to compute the Whittle index of each user. In each time slot, our proposed beam scheduling policy assigns beams to the users with the smallest Whittle indices. Using extensive simulations, we show that our proposed Whittle index-based beam scheduling policy significantly outperforms several scheduling policies proposed in prior work in terms of the average cost, average delay, as well as energy efficiency.
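The per-slot scheduling step described above (assign the available beams to the users with the smallest Whittle indices) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `select_users` and the index values are hypothetical placeholders, and the computation of the Whittle indices themselves is not shown.

```python
def select_users(whittle_indices, num_beams):
    """Pick the users that receive a beam in the current slot.

    whittle_indices: one index value per user (assumed already computed
                     for each user's current queue state).
    num_beams:       number of beams B the mBS can form simultaneously.
    Returns the sorted list of selected user positions (0-based).
    """
    if not 0 <= num_beams <= len(whittle_indices):
        raise ValueError("num_beams must be between 0 and the number of users")
    # Rank users by index value, smallest first; ties keep the original user order.
    ranked = sorted(range(len(whittle_indices)), key=lambda k: whittle_indices[k])
    return sorted(ranked[:num_beams])


# Hypothetical index values for K = 4 users and B = 3 beams.
example_indices = [2.5, -1.0, 0.3, 4.1]
scheduled = select_users(example_indices, 3)
```

With these placeholder values, users 0, 1, and 2 are scheduled and user 3, which has the largest index, waits for a later slot.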

Paper Structure

This paper contains 15 sections, 8 theorems, 47 equations, 5 figures, 2 tables.

Key Result

Lemma 1

The quantities $V(\cdot)$ and $\eta$ satisfying Value_Function can be derived as: $\lim_{\gamma \uparrow 1} \bar{V}^{\gamma}(\cdot) = V(\cdot)$ and $\lim_{\gamma \uparrow 1} (1-\gamma) V^{\gamma}(0) = \eta$. The constant $\eta$ in Value_Function is unique and is equal to the optimal long-run expecte

Figures (5)

  • Figure 1: The figure shows an example network with $K=4$ and $B=3$.
  • Figure 2: The figure shows that for every state greater than $t$, a beam is allocated to the user and hence a packet departure can take place, while for every state below or equal to $t$, a beam is not allocated to the user, resulting in no packet departure.
  • Figure 3: The figures compare the average costs achieved under the five beam scheduling policies. The following parameter values are used for figure (a): $K=6$, $B=4$, buffer size $=400$, $\textbf{d}=[0.35,0.33,0.31,0.29,0.27,0.25]$, $\textbf{a} = [0.55,0.52,0.49,0.46,0.43,0.4]$, $\textbf{P} = [60,55,50,45,40,35]$, and $\textbf{q} = [30,26,22,18,14,10]$. The following parameter values are used for figure (b): $K=4$, $B=3$, buffer size $=500$, $\textbf{d}=[0.34,0.3,0.28,0.32]$, $\textbf{a} = [0.58,0.56,0.57,0.55]$, $\textbf{P} = [87,74,62,49]$, and $\textbf{q} = [90,60,44,28]$.
  • Figure 4: The figures compare the average costs achieved under the five beam scheduling policies. Buffer size $=200$ for both the plots. The following parameter values are used for figure (a): $B=4$, and different values of $K$ varying from $5$ to $10$. For $K=5$, the following parameter values are used: $\textbf{d}=[0.30,0.28,0.29,0.31,0.28]$, $\textbf{a} = [0.52,0.51,0.5,0.49,0.48]$, $\textbf{P} = [60,57,54,51,48]$, and $\textbf{q} = [80,75,70,65,60]$. For every subsequent addition of the $i^{th}$ user, $i \in \{6, \ldots, 10\}$, the values of $d_i$, $a_i$, $P_i$, and $q_i$ are selected as $0.28 \times (i \mod 2) + 0.29 \times ((i+1) \mod 2)$, $0.53-0.01i$, $63-3i$, and $85- 5i$, respectively, where $x \mod y$ denotes the remainder when $x$ is divided by $y$. The following parameter values are used for figure (b): $K=9$, $\textbf{d}=[0.25,0.241,0.231,0.222,0.213,0.204,0.195,0.186,0.177]$, $\textbf{a} = [0.55,0.545,0.54,0.535,0.53,0.525,0.52,0.515,0.51]$, $\textbf{P} = [120,110,100,90,80,70,60,50,40]$, $\textbf{q} = [90,82,74,66,58,50,42,34,26]$, and different values of $B$ varying from $4$ to $8$.
  • Figure 5: The figures compare the average delays achieved under the five beam scheduling policies. Buffer size $=200$ for both the plots. The following parameter values are used for figure (a): $B=4$, and different values of $K$ varying from $5$ to $9$. For $K=5$, the following parameter values are used: $\textbf{d}=[0.29,0.285,0.28,0.275,0.27]$, $\textbf{a} = [0.56,0.53,0.50,0.47,0.44]$, $\textbf{P} = [56,52,48,44,40]$, and $\textbf{q} = [82,78,74,70,66]$. For every subsequent user $i \in \{6, \ldots, 9\}$, the values of $d_i, a_i, P_i$, and $q_i$ are selected as $0.295-0.005i$, $0.59-0.03i$, $60-4i$, and $86-4i$, respectively. The following parameter values are used for figure (b): $K=9$, $\textbf{d}=[0.28, 0.272,0.253,0.243,0.231,0.222,0.21,0.196,0.187]$, $\textbf{a} = [0.505,0.504,0.503,0.502,0.503,0.502,0.503,0.502,0.503]$, $\textbf{P} = [60,55,50,45,40,35,30,25,20]$, $\textbf{q} = [85,77,69,61,53,45,37,29,21]$, and different values of $B$ varying from $4$ to $8$.
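
The rule in the Figure 4(a) caption for extending the $K=5$ parameter vectors to users $6, \ldots, 10$ can be written out as a short sketch. The function name `figure4a_parameters` is a hypothetical helper introduced here for illustration; the base vectors and formulas are copied from the caption, and the meaning of $\textbf{d}$, $\textbf{a}$, $\textbf{P}$, and $\textbf{q}$ is not restated in this summary, so the code treats them only as named lists.

```python
def figure4a_parameters(num_users):
    """Build the Figure 4(a) parameter lists for K = num_users (5 <= K <= 10)."""
    if not 5 <= num_users <= 10:
        raise ValueError("the caption defines parameters only for K from 5 to 10")
    # Base values given in the caption for K = 5.
    d = [0.30, 0.28, 0.29, 0.31, 0.28]
    a = [0.52, 0.51, 0.50, 0.49, 0.48]
    P = [60, 57, 54, 51, 48]
    q = [80, 75, 70, 65, 60]
    # Users are numbered from 1 in the caption, so i runs over 6, ..., K.
    for i in range(6, num_users + 1):
        d.append(0.28 * (i % 2) + 0.29 * ((i + 1) % 2))  # 0.29 for even i, 0.28 for odd i
        a.append(round(0.53 - 0.01 * i, 2))              # rounded to avoid float noise
        P.append(63 - 3 * i)
        q.append(85 - 5 * i)
    return d, a, P, q


d, a, P, q = figure4a_parameters(10)
```

For the last three vectors, the formulas also reproduce the listed $K=5$ values when evaluated at $i = 1, \ldots, 5$; the $\textbf{d}$ vector for the first five users does not follow the alternating rule and is taken directly from the caption.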

Theorems & Definitions (10)

  • Remark 1
  • Remark 2
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Lemma 5
  • Lemma 6
  • Lemma 7
  • Theorem 1