Delay-Constrained Grant-Free Random Access in MIMO Systems: Distributed Pilot Allocation and Power Control

Jianan Bai, Zheng Chen, Erik G. Larsson

TL;DR

This work develops a distributed, cross-layer policy that allows the users to dynamically and independently choose their pilots and transmit powers to achieve a high effective sum throughput with fairness consideration, and proposes a deep learning-based, multi-agent control framework with centralized training and distributed execution.

Abstract

We study a delay-constrained grant-free random access system with a multi-antenna base station. The users randomly generate data packets with expiration deadlines, which are then transmitted from data queues on a first-in, first-out basis. To deliver a packet, a user needs to succeed in both the random access phase (sending a pilot without collision) and the data transmission phase (achieving a required data rate with imperfect channel information) before the packet expires. We develop a distributed, cross-layer policy that allows the users to dynamically and independently choose their pilots and transmit powers to achieve a high effective sum throughput with fairness consideration. Our policy design involves three key components: 1) a proxy of the instantaneous data rate that depends only on macroscopic environment variables and transmission decisions, considering pilot collisions and imperfect channel estimation; 2) a quantitative, instantaneous measure of fairness within each communication round; and 3) a deep learning-based, multi-agent control framework with centralized training and distributed execution. The proposed framework benefits from an accurate, differentiable objective function for training, thereby achieving a higher sample efficiency compared with a conventional application of model-free, multi-agent reinforcement learning algorithms. The performance of the proposed approach is verified by simulations under highly dynamic and heterogeneous scenarios.
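As a rough illustration of the random access phase and the deadline-driven queues described above, the following Python sketch checks which users sent a pilot without collision and drops expired packets from the head of a FIFO queue. It is a minimal sketch, not the authors' implementation: the function names, the representation of a queue as a deque of deadlines, and the rule that a packet expires once its deadline is strictly earlier than the current round are all assumptions made for illustration. The data transmission phase (the rate requirement under imperfect channel information) is not modeled.

```python
from collections import Counter, deque


def collision_free_users(pilot_choices):
    """Return the set of users whose pilot was chosen by nobody else.

    pilot_choices maps a (hypothetical) user id to a pilot index,
    or to None if the user stays silent in this round.
    """
    counts = Counter(p for p in pilot_choices.values() if p is not None)
    return {
        user
        for user, pilot in pilot_choices.items()
        if pilot is not None and counts[pilot] == 1
    }


def drop_expired(queue, now):
    """Drop expired packets from the head of a FIFO queue of deadlines.

    Assumption: deadlines are non-decreasing along the queue and a packet
    is expired when its deadline is strictly earlier than `now`.
    Returns the number of dropped packets.
    """
    dropped = 0
    while queue and queue[0] < now:
        queue.popleft()
        dropped += 1
    return dropped


if __name__ == "__main__":
    # Users 0 and 1 collide on pilot 1, user 2 is alone on pilot 0, user 3 is silent.
    choices = {0: 1, 1: 1, 2: 0, 3: None}
    print(collision_free_users(choices))

    # Packets with deadlines 1 and 2 have expired by round 3.
    q = deque([1, 2, 5])
    print(drop_expired(q, now=3), list(q))
```

In this toy model a user clears the random access phase only if it appears in the returned set; whether its packet is then delivered would additionally depend on the achieved data rate.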

Paper Structure

This paper contains 27 sections, 1 theorem, 58 equations, 7 figures, 1 table.

Key Result

Proposition 1

Figures (7)

  • Figure 1: An illustration of the mapping defined by the fairness-promoting function.
  • Figure 2: The structure of the policy network.
  • Figure 3: Performance comparison (averaged over 8 independent trials and over every 10 epochs).
  • Figure 4: Normalized packet drop rate per user (single trial, averaged over 10 epochs). The requirement line represents $\overline{D}_i/D_i^{\textup{th}} = 1$.
  • Figure 5: Visualization of the learned policy.
  • ...and 2 more figures

Theorems & Definitions (6)

  • Definition 1
  • Proposition 1
  • Remark 1
  • Remark 2
  • Remark 3
  • Remark 4