Doubly-Dynamic ISAC Precoding for Vehicular Networks: A Constrained Deep Reinforcement Learning (CDRL) Approach

Zonghui Yang; Shijian Gao; Xiang Cheng

TL;DR

This work proposes a constrained deep reinforcement learning approach that dynamically updates the ISAC precoder. A primal-dual deep deterministic policy gradient algorithm and the Wolpertinger architecture are tailored for training, and experiments show that the scheme outperforms existing candidates.

Abstract

Integrated sensing and communication (ISAC) technology is essential for supporting vehicular networks. However, the communication channel in this scenario varies over time, and potential targets may move rapidly, resulting in double dynamics. This property poses a challenge for real-time precoder design. While optimization-based solutions are widely researched, they are complex and rely heavily on perfect channel-related information, which is impractical to obtain under double dynamics. To address this challenge, we propose using constrained deep reinforcement learning to dynamically update the ISAC precoder. Additionally, the primal-dual deep deterministic policy gradient algorithm and the Wolpertinger architecture are tailored to train the agent efficiently under complex constraints and varying numbers of users. The proposed scheme not only adapts to the dynamics based on observations but also leverages environmental information to enhance performance and reduce complexity. Its superiority over existing candidates has been validated through experiments.
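
The two training ingredients named in the abstract can be illustrated with a minimal, generic sketch. It is not the paper's algorithm: the function names, step size, cost limit, candidate action set, and toy Q-function below are all hypothetical, chosen only to show the textbook form of a primal-dual multiplier update and of Wolpertinger-style action selection (map a continuous proto-action to its k nearest discrete candidates, then keep the one the critic scores highest).

```python
import math


def dual_update(lam, avg_cost, cost_limit, step_size):
    """Projected dual ascent on the Lagrange multiplier.

    The multiplier grows while the average cost exceeds its limit and
    shrinks (never below zero) once the constraint is satisfied.
    """
    return max(0.0, lam + step_size * (avg_cost - cost_limit))


def lagrangian_reward(reward, cost, lam):
    """Scalar signal a primal (actor-critic) step would maximise."""
    return reward - lam * cost


def wolpertinger_select(proto_action, candidates, q_value, k):
    """Return the highest-valued of the k candidates nearest the proto-action."""
    nearest = sorted(candidates, key=lambda a: math.dist(a, proto_action))[:k]
    return max(nearest, key=q_value)


if __name__ == "__main__":
    # Hypothetical constraint: keep the average cost at or below 1.0.
    lam = 0.0
    for avg_cost in (1.5, 1.2, 0.8):
        lam = dual_update(lam, avg_cost, cost_limit=1.0, step_size=0.5)
        print("multiplier:", lam)

    # Hypothetical discrete action set and toy critic.
    actions = [(0, 0), (1, 0), (0, 1), (1, 1)]
    toy_q = lambda a: a[0] + 2 * a[1]
    print(wolpertinger_select((0.9, 0.2), actions, toy_q, k=2))
```

In this sketch, a larger k trades extra critic evaluations for a better chance of finding a high-value discrete action, while k = 1 reduces to plain nearest-neighbour rounding of the proto-action.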

Paper Structure (18 sections, 21 equations, 4 figures, 1 table, 1 algorithm)

Introduction
System Model
Transmitted Waveform
Communication Model
Sensing Model
Optimization-based ISAC Precoding with Perfect Prior
Constrained DRL-based Precoding for ISAC without Perfect Prior
Constrained Markov Decision Process Formulation
State Space
Action Space
Reward and Cost Functions
State Transition
Learning Algorithm
Learning Architecture
Action Selection
...and 3 more sections

Figures (4)

Figure 1: An illustration of the ISAC system in doubly-dynamic scenarios.
Figure 2: The proposed CDRL-based framework for ISAC precoding.
Figure 3: Cumulative reward and cost versus epochs.
Figure 4: ISAC performance comparison among various schemes.
