Causality-Driven Reinforcement Learning for Joint Communication and Sensing

Anik Roy; Serene Banerjee; Jishnu Sadasivan; Arnab Sarkar; Soumyajit Dey

TL;DR

This work tackles the challenge of learning high-dimensional beam patterns in joint communication and sensing (JCAS) for mMIMO systems by introducing a causality-aware reinforcement learning framework. The authors deploy a state-wise action refinement mechanism (TD3-INVASE) that discovers the relevant action dimensions and prunes the rest, enabling efficient exploration of the beam codebook for both communication and sensing beams. They extend a three-stage communication design and a three-stage sensing design with a causal discovery component, achieving higher beamforming gain and better sample efficiency than baselines in DeepMIMO-generated scenarios. The approach supports online adaptation and may generalize through causal reasoning, which matters for low-overhead, high-performance JCAS in dynamic wireless environments.

Abstract

Next-generation wireless networks (6G and beyond) are envisioned to integrate communication and sensing to overcome interference, improve spectrum efficiency, and reduce hardware and power consumption. Massive Multiple-Input Multiple-Output (mMIMO)-based Joint Communication and Sensing (JCAS) systems realize this integration for 6G applications such as autonomous driving, which requires accurate environmental sensing and time-critical communication with neighboring vehicles. Reinforcement Learning (RL) has been used for mMIMO antenna beamforming in the existing literature. However, the huge action search space associated with antenna beamforming makes learning inefficient for the RL agent because of the high beam training overhead. In addition, the learning process does not consider the causal relationship between the action space and the reward, and gives all actions equal importance. In this work, we explore a causally-aware RL agent that can intervene and discover causal relationships in mMIMO-based JCAS environments during the training phase. We use a state-dependent action dimension selection strategy to realize causal discovery for RL-based JCAS. Evaluation of the causally-aware RL framework in different JCAS scenarios shows that the proposed framework outperforms baseline methods in terms of beamforming gain.
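To make the idea of state-dependent action dimension selection concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a constant-modulus phase-shifter beam, a toy line-of-sight channel, and a hypothetical top-k selector driven by made-up relevance scores; the function names (`beamforming_gain`, `top_k_mask`, `apply_masked_action`) and all numeric values are illustrative assumptions. In the paper, the selection is learned by the agent rather than fixed by hand.

```python
import cmath
import math


def beamforming_gain(phases, channel):
    """Gain |w^H h|^2 for a unit-norm beam with w_m = exp(j*phase_m) / sqrt(M)."""
    m = len(phases)
    inner = sum(cmath.exp(-1j * p) * h for p, h in zip(phases, channel)) / math.sqrt(m)
    return abs(inner) ** 2


def top_k_mask(scores, k):
    """Binary mask keeping the k highest-scoring action dimensions.

    Assumption: a stand-in for a learned selector; the scores are hypothetical.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(order[:k])
    return [1 if i in keep else 0 for i in range(len(scores))]


def apply_masked_action(phases, action, mask):
    """Update only the selected phase shifters; other dimensions stay unchanged."""
    return [p + a * m for p, a, m in zip(phases, action, mask)]


# Toy setup: M = 8 antennas, line-of-sight channel with a linear phase ramp.
M = 8
channel = [cmath.exp(1j * 0.4 * n) for n in range(M)]
phases = [0.0] * M

# A full action that would align every antenna with the channel.
action = [0.4 * n for n in range(M)]

# Hypothetical per-dimension relevance scores for the current state.
scores = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.6, 0.5]
mask = top_k_mask(scores, k=4)
new_phases = apply_masked_action(phases, action, mask)

gain_before = beamforming_gain(phases, channel)
gain_masked = beamforming_gain(new_phases, channel)
gain_full = beamforming_gain(action, channel)  # perfectly aligned beam
print(mask, gain_before, gain_masked, gain_full)
```

In this toy setup the mask restricts the agent's update to four of the eight phase shifters, so the exploration step touches a smaller slice of the action space. Whether that slice is the right one depends on how well the relevance scores reflect the true effect of each dimension on the reward.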

Paper Structure (25 sections, 14 equations, 9 figures, 4 tables, 1 algorithm)

INTRODUCTION
BACKGROUND
PROBLEM FORMULATION
DRL BEAM CODEBOOK LEARNING FOR JCAS
COMMUNICATION BEAM DESIGN
STAGE 1
STAGE 2
STAGE 3
SENSING BEAM DESIGN
STAGE 1
STAGE 2
STAGE 3
CAUSAL DRL FOR BEAM PATTERN DESIGN
CURRICULUM FOR HIGH DIMENSIONAL ACTION SELECTION
ITERATIVE SELECTION OF ACTIONS
...and 10 more sections

Figures (9)

Figure 1: An mMIMO base station with an antenna array of $M$ antennas using a beam codebook to serve communication users and sense targets in its vicinity.
Figure 2: The proposed beam codebook design framework for mMIMO JCAS using DRL. The codebook design framework for communication was presented in [Zhang2021]. We extend that model to JCAS in our proposed framework by adding beam design stages for sensing.
Figure 3: Block diagram of the INVASE architecture present in each DRL agent for beam pattern learning [Sun2022]. The capture of cause-effect relationships can also be augmented by domain knowledge.
Figure 4: Top view of the dynamic scenario for training the deep reinforcement learning agents [DeepMIMOwebsite]. The active base station and user grid for generating the training data are indicated with red boxes.
Figure 5: Average episodic beamforming gain for communication beam in scene 0.
...and 4 more figures
