Digital Twin Enhanced Deep Reinforcement Learning for Intelligent Omni-Surface Configurations in MU-MIMO Systems

Xiaowen Ye; Xianghao Yu; Liqun Fu

TL;DR

This work tackles real-time IOS configuration in IOS-assisted MU-MIMO systems without relying on perfect channel state information (CSI) for every sub-channel or on knowledge of user equipment (UE) mobility. It introduces DeepIOS, a DRL-based IOS controller formulated as a POMDP, whose action branch architecture decides the phase shift and the amplitude in parallel. A digital twin module, built via supervised learning, pre-validates IOS settings and accelerates training, enabling rapid policy improvement through three closed loops that couple the digital and physical spaces. Compared with random and MAB baselines, DeepIOS achieves a higher sum rate, converges faster, and is more robust across varying channel conditions, and the digital twin substantially reduces run-time. The framework lays the groundwork for real-time, model-free IOS control and points to future work on joint BS and IOS beamforming and broader deployment in dynamic wireless networks.

Abstract

Intelligent omni-surface (IOS) is a promising technique for enhancing the capacity of wireless networks by reflecting and refracting the incident signal simultaneously. Traditional IOS configuration schemes, which rely on the channel state information of all sub-channels and on the mobility of user equipments, are difficult to implement in complex realistic systems. Existing works attempt to address this issue by employing deep reinforcement learning (DRL), but this method requires many trial-and-error interactions with the external environment to produce efficient results and thus cannot satisfy real-time decision-making requirements. To enable model-free and real-time IOS control, this paper puts forth a new framework that integrates DRL and digital twins. DeepIOS, a DRL-based IOS configuration scheme that aims to maximize the sum data rate, is first developed to jointly optimize the phase shift and amplitude of the IOS in multi-user multiple-input multiple-output (MU-MIMO) systems. Thereafter, to further reduce the computational complexity, DeepIOS introduces an action branch architecture, which decides the two optimization variables separately and in parallel. Finally, a digital twin module is constructed through supervised learning as a pre-verification platform for DeepIOS, so that real-time decision-making can be guaranteed. The formulated framework is a closed-loop system, in which the physical space provides data to establish and calibrate the digital space, while the digital space generates experience samples for DeepIOS training and sends the trained parameters to the IOS controller for configuration. Numerical results show that, compared with random and multi-armed bandit (MAB) schemes, the proposed framework attains a higher data rate and is more robust to different settings. Furthermore, the action branch architecture reduces the computational complexity of DeepIOS, and the digital twin module improves the convergence speed and run-time.
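
The abstract does not spell out how the action branch architecture is built, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the phase shift and the amplitude are each quantized into a small set of levels and that each branch produces its own vector of Q-values; the function names, the level counts, and the Q-values are all hypothetical. Under these assumptions, a joint output head needs one entry per (phase, amplitude) pair, whereas two parallel branches need only one entry per level of each variable.

```python
import random


def joint_head_size(n_phase, n_amp):
    # A single joint head scores every (phase, amplitude) pair.
    return n_phase * n_amp


def branched_head_size(n_phase, n_amp):
    # Two parallel branches score each variable on its own.
    return n_phase + n_amp


def argmax(values):
    # Index of the largest entry (first one on ties).
    return max(range(len(values)), key=values.__getitem__)


def select_branched_action(q_phase, q_amp, epsilon=0.0, rng=random):
    """Epsilon-greedy choice made independently in each branch.

    q_phase and q_amp are hypothetical per-branch Q-value lists.
    Returns (phase_index, amplitude_index).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_phase)), rng.randrange(len(q_amp))
    return argmax(q_phase), argmax(q_amp)


# Hypothetical Q-values for 4 phase-shift levels and 3 amplitude levels.
q_phase = [0.1, 0.7, 0.3, 0.2]
q_amp = [0.5, 0.9, 0.4]

print(select_branched_action(q_phase, q_amp))  # (1, 1)
print(joint_head_size(4, 3), branched_head_size(4, 3))  # 12 7
```

The gap between the two head sizes widens as the number of levels grows, which is one plausible reading of how deciding the two variables in parallel lowers the computational complexity.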
