Foundation Model-Aided Hierarchical Deep Reinforcement Learning for Blockage-Aware Link in RIS-Assisted Networks
Mohammad Ghassemi, Han Zhang, Ali Afana, Akram Bin Sediq, Melike Erol-Kantarci
TL;DR
This work tackles blockage and high-dimensional control challenges in RIS-assisted wireless networks by integrating a fine-tuned large wireless model (LWM) with a two-level hierarchical deep reinforcement learning (HDRL) agent. The FM-HDRL framework uses LWM-derived low-dimensional channel embeddings to inform a meta-controller (high-level mode selection) and a sub-controller (low-level beamforming and RIS phase shifts) that jointly maximize spectral efficiency (SE). Key contributions include fine-tuning an open-source LWM for channel representation, designing a two-tier HDRL that aligns with slow and fast channel dynamics, and demonstrating faster convergence and higher SE than FM-DRL and beam sweeping, with favorable scalability as RIS size grows. The approach shows potential for deployment in centralized RAN controllers, enabling efficient, channel-aware RIS optimization in dynamic 6G environments.
Abstract
Reconfigurable intelligent surface (RIS) technology has the potential to significantly enhance the spectral efficiency (SE) of 6G wireless networks. However, practical deployment remains constrained by challenges in accurate channel estimation and control optimization under dynamic conditions. This paper presents a foundation model-aided hierarchical deep reinforcement learning (FM-HDRL) framework designed for joint beamforming and phase-shift optimization in RIS-assisted wireless networks. We first fine-tune a pre-trained large wireless model (LWM) to translate raw channel data into low-dimensional, context-aware channel state information (CSI) embeddings. Next, these embeddings are combined with user location information and blockage status to select the optimal communication path. The resulting features are then fed into an HDRL model, assumed to be implemented at a centralized controller, which jointly optimizes the base station (BS) beamforming vectors and the RIS phase-shift configurations to maximize SE. Simulation results demonstrate that the proposed FM-HDRL framework consistently outperforms baseline methods in terms of convergence speed, SE, and scalability. Specifically, the proposed method improves SE by 7.82% over the FM-aided deep reinforcement learning (FM-DRL) approach and by about 48.66% over the beam sweeping approach.
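As a rough illustration of the quantity being maximized, the sketch below computes SE for one link under a blockage flag. It is not the authors' implementation: it assumes a single-antenna user, a narrowband channel, and the common textbook RIS model in which the effective channel is the direct path plus the cascaded BS-RIS-user path. The function name and all parameter names are hypothetical.

```python
import numpy as np

def spectral_efficiency(h_d, h_r, G, theta, w, noise_power, blocked):
    """Return SE in bit/s/Hz for one single-antenna user (illustrative model).

    h_d: (M,) direct BS-user channel      h_r: (N,) RIS-user channel
    G:   (N, M) BS-RIS channel            theta: (N,) RIS phase shifts [rad]
    w:   (M,) BS beamforming vector       blocked: True if the direct path is blocked
    """
    phi = np.exp(1j * theta)            # unit-modulus RIS reflection coefficients
    h_ris = (h_r.conj() * phi) @ G      # cascaded channel h_r^H diag(phi) G, shape (M,)
    # Assumption: a blocked direct path contributes nothing to the received signal.
    h_eff = h_ris if blocked else h_d.conj() + h_ris
    snr = np.abs(h_eff @ w) ** 2 / noise_power
    return np.log2(1.0 + snr)

# Toy example with M = 2 BS antennas and N = 2 RIS elements.
h_d = np.array([1.0, 0.0], dtype=complex)
h_r = np.array([1.0, 1.0], dtype=complex)
G = np.eye(2, dtype=complex)
theta = np.zeros(2)
w = np.array([1.0, 0.0], dtype=complex)

se_clear = spectral_efficiency(h_d, h_r, G, theta, w, 1.0, blocked=False)
se_blocked = spectral_efficiency(h_d, h_r, G, theta, w, 1.0, blocked=True)
```

In this toy setting, blocking the direct path lowers SE because only the reflected path remains. A learning agent of the kind described in the abstract would choose `w` and `theta` to raise this value.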
