DRL-Based Spectrum Sharing for RIS-Aided Local High-Quality Wireless Networks

Hamid Reza Hashempour, Mina Khadem, Eduard A. Jorswieck

Abstract

This paper investigates a smart spectrum-sharing framework for reconfigurable intelligent surface (RIS)-aided local high-quality wireless networks (LHQWNs) within a mobile network operator (MNO) ecosystem. Although RISs are often considered potentially harmful because of the interference they can cause, this work shows that properly controlled RISs can enhance the quality of service (QoS). The proposed system enables temporary spectrum access for multiple vertical service providers (VSPs) by dynamically allocating radio resources according to traffic demand. The spectrum is divided into dedicated subchannels assigned to individual VSPs and reusable subchannels shared among multiple VSPs, while an RIS is employed to improve propagation conditions. We formulate a multi-VSP utility maximization problem that jointly optimizes subchannel assignment, transmit power, and RIS phase configuration while accounting for spectrum access costs, RIS leasing costs, and QoS constraints. The resulting mixed-integer non-linear program (MINLP) is intractable for conventional optimization methods. To address this challenge, the problem is modeled as a Markov decision process (MDP) and solved using deep reinforcement learning (DRL). Specifically, deep deterministic policy gradient (DDPG) and soft actor-critic (SAC) algorithms are developed and compared. Simulation results show that SAC outperforms DDPG in convergence speed, stability, and achievable utility, reaching up to 96% of the exhaustive search benchmark and demonstrating the potential of RIS to improve overall utility in multi-VSP scenarios.

Paper Structure

This paper contains 38 sections, 59 equations, 8 figures, 3 tables, and 2 algorithms.

Figures (8)

  • Figure 1: Functional use case for the integration of local high-quality wireless networks into an MNO ecosystem as service network areas.
  • Figure 2: System model of an RIS-assisted multi-VSP wireless network within an MNO ecosystem with dedicated and reusable subchannels.
  • Figure 3: Simulation geometry for a realization with $|\mathcal{B}_v|=2$, $K_v=3$, and $J=1$. Users, BSs, and the RIS are randomly distributed within each VSP region.
  • Figure 4: Convergence comparison of SAC and DDPG against the EDS benchmark in terms of the average sum utility of the VSPs for $M_1=4$ and $M_1=16$. Solid curves denote the mean reward over different seeds.
  • Figure 5: Convergence comparison of DRL under different configurations of reusable and dedicated subchannels for a fixed number of subchannels per VSP, $C_v=4$.
  • ...and 3 more figures

Theorems & Definitions (1)

  • Remark