Table of Contents
Fetching ...

Trustworthy AI-Driven Dynamic Hybrid RIS: Joint Optimization and Reward Poisoning-Resilient Control in Cognitive MISO Networks

Deemah H. Tashman, Soumaya Cherkaoui

Abstract

Cognitive radio networks (CRNs) are a key mechanism for alleviating spectrum scarcity by enabling secondary users (SUs) to opportunistically access licensed frequency bands without harmful interference to primary users (PUs). To address unreliable direct SU links and energy constraints common in next-generation wireless networks, this work introduces an adaptive, energy-aware hybrid reconfigurable intelligent surface (RIS) for underlay multiple-input single-output (MISO) CRNs. Distinct from prior approaches relying on static RIS architectures, our proposed RIS dynamically alternates between passive and active operation modes in real time according to harvested energy availability. We also model our scenario under practical hardware impairments and cascaded fading channels. We formulate and solve a joint transmit beamforming and RIS phase optimization problem via the soft actor-critic (SAC) deep reinforcement learning (DRL) method, leveraging its robustness in continuous and highly dynamic environments. Notably, we conduct the first systematic study of reward poisoning attacks on DRL agents in RIS-enhanced CRNs, and propose a lightweight, real-time defense based on reward clipping and statistical anomaly filtering. Numerical results demonstrate that the SAC-based approach consistently outperforms established DRL baselines, and that the dynamic hybrid RIS strikes a superior trade-off between throughput and energy consumption compared to fully passive and fully active alternatives. We further show the effectiveness of our defense in maintaining SU performance even under adversarial conditions. Our results advance the practical and secure deployment of RIS-assisted CRNs, and highlight crucial design insights for energy-constrained wireless systems.

Trustworthy AI-Driven Dynamic Hybrid RIS: Joint Optimization and Reward Poisoning-Resilient Control in Cognitive MISO Networks

Abstract

Cognitive radio networks (CRNs) are a key mechanism for alleviating spectrum scarcity by enabling secondary users (SUs) to opportunistically access licensed frequency bands without harmful interference to primary users (PUs). To address unreliable direct SU links and energy constraints common in next-generation wireless networks, this work introduces an adaptive, energy-aware hybrid reconfigurable intelligent surface (RIS) for underlay multiple-input single-output (MISO) CRNs. Distinct from prior approaches relying on static RIS architectures, our proposed RIS dynamically alternates between passive and active operation modes in real time according to harvested energy availability. We also model our scenario under practical hardware impairments and cascaded fading channels. We formulate and solve a joint transmit beamforming and RIS phase optimization problem via the soft actor-critic (SAC) deep reinforcement learning (DRL) method, leveraging its robustness in continuous and highly dynamic environments. Notably, we conduct the first systematic study of reward poisoning attacks on DRL agents in RIS-enhanced CRNs, and propose a lightweight, real-time defense based on reward clipping and statistical anomaly filtering. Numerical results demonstrate that the SAC-based approach consistently outperforms established DRL baselines, and that the dynamic hybrid RIS strikes a superior trade-off between throughput and energy consumption compared to fully passive and fully active alternatives. We further show the effectiveness of our defense in maintaining SU performance even under adversarial conditions. Our results advance the practical and secure deployment of RIS-assisted CRNs, and highlight crucial design insights for energy-constrained wireless systems.

Paper Structure

This paper contains 21 sections, 10 equations, 12 figures, 4 tables, 2 algorithms.

Figures (12)

  • Figure 1: The system model.
  • Figure 2: SUs' sum rate for different RIS reflecting elements $(R)$. $A=4$, $B=4$, $P_{t}=30$ dB, $I=20$ dB, and $\kappa_s=1$, $\kappa_b=1$.
  • Figure 3: SUs' sum rate for different cascade levels $(\kappa)$, SU transmitter antennas ($A$), and SU receivers $(B)$. $R=30$, $P_{t}=30$ dB, $I=20$ dB, and $\kappa_s=\kappa_b=\kappa$.
  • Figure 4: SUs' average rate (reward) for different values of the RIS minimum reflection amplitude $(\beta_m)$ and the maximum transmission power available at the SU transmitter $(P_t)$. $\kappa_s=\kappa_b=1$, and $I=20$ dB.
  • Figure 5: A comparison between the SUs' average rate for the SAC, TD3, DDPG, and Random policies. $P_{t}=1$ dB, $\kappa_s=\kappa_b=1$, $I=1$ dB, and $\beta_m=0.6$.
  • ...and 7 more figures