Energy-Efficient Irregular RIS-aided UAV-Assisted Optimization: A Deep Reinforcement Learning Approach
Mahmoud M. Salim, Khaled M. Rabie, Ali H. Muqaibel
TL;DR
This work tackles energy-efficient UAV-assisted communication using an irregular RIS whose elements can be switched ON or OFF to support energy harvesting. It proposes a Hybrid Energy-Harvesting Resource Allocation (HERA) strategy that combines irregular RIS element selection with a nonlinear time-switching RF energy-harvesting model and renewable energy arrivals, formulated as a non-convex MINLP. To solve this problem, the authors develop EE-DDPG, a dual-actor, dual-critic DRL framework with action clipping and softmax-weighted Q-value estimation that adapts to user mobility and hardware impairments. Simulation results show EH efficiency of up to 81.5% in single-user and 73.2% in multi-user scenarios, and EE-DDPG outperforms baseline DRL methods. The approach offers a scalable way to extend UAV mission duration while maintaining QoS in dynamic wireless environments.
Abstract
Reconfigurable intelligent surfaces (RISs) enhance unmanned aerial vehicle (UAV)-assisted communication by extending coverage, improving efficiency, and enabling adaptive beamforming. This paper investigates a multiple-input single-output system where a base station (BS) communicates with multiple single-antenna users through a UAV-assisted RIS, dynamically adapting to user mobility to maintain seamless connectivity. To extend UAV-RIS operational time, we propose a hybrid energy-harvesting resource allocation (HERA) strategy that leverages the irregular RIS ON/OFF capability while adapting to BS-RIS and RIS-user channels. The HERA strategy dynamically allocates resources by integrating non-linear radio frequency energy harvesting (EH) based on the time-switching (TS) approach and renewable energy as a complementary source. A non-convex mixed-integer nonlinear programming problem is formulated to maximize EH efficiency while satisfying quality-of-service, power, and energy constraints under channel state information and hardware impairments. The optimization jointly considers BS transmit power, RIS phase shifts, TS factor, and RIS element selection as decision variables. To solve this problem, we introduce the energy-efficient deep deterministic policy gradient (EE-DDPG) algorithm. This deep reinforcement learning (DRL)-based approach integrates action clipping and softmax-weighted Q-value estimation to mitigate estimation errors. Simulation results demonstrate that the proposed HERA method significantly improves EH efficiency, reaching up to 81.5% and 73.2% in single-user and multi-user scenarios, respectively, contributing to extended UAV operational time. Additionally, the proposed EE-DDPG model outperforms existing DRL algorithms while maintaining practical computational complexity.
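The abstract names two mechanisms used in EE-DDPG, action clipping and softmax-weighted Q-value estimation, without giving their exact form. The following is a minimal Python sketch of the generic versions of these two ideas, not the authors' implementation. The function names `clip_action` and `softmax_weighted_q`, the temperature parameter `beta`, and the action bounds are assumptions introduced only for illustration.

```python
import math
from typing import List, Sequence


def clip_action(action: Sequence[float], low: float, high: float) -> List[float]:
    """Clamp every action component to the interval [low, high].

    Hypothetical helper: in a DDPG-style agent this keeps noisy actor
    outputs (e.g. a transmit-power or time-switching value) inside
    their feasible range.
    """
    return [min(max(a, low), high) for a in action]


def softmax_weighted_q(q_values: Sequence[float], beta: float = 1.0) -> float:
    """Return the softmax-weighted average of candidate Q-values.

    Each Q-value is weighted by exp(beta * Q) / sum(exp(beta * Q)).
    beta = 0 gives the plain mean; a large beta approaches the max.
    `beta` is an assumed temperature parameter, not taken from the paper.
    """
    if not q_values:
        raise ValueError("q_values must not be empty")
    q_max = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (q - q_max)) for q in q_values]
    total = sum(weights)
    return sum(w * q for w, q in zip(weights, q_values)) / total


if __name__ == "__main__":
    # Assumed example: raw actor output clipped to [-1, 1].
    print(clip_action([-2.0, 0.3, 1.7], -1.0, 1.0))
    # Assumed example: Q-values of several candidate actions.
    print(softmax_weighted_q([1.0, 2.0, 3.0], beta=1.0))
```

Because the softmax-weighted value always lies between the mean and the maximum of the candidate Q-values, it is a softer target than a hard max, which is one common motivation for using it to reduce estimation error in critic targets.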
