Deep Reinforcement Learning for Joint Time and Power Management in SWIPT-EH CIoT
Nadia Abdolkhani, Nada Abdel Khalek, Walaa Hamouda, Iyad Dayoub
TL;DR
The paper addresses joint time-switching (TS) and transmit power control for SWIPT-enabled cognitive IoT (CIoT) devices in an underlay cognitive radio setting. The problem is formulated as a Markov decision process (MDP) and solved with model-free deep reinforcement learning (DRL). The authors propose a lightweight double deep Q-network (DDQN) with upper confidence bound (UCB) exploration that learns policies adapting the TS ratio and transmit power in real time. The method balances long-term throughput against energy and interference constraints and outperforms existing DRL baselines in simulations. By improving throughput and reliability under realistic energy harvesting (EH) and spectrum-sharing conditions, the approach makes energy-constrained CIoT deployments more viable.
Abstract
This letter presents a novel deep reinforcement learning (DRL) approach for joint time allocation and power control in a cognitive Internet of Things (CIoT) system with simultaneous wireless information and power transfer (SWIPT). The CIoT transmitter autonomously manages energy harvesting (EH) and transmissions using a learnable time switching factor while optimizing transmit power to enhance throughput and lifetime. The joint optimization is modeled as a Markov decision process under small-scale fading, realistic EH, and interference constraints. We develop a double deep Q-network (DDQN) enhanced with upper confidence bound (UCB) exploration. In simulations, our approach outperforms existing DRL methods.
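The two ingredients named in the abstract, UCB exploration and the double-DQN target, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the exploration constant `c`, the discount `gamma`, and the discrete action grid of (time switching factor, power level) pairs are all assumptions chosen for illustration, and plain Python lists stand in for the outputs of the online and target networks.

```python
import math

def ucb_select(q_values, counts, t, c=2.0):
    """Pick the action maximizing Q(a) + c * sqrt(ln(t) / N(a)).

    q_values: estimated Q-value per action (e.g. online-network output)
    counts:   number of times each action has been selected so far
    t:        total number of selections so far (assumed >= 1)
    c:        exploration constant (assumed value, not from the paper)
    """
    # Try every action once first; this also avoids dividing by zero.
    for a, n in enumerate(counts):
        if n == 0:
            return a
    scores = [q + c * math.sqrt(math.log(t) / n)
              for q, n in zip(q_values, counts)]
    return max(range(len(scores)), key=scores.__getitem__)

def ddqn_target(reward, q_online_next, q_target_next, gamma=0.99, done=False):
    """Double-DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    a_star = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[a_star]

# Hypothetical discrete action grid: (time switching factor, power level).
# The actual discretization used in the letter is not given in this summary.
ACTIONS = [(tau, p) for tau in (0.2, 0.5, 0.8) for p in (0.1, 0.5, 1.0)]
```

In an agent loop, `ucb_select` would replace the usual epsilon-greedy choice over `ACTIONS`, and `ddqn_target` would supply the regression target for the online network's Q-value at the chosen action. Decoupling action selection from evaluation is what distinguishes double DQN from standard DQN and reduces overestimation of Q-values.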
