Deep Reinforcement Learning for Joint Time and Power Management in SWIPT-EH CIoT

Nadia Abdolkhani, Nada Abdel Khalek, Walaa Hamouda, Iyad Dayoub

TL;DR

The paper addresses joint time-switching (TS) and transmit power control for SWIPT-enabled CIoT in an underlay cognitive radio setting, formulated as an MDP and solved with model-free learning. It proposes a lightweight DDQN with UCB exploration to learn policies that adapt the TS ratio and transmit power in real time. The method balances long-term throughput against energy and interference constraints and outperforms existing DRL baselines in simulations. This makes energy-constrained CIoT deployments more viable by improving throughput and reliability under realistic EH and spectrum-sharing conditions.
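To make the joint decision concrete, the sketch below shows one way a value-based agent could treat each (TS ratio, transmit power) pair as a single discrete action. It is an illustration only: the grid values, the helper names `build_action_grid` and `split_slot`, and the convention that the TS ratio is the fraction of a slot spent harvesting are assumptions, not details taken from the paper.

```python
from itertools import product


def build_action_grid(ts_ratios, power_levels):
    """Enumerate every (TS ratio, transmit power) pair as one discrete action."""
    return list(product(ts_ratios, power_levels))


def split_slot(ts_ratio, slot_duration):
    """Split one slot into harvesting time and transmission time.

    Assumes ts_ratio is the fraction of the slot used for energy harvesting.
    """
    if not 0.0 <= ts_ratio <= 1.0:
        raise ValueError("ts_ratio must lie in [0, 1]")
    harvest_time = ts_ratio * slot_duration
    return harvest_time, slot_duration - harvest_time


# Hypothetical grids, chosen only for illustration.
actions = build_action_grid([0.0, 0.25, 0.5, 0.75], [0.1, 0.5, 1.0])
```

With this encoding a Q-network needs one output per entry of `actions`, so finer grids trade control resolution against a larger output layer.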

Abstract

This letter presents a novel deep reinforcement learning (DRL) approach for joint time allocation and power control in a cognitive Internet of Things (CIoT) system with simultaneous wireless information and power transfer (SWIPT). The CIoT transmitter autonomously manages energy harvesting (EH) and transmissions using a learnable time switching factor while optimizing power to enhance throughput and lifetime. The joint optimization is modeled as a Markov decision process under small-scale fading, realistic EH, and interference constraints. We develop a double deep Q-network (DDQN) enhanced with an upper confidence bound. Simulations benchmark our approach, showing superior performance over existing DRL methods.
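The two ingredients named in the abstract can be sketched in their textbook forms. The code below shows a standard UCB action rule, which adds an exploration bonus to each Q-value, and the standard double-DQN target, in which the online network picks the next action and the target network evaluates it. The function names, the per-action visit counts, and the default values of `c` and `gamma` are assumptions for illustration; the exact variant used in the paper may differ.

```python
import math


def ucb_action(q_values, counts, step, c=2.0):
    """Return argmax_a of Q(s, a) + c * sqrt(ln(step) / N(a)).

    Any action that has never been tried (count 0) is selected first.
    """
    for action, n in enumerate(counts):
        if n == 0:
            return action
    scores = [
        q + c * math.sqrt(math.log(step) / n)
        for q, n in zip(q_values, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double-DQN target: select with the online net, evaluate with the target net."""
    if done:
        return reward
    best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best]
```

Decoupling action selection from action evaluation is what distinguishes the double-DQN target from the vanilla DQN target, which takes the maximum over the target network's own estimates and tends to overestimate Q-values.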

Paper Structure

This paper contains 8 sections, 6 equations, 2 figures, 1 algorithm.

Figures (2)

  • Figure 1: The proposed DDQN-UCB algorithm
  • Figure 2: (a) ASR performance of the proposed DDQN-UCB strategy benchmarked against existing strategies in the literature [tashma_drlIoT_2024_Nada_NadiaZarif_2022_dueling_vanilla_DQN]; (b) effect of varying the number of slots occupied by the PU, $L$, and the number of time slots, $T$, on the proposed DDQN-UCB strategy; (c) impact of varying the initial battery level $B_0$ and the time-slot duration $\tau$ on the proposed DDQN-UCB strategy.