Reinforcement Learning for Secrecy Optimization in Underwater Energy Harvesting Relay Network

Shalini Tripathi; Ankur Bansal; Chinmoy Kundu

TL;DR

Simulation results show that the RL-based OPA adapts effectively to battery dynamics, varying channel conditions, and optical link availability, achieving the highest secure data transmission, while GA performs reasonably and NA performs poorly because of its short-sighted decisions.

Abstract

This paper explores secure communication in an underwater energy-harvesting (EH) relay network that supports hybrid optical-acoustic transmission. The optical hop is modeled as a Gamma-Gamma turbulence channel with pointing errors and may occasionally be blocked by underwater obstacles. An eavesdropper is assumed to monitor the acoustic hop, which creates a secrecy concern. To address this, we formulate the relay power allocation problem as an infinite-horizon Markov decision process (MDP). We propose a model-based, reinforcement learning (RL)-driven optimal power allocation (OPA) strategy that maximizes long-term cumulative secrecy performance until the network stops functioning. As lower-complexity alternatives, we also develop a Greedy Algorithm (GA) and a Naive Algorithm (NA). Simulation results show that the RL-based OPA adapts effectively to battery dynamics, varying channel conditions, and optical link availability, achieving the highest secure data transmission, while GA performs reasonably and NA performs poorly because of its short-sighted decisions.
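To make the MDP framing concrete, the following is a minimal sketch of value iteration on a toy battery-state MDP. It is not the authors' algorithm or system model: the reward form, the channel gains `g_d` and `g_e`, the harvesting parameters, the blockage probability `p_block`, and all function names are hypothetical choices made only for illustration. The state is the relay battery level, the action is the number of energy units spent in a slot, and the reward is a simplified secrecy rate scaled by the probability that the optical hop is unblocked.

```python
import math


def secrecy_reward(a, g_d=2.0, g_e=0.5):
    """Toy per-slot secrecy rate for spending `a` energy units.

    Assumed form: legitimate-link rate minus eavesdropper-link rate,
    floored at zero. The gains g_d and g_e are hypothetical.
    """
    return max(0.0, math.log2(1 + g_d * a) - math.log2(1 + g_e * a))


def value_iteration(b_max=5, p_harvest=0.5, e_harvest=2, p_block=0.2,
                    gamma=0.9, tol=1e-9):
    """Solve the toy battery MDP and return (values, policy).

    values[b] is the expected total discounted reward from battery level b;
    policy[b] is the number of energy units to spend at that level.
    """
    values = [0.0] * (b_max + 1)
    while True:
        new_values, policy = [], []
        for b in range(b_max + 1):
            best_q, best_a = -math.inf, 0
            for a in range(b + 1):  # cannot spend more than is stored
                # Expected immediate reward: zero when the optical hop is blocked.
                reward = (1 - p_block) * secrecy_reward(a)
                b_no = b - a                            # no energy harvested
                b_yes = min(b_max, b - a + e_harvest)   # harvested, capped
                q = reward + gamma * (p_harvest * values[b_yes]
                                      + (1 - p_harvest) * values[b_no])
                if q > best_q + 1e-12:  # ties keep the smaller action
                    best_q, best_a = q, a
            new_values.append(best_q)
            policy.append(best_a)
        delta = max(abs(n - v) for n, v in zip(new_values, values))
        values = new_values
        if delta < tol:
            return values, policy


values, policy = value_iteration()
```

In this sketch, a policy that always spends the whole battery would correspond loosely to a short-sighted strategy, whereas the value-iteration policy weighs the immediate secrecy reward against the value of energy kept for later slots.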

Paper Structure (14 sections, 19 equations, 4 figures, 2 algorithms)

Introduction
System Model
EH Model
UWO Channel Model for the $SR$ Link
UWA Channel Model for $RD$ and $RE$ Links
Performance Metric
Problem Formulation
Proposed Solutions
Optimal Power Allocation (OPA)
Greedy Algorithm (GA)
Naive Algorithm (NA)
Complexity Analysis
Results and Discussions
Conclusion

Figures (4)

Figure 1: System model for the considered underwater system.
Figure 2: Expected total discounted reward versus discount factor $\Gamma$ for different obstacle densities $T_o$.
Figure 3: Expected total discounted reward versus EH probability $p$ when $E_R$ increases from $2$ to $4$.
Figure 4: Expected total discounted reward versus battery capacity $B_{R}^{\textrm{max}}$ when $l_{RE}$ changes from $5$ km to $6$ km.
