Multi-AUV Trajectory Learning for Sustainable Underwater IoT with Acoustic Energy Transfer

Mohamed Afouene Melki, Mohammad Shehab, Mohamed-Slim Alouini

Abstract

The Internet of Underwater Things (IoUT) supports ocean sensing and offshore monitoring but requires coordinated mobility and energy-aware communication to sustain long-term operation. This letter proposes a multi-AUV framework that jointly addresses trajectory control and acoustic communication for sustainable IoUT operation. The problem is formulated as a Markov decision process that integrates continuous AUV kinematics, propulsion-aware energy consumption, acoustic energy transfer feasibility, and Age of Information (AoI) regulation. A centralized deep reinforcement learning policy based on Proximal Policy Optimization (PPO) is developed to coordinate multiple AUVs under docking and safety constraints. The proposed approach is evaluated against structured heuristic baselines and demonstrates significant reductions in average AoI while improving fairness and data collection efficiency. Results show that cooperative multi-AUV control provides scalable performance gains as the network size increases.

Paper Structure

This paper contains 19 sections, 25 equations, 4 figures, and 2 tables.

Figures (4)

  • Figure 1: PPO-based interaction between AUVs and the environment.
  • Figure 2: Performance comparison of the proposed and benchmark schemes for different network sizes.
  • Figure 3: AUV trajectories and total collected data for a network with 7 IoUT nodes under different scheduling strategies.
  • Figure 4: AUV speed and heading evolution for a network with 7 IoUT nodes underand PPO-based RL scheduling using a single AUV.