Multi-Objective-Optimization Multi-AUV Assisted Data Collection Framework for IoUT Based on Offline Reinforcement Learning

Yimian Ding, Xinqi Wang, Jingzehua Xu, Guanwen Xie, Weiyi Liu, Yi Li

TL;DR

This work tackles IoUT data collection in turbulent ocean environments by formulating a multi-objective, multi-AUV problem and modeling it as a Markov decision process (MDP). It introduces a semi-communication decentralized training with decentralized execution (SC-DTDE) paradigm and the Multi-Agent Independent Conservative Q-Learning (MAICQL) algorithm to train multiple AUVs offline from expert data. The approach improves data rate, VoI, and energy efficiency while ensuring collision avoidance, and it remains robust to noise; in the tested scenarios, a two-AUV configuration performed best. The framework offers a practical path toward robust, data-efficient underwater sensing with coordinated AUV fleets in challenging environments, with future work directed at sim2real validation.

Abstract

The Internet of Underwater Things (IoUT) offers significant potential for ocean exploration but encounters challenges due to dynamic underwater environments and severe signal attenuation. Current methods that rely on Autonomous Underwater Vehicles (AUVs) trained with online reinforcement learning (RL) incur high computational costs and low data utilization. To address these issues and the constraints of turbulent ocean environments, we propose a multi-AUV assisted data collection framework for IoUT based on multi-agent offline RL. Using environmental and equipment status data, this framework maximizes data rate and the value of information (VoI), minimizes energy consumption, and ensures collision avoidance. We introduce a semi-communication decentralized training with decentralized execution (SC-DTDE) paradigm and a multi-agent independent conservative Q-learning algorithm (MAICQL) to tackle the problem. Extensive simulations demonstrate the high applicability, robustness, and data collection efficiency of the proposed framework.

Paper Structure

This paper contains 18 sections, 27 equations, 5 figures, 2 tables, and 1 algorithm.

Figures (5)

  • Figure 1: Illustration of multi-AUV assisted IoUT data collection system.
  • Figure 2: The curves of the cumulative reward, sum data rate, and sum VoI under different noise intensities: (a) Cumulative reward. (b) Sum data rate. (c) Sum VoI.
  • Figure 3: Performance comparison of MAISAC, BC, GAIL and MAICQL algorithms.
  • Figure 4: Trajectories of AUVs for the data collection task in the turbulence-free environment.
  • Figure 5: Trajectories of AUVs for the data collection task in the turbulent environment.