Multi-AUV Cooperative Underwater Multi-Target Tracking Based on Dynamic-Switching-enabled Multi-Agent Reinforcement Learning
Shengbo Wang, Chuan Lin, Guangjie Han, Shengchao Zhu, Zhixian Li, Zhenyu Wang, Yunpeng Ma
TL;DR
The paper tackles the challenge of robust multi-target tracking by AUV swarms in dynamically varying underwater environments with limited communication. It introduces a hierarchical software-defined multi-AUV reinforcement learning (HSARL) architecture and a Dynamic-Switching-Based MARL (DSBM) algorithm that integrates Dynamic-Switching Attention, Dynamic-Switching Resampling, and reward reshaping within an SDN-enabled MDP setting to enhance learning efficiency and tracking accuracy. AUV formation is managed via ASMA, a fuzzy-rule-augmented method that assigns ET-AUVs to targets amid ocean-current interference, boosting scalability and robustness. Evaluations in 3D simulations show that DSBM achieves faster convergence and higher tracking accuracy than several baselines, with reward reshaping further accelerating early learning and ASMA improving formation allocation under currents, indicating practical potential for scalable, resilient underwater surveillance and tracking systems.
Abstract
In recent years, autonomous underwater vehicle (AUV) swarms have become increasingly popular and are widely used in ocean exploration and underwater tracking. In this paper, we propose a multi-AUV cooperative underwater multi-target tracking algorithm that takes realistic underwater factors into account. We first present a general modelling approach for underwater sonar-based detection and for the interference of ocean currents on the target tracking process. Then, based on software-defined networking (SDN), we regard the AUV swarm as an underwater ad-hoc network and propose a hierarchical software-defined multi-AUV reinforcement learning (HSARL) architecture. Based on the proposed HSARL architecture, we propose the "Dynamic-Switching" mechanism, which comprises the "Dynamic-Switching Attention" and "Dynamic-Switching Resampling" mechanisms; these accelerate the HSARL algorithm's convergence and effectively prevent it from getting stuck in a local optimum. Additionally, we introduce a reward reshaping mechanism to further accelerate the convergence of the proposed HSARL algorithm in the early phase. Finally, based on a proposed AUV classification method, we propose a cooperative tracking algorithm called the Dynamic-Switching-Based MARL (DSBM)-driven tracking algorithm. Evaluation results demonstrate that the proposed DSBM tracking algorithm performs precise underwater multi-target tracking and compares favorably with many recent approaches across several important metrics.
