DHEA-MECD: An Embodied Intelligence-Powered DRL Algorithm for AUV Tracking in Underwater Environments with High-Dimensional Features

Kai Tian, Chuan Lin, Guangjie Han, Chen An, Qian Zhu, Shengzhao Zhu, Zhenyu Wang

TL;DR

The paper tackles robust multi-target tracking for AUVs in underwater environments with high-dimensional, heterogeneous sensory inputs. It introduces a hierarchical Embodied Intelligence (EI) architecture and the DHEA-MECD algorithm, which combines a Double-Head Encoder-Attention framework with a Multi-Expert Collaborative Decision mechanism that uses motion-stage-aware Top-k expert selection. Key contributions include structured state extraction from the space, motion, barrier, and noise subspaces; an efficient hybrid action policy built on expert collaboration and sparse updates; and a reward design that balances pursuit, safety, and constraints. Empirical results show faster convergence, higher tracking accuracy, and better obstacle avoidance than the baselines as feature dimensionality increases, and ablations confirm that both representation learning and collaborative decision-making matter. The work advances robust, efficient underwater tracking and sets the stage for future multi-AUV coordination and sim-to-real validation.

Abstract

In recent years, autonomous underwater vehicle (AUV) systems have demonstrated significant potential in complex marine exploration. However, effective AUV-based tracking remains challenging in realistic underwater environments characterized by high-dimensional features, such as coupled kinematic states, spatial constraints, and time-varying environmental disturbances. To address these challenges, this paper proposes a hierarchical embodied-intelligence (EI) architecture for AUV-based multi-target tracking in complex underwater environments. Built upon this architecture, we introduce the Double-Head Encoder-Attention-based Multi-Expert Collaborative Decision (DHEA-MECD), a novel Deep Reinforcement Learning (DRL) algorithm designed to support efficient and robust multi-target tracking. Specifically, in DHEA-MECD, a Double-Head Encoder-Attention-based information extraction framework is designed to semantically decompose raw sensory observations and explicitly model complex dependencies among heterogeneous features, including spatial configurations, kinematic states, structural constraints, and stochastic perturbations. On this basis, a motion-stage-aware multi-expert collaborative decision mechanism with a Top-k expert selection strategy is introduced to support stage-adaptive decision-making. Furthermore, we propose the DHEA-MECD-based underwater multi-target tracking algorithm to enable intelligent, stable, and interference-resistant multi-target tracking by AUVs. Extensive experimental results demonstrate that the proposed approach achieves superior tracking success rates, faster convergence, and improved motion optimality compared with mainstream DRL-based methods, particularly in complex and disturbance-rich marine environments.
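
The abstract does not specify how the Top-k expert selection is computed. As a rough illustration of the general idea only, the sketch below shows a generic Top-k gate of the kind used in mixture-of-experts models: keep the k highest-scoring experts, renormalize their scores with a softmax, and blend their proposed actions. The function names (`top_k_gate`, `combine_actions`), the gating scores, and the two-dimensional expert actions are all hypothetical and are not taken from the paper.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gate(scores, k):
    """Return {expert_index: weight} for the k highest-scoring experts.

    Weights are a softmax over the selected scores only, so they sum to 1
    and every unselected expert implicitly gets weight 0.
    """
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and the number of experts")
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in top])
    return dict(zip(top, weights))


def combine_actions(expert_actions, gate):
    """Blend the selected experts' action vectors using the gate weights."""
    dim = len(expert_actions[0])
    return [
        sum(w * expert_actions[i][d] for i, w in gate.items())
        for d in range(dim)
    ]


# Hypothetical gating scores for four experts (assumed values; in a
# stage-aware scheme these would depend on the current motion stage).
scores = [2.0, 0.5, 1.0, -1.0]
# Hypothetical 2-D actions proposed by each expert (assumed values).
expert_actions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, -1.0]]

gate = top_k_gate(scores, k=2)        # selects experts 0 and 2
action = combine_actions(expert_actions, gate)
```

Because only the selected experts contribute to the blended action, only they would receive gradient in a training setup, which is one common reading of "sparse updates"; whether DHEA-MECD works exactly this way is not stated in this summary.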

Paper Structure (20 sections, 40 equations, 8 figures, 2 tables, 2 algorithms)

Figures (8)

  • Figure 1: Proposed hierarchical EI architecture
  • Figure 2: Proposed information extraction framework for high-dimensional features in DHEA-MECD
  • Figure 3: Proposed Multi-Expert Collaborative Decision Mechanism in DHEA-MECD
  • Figure 4: Convergence speed comparison
  • Figure 5: Accuracy rate comparison
  • ...and 3 more figures