Underwater Embodied Intelligence for Autonomous Robots: A Constraint-Coupled Perspective on Planning, Control, and Deployment

Jingzehua Xu; Guanwen Xie; Jiwei Tang; Shuai Zhang; Xiaofan Li

Abstract

Autonomous underwater robots are increasingly deployed for environmental monitoring, infrastructure inspection, subsea resource exploration, and long-horizon exploration. Yet, despite rapid advances in learning-based planning and control, reliable autonomy in real ocean environments remains fundamentally constrained by tightly coupled physical limits. Hydrodynamic uncertainty, partial observability, bandwidth-limited communication, and energy scarcity are not independent challenges; they interact within the closed perception-planning-control loop and often amplify one another over time. This Review develops a constraint-coupled perspective on underwater embodied intelligence, arguing that planning and control must be understood within tightly coupled sensing, communication, coordination, and resource constraints in real ocean environments. We synthesize recent progress in reinforcement learning, belief-aware planning, hybrid control, multi-robot coordination, and foundation-model integration through this embodied perspective. Across representative application domains, we show how environmental monitoring, inspection, exploration, and cooperative missions expose distinct stress profiles of cross-layer coupling. To unify these observations, we introduce a cross-layer failure taxonomy spanning epistemic, dynamic, and coordination breakdowns, and analyze how errors cascade across autonomy layers under uncertainty. Building on this structure, we outline research directions toward physics-grounded world models, certifiable learning-enabled control, communication-aware coordination, and deployment-aware system design. By internalizing constraint coupling rather than treating it as an external disturbance, underwater embodied intelligence may evolve from performance-driven adaptation toward resilient, scalable, and verifiable autonomy under real ocean conditions.

Paper Structure (19 sections, 1 equation, 5 figures)

This paper contains 19 sections, 1 equation, 5 figures.

Introduction
An Overview of Underwater Embodied Intelligence
Definition and Scope
Why Underwater Embodiment is Structurally Distinct
A Systems Abstraction: Embodied Autonomy as Constraint-Coupled Optimization
Underwater Embodied Intelligence for Planning of Autonomous Robots
Underwater Embodied Intelligence for Control of Autonomous Robots
Applications
Persistent Environmental Monitoring and Adaptive Sampling
Offshore Infrastructure Inspection and Interaction-Aware Autonomy
Long-Horizon Exploration, Mapping, and Target-Seeking
Communication-Constrained Multi-Robot Cooperation
Challenges and Research Outlook
Environmental Uncertainty and Distribution Shift
Closed-Loop Reliability, Safety, and Verifiability
...and 4 more sections

Figures (5)

Figure 1: Application drivers and environmental constraints shaping underwater robotic autonomy. Representative application scenarios motivating the deployment of autonomous underwater robots include environmental monitoring, subsea resource exploration, and long-term ocean observation. These tasks require persistent operation, large-area coverage, and reduced human supervision. (a) Environmental monitoring of marine ecosystems using autonomous underwater vehicles [1]. (b) Subsea resource exploration and infrastructure inspection in complex underwater environments [7]. (c) Long-term ocean observation requiring sustained and wide-area data collection [147]. To support such missions, most existing systems adopt a modular autonomy architecture integrating perception, planning, and control for environment understanding and decision-making. However, reliable autonomy in real oceans remains challenged by several environmental constraints: (d) hydrodynamic uncertainty from time-varying currents and fluid–vehicle interactions; (e) perception degradation due to turbidity, light attenuation, and sensing limitations; and (f) sparse communication caused by bandwidth-limited and latency-prone acoustic channels.
Figure 2: Structural distinctiveness of underwater embodiment compared to terrestrial and aerial systems. In terrestrial and aerial robotics, environmental interaction primarily acts as external disturbance [148, 149]. By contrast, underwater systems operate within a dense, viscous, and dynamically coupled fluid medium that simultaneously reshapes controllability, observability, and communication reliability [152]. Hydrodynamic effects couple translational and rotational dynamics; optical and acoustic sensing quality depends on motion and environmental conditions; and acoustic communication introduces sparse, delayed coordination. These factors create recursive cross-layer coupling across perception, planning, and control while also constraining coordination through sparse and delayed communication. As a result, autonomy in underwater systems cannot be decomposed into independent modules without structural fragility.
Figure 3: Embodied autonomy as constraint-coupled optimization in underwater systems. Rather than optimizing perception, planning, and control as isolated modules, underwater autonomy is more appropriately understood as a closed-loop, multi-objective regulation process over joint state, belief, and resource spaces. Mission utility, uncertainty regulation, and physical feasibility are structurally coupled: actions that improve task progress may increase localization drift, energy consumption, or dynamic instability, while uncertainty reduction and safety preservation can in turn reshape effective mission performance. The feasible policy manifold therefore represents the set of embodied strategies that balance these competing yet interdependent objectives, whereas sequential modular optimization is more likely to drive the system toward unstable or unsafe regimes under realistic ocean conditions.
Figure 4: Application domains as structural stress tests of embodied autonomy. Representative underwater mission scenarios impose distinct yet partially overlapping constraint profiles that collectively challenge the stability and robustness of embodied autonomous systems. (a) Persistent monitoring places sustained pressure on long-horizon epistemic stability, requiring reliable state estimation and uncertainty management over extended deployments while simultaneously maintaining strict energy endurance constraints. (b) Infrastructure inspection emphasizes proximity-driven safety and dynamic feasibility, where operations near complex structures demand precise motion control, robust collision avoidance, and stable perception under turbulent hydrodynamic conditions. (c) Exploration magnifies the accumulation of epistemic uncertainty and the difficulty of maintaining globally consistent maps in previously unobserved environments, often under limited sensing and intermittent localization cues. (d) Multi-robot cooperation introduces additional system-level complexity through coordination sparsity, communication delays, and cross-agent belief fragmentation, requiring decentralized reasoning and bandwidth-aware information exchange. Although the dominant operational pressures differ across these domains, each scenario ultimately exposes the same underlying structural coupling across mission objectives, uncertainty regulation, physical feasibility, communication reliability, and long-term resource sustainability.
Figure 5: Emerging research directions for resilient underwater embodied intelligence. Future underwater autonomy requires integrated strategies that regulate cross-layer coupling across perception, planning, and control, while explicitly accounting for communication, coordination, and deployment constraints. (a) Dreamer introduces a general model-based reinforcement learning algorithm that learns latent world models of environment dynamics and improves behaviour by imagining future trajectories, illustrating how physics-grounded world models can support anticipatory planning under uncertainty [198]. (b) OceanGPT presents a domain-specific large language model for ocean science that integrates heterogeneous oceanographic data and instruction-generation frameworks, highlighting the potential of foundation models grounded in ocean-domain knowledge [42]. (c) The Leash Actor–Critic (LAC) method proposes a constrained reinforcement learning framework that combines an emergency safety critic with Lagrangian optimization to restrict unsafe actions during AUV path planning, demonstrating how stability and safety constraints can be embedded within learning-enabled control [138]. (d) A collaborative USV–AUV system integrates Fisher-information–based localization with reinforcement-learning-based cooperative planning, illustrating scalable decentralized coordination under communication and environmental constraints [118]. (e) An open-source AUV path-planning benchmarking platform built on realistic underwater simulation environments provides standardized evaluation scenarios and metrics, highlighting the importance of field benchmarking and reproducible validation for underwater autonomy [199].
