Never too Prim to Swim: An LLM-Enhanced RL-based Adaptive S-Surface Controller for AUVs under Extreme Sea Conditions
Guanwen Xie, Jingzehua Xu, Yimian Ding, Zhi Zhang, Shuai Zhang, Yi Li
TL;DR
This work tackles robust, high-precision AUV control under extreme sea conditions by combining reinforcement learning (RL) with an adaptive S-surface controller and augmenting training with large language model (LLM) guidance. The proposed architecture has three modules: RL-based S-surface control, LLM-enhanced joint optimization, and environment-aware simulation. The RL policy learns task-level behavior, while a sigmoid-based surface controller produces accurate, disturbance-rejecting low-level control signals. The RL problem is formulated as a Markov decision process with control-affine dynamics, and the reward structure and surface parameters are optimized jointly. On 3D data collection and target-tracking tasks, the approach outperforms PID and SMC baselines under both standard and intensified disturbance scenarios. The results suggest practical potential for robust AUV operation in complex, unpredictable ocean environments; future work focuses on real-world implementation and sim2real transfer.
Abstract
The adaptivity and maneuvering capabilities of Autonomous Underwater Vehicles (AUVs) have drawn significant attention in oceanic research, because AUVs face unpredictable disturbances and strong coupling among their degrees of freedom. In this paper, we develop a large language model (LLM)-enhanced, reinforcement learning (RL)-based adaptive S-surface controller for AUVs. Specifically, LLMs are introduced for the joint optimization of controller parameters and reward functions in RL training. Using multi-modal and structured explicit task feedback, the LLMs adjust both jointly, balance multiple objectives, and enhance task-oriented performance and adaptability. In the proposed controller, the RL policy focuses on upper-level tasks and outputs task-oriented high-level commands, which the S-surface controller then converts into control signals, ensuring cancellation of nonlinear effects and unpredictable external disturbances in extreme sea conditions. Under extreme sea conditions involving complex terrain, waves, and currents, the proposed controller demonstrates superior performance and adaptability in high-level tasks such as underwater target tracking and data collection, outperforming traditional PID and SMC controllers.
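To make the low-level stage concrete, the following is a minimal sketch of a sigmoid-shaped control surface. It assumes the form commonly given for S-surface control in the AUV literature, u = 2 / (1 + exp(-k1*e - k2*de)) - 1, where e is the tracking error and de its rate. The abstract does not state the exact law used in this paper, so the formula, the gains k1 and k2, the scaling u_max, and the function names s_surface and track_step are all illustrative assumptions rather than the authors' implementation.

```python
import math


def s_surface(error, error_rate, k1, k2):
    """Normalized control output in [-1, 1] from a sigmoid surface.

    Assumed law: u = 2 / (1 + exp(-k1*e - k2*de)) - 1.
    This equals tanh((k1*e + k2*de) / 2), which avoids overflow
    in exp() when the error is large.
    """
    z = k1 * error + k2 * error_rate
    return math.tanh(z / 2.0)


def track_step(target, measured, prev_error, dt, k1=3.0, k2=1.0, u_max=1.0):
    """One control step: turn a high-level target into a bounded command.

    `target` stands in for the high-level command an RL policy would emit
    (hypothetical here); k1, k2 and u_max are illustrative values only.
    Returns the scaled command and the current error for the next step.
    """
    error = target - measured
    error_rate = (error - prev_error) / dt  # finite-difference error rate
    u = u_max * s_surface(error, error_rate, k1, k2)
    return u, error


if __name__ == "__main__":
    # Hold a constant unit error: the command saturates smoothly below u_max.
    u, err = track_step(target=1.0, measured=0.0, prev_error=1.0, dt=0.1)
    print(u, err)
```

In this sketch the gains k1 and k2 shape how steeply the output saturates, which is the kind of surface parameter that the abstract describes as being tuned jointly with the reward function.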
