Underwater Human-Robot and Human-Swarm Interaction: A Review and Perspective

Sara Aldhaheri, Federico Renda, Giulia De Masi

TL;DR

This paper surveys underwater human-robot interaction (UHRI) and its extension to human-swarm interaction, focusing on gesture-based control to enable intuitive diver-robot collaboration. It analyzes gesture semantics (static vs. dynamic), front-end perception and back-end interpretation, and publicly available datasets (CADDY, SCUBANet, Glove-based, SUIM, VDD-C), highlighting robustness challenges posed by underwater optics and acoustics. It proposes a framework for UHRI-enabled swarms, leveraging hierarchical master-agent control, digital twins, and metaverse-inspired simulations to scale supervision while maintaining safety. The work identifies remaining gaps (data realism, reliable perception under adverse conditions, and secure human-in-the-loop control) and outlines directions to realize robust, real-time underwater multi-robot collaboration with divers.

Abstract

There has been growing interest in extending the capabilities of autonomous underwater vehicles (AUVs) in subsea missions, particularly in integrating underwater human-robot interaction (UHRI) for control. UHRI and its subfield, underwater gesture recognition (UGR), play a significant role in enhancing diver-robot communication for marine research. This review explores the latest developments in UHRI and examines its promising applications for multi-robot systems. Developments in UGR create opportunities for underwater robots to work alongside human divers and increase their functionality. Human gestures create a seamless and safe collaborative environment in which divers and robots can interact more efficiently. By highlighting the state of the art in this field, we hope to encourage advances in underwater multi-robot systems (UMRS) that blend the natural communication channels of human-robot interaction with the multi-faceted coordination capabilities of underwater swarms, thus enhancing robustness in complex aquatic environments.

Paper Structure (18 sections, 5 figures, 1 table)

Figures (5)

  • Figure 1: Diver using hand gestures to communicate commands to an underwater vehicle [mivskovic2015caddy].
  • Figure 2: Diver hand gestures used in the CADDY dataset [gomez2019caddy].
  • Figure 3: UHRI topics to consider when developing the technology. The literature covers the range of applications (red), the relevant diving gestures (teal), gesture recognition tools (yellow), existing datasets (purple), and the robustness of this technology (green).
  • Figure 4: Individual control of robots in (a) compared to multi-level autonomy in (b).
  • Figure 5: Suggested metaverse framework using UHRI.