A Gated Cross-domain Collaborative Network for Underwater Object Detection
Linhui Dai, Hong Liu, Pinhao Song, Mengyuan Liu
TL;DR
This paper tackles underwater object detection under low-visibility conditions by introducing GCC-Net, a cross-domain framework that jointly processes raw and UIE-enhanced images. It pairs a real-time online UIE method, water-MSR, with a cross-domain feature interaction (CFI) module based on multi-head cross-attention and a gated feature fusion (GFF) mechanism that adaptively fuses complementary information from the two domains. The approach is evaluated on four underwater datasets (DUO, Brackish, TrashCan, WPBB), where it achieves state-of-the-art performance across diverse scenes, with real-time inference suitable for deployment on AUVs. The work proposes a cross-domain paradigm for underwater perception that may also be relevant to other computer vision tasks requiring robust cross-domain information exchange.
Abstract
Underwater object detection (UOD) plays a significant role in aquaculture and marine environmental protection. Given the low contrast and low-light conditions of underwater environments, several underwater image enhancement (UIE) methods have been proposed to improve the quality of underwater images. However, using only the enhanced images does not improve UOD performance, since enhancement may unavoidably remove or alter critical patterns and details of underwater objects. Instead, we believe that exploiting the complementary information from the raw and enhanced domains is beneficial for UOD. The raw image preserves the natural characteristics of the scene and the texture information of the objects, while the enhanced image improves the visibility of underwater objects. Based on this perspective, we propose a Gated Cross-domain Collaborative Network (GCC-Net) to address the challenges of poor visibility and low contrast in underwater environments. GCC-Net comprises three dedicated components. Firstly, a real-time UIE method is employed to generate enhanced images, which improves the visibility of objects in low-contrast areas. Secondly, a cross-domain feature interaction module is introduced to facilitate interaction between raw and enhanced image features and to mine their complementary information. Thirdly, to prevent contamination by unreliable enhanced results, a gated feature fusion module is proposed to adaptively control the fusion ratio of cross-domain information. Our method presents a new UOD paradigm from the perspective of cross-domain information interaction and fusion. Experimental results demonstrate that the proposed GCC-Net achieves state-of-the-art performance on four underwater datasets.
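To make the idea of adaptively controlling the fusion ratio concrete, the following is a minimal sketch of a per-channel gate, not the GFF module itself. The function name `gated_fusion`, the per-channel weights `w_raw`, `w_enh`, `bias`, and the specific gate formula are all assumptions made for illustration; the abstract does not specify how the gate is computed.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(
    raw: List[float],
    enhanced: List[float],
    w_raw: List[float],
    w_enh: List[float],
    bias: List[float],
) -> List[float]:
    """Fuse raw and enhanced feature vectors channel by channel.

    Hypothetical gate (an assumption, not taken from the paper):
        gate  = sigmoid(w_raw * r + w_enh * e + bias)
        fused = gate * e + (1 - gate) * r
    A gate near 0 falls back to the raw feature, which is one way to
    limit the influence of an unreliable enhanced feature.
    """
    fused = []
    for r, e, wr, we, b in zip(raw, enhanced, w_raw, w_enh, bias):
        gate = sigmoid(wr * r + we * e + b)
        fused.append(gate * e + (1.0 - gate) * r)
    return fused


# Zero weights and zero bias give gate = 0.5, i.e. a plain average.
balanced = gated_fusion([1.0, 2.0], [3.0, 6.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])

# A strongly negative bias closes the gate, so the output stays close to raw.
closed = gated_fusion([1.0, 2.0], [3.0, 6.0], [0.0, 0.0], [0.0, 0.0], [-50.0, -50.0])
```

In a trained network the gate parameters would be learned and would typically operate on feature maps rather than flat vectors; the scalar form here only illustrates how a gate value between 0 and 1 sets the mixing ratio of the two domains.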
