A comprehensive review of datasets and deep learning techniques for vision in Unmanned Surface Vehicles
Linh Trinh, Siegfried Mercelis, Ali Anwar
TL;DR
The paper addresses the fragmentation in USV vision research by delivering a comprehensive review of real-world USV vision datasets (38 in total) and a taxonomy of deep learning techniques spanning single-sensor RGB methods and multi-modal fusion. It highlights that object detection and segmentation dominate the dataset landscape, but public 3D data, calibration, and metadata are scarce, limiting foundation models and cross-domain generalization. The analysis reveals strong reliance on camera data, uneven dataset openness across regions, and notable gaps in privacy, domain adaptation, and integration of language-model capabilities. The work underscores the practical impact of developing standardized, multi-sensor, and privacy-preserving USV vision resources to enable robust perception and safe autonomous maritime operations.
Abstract
Unmanned Surface Vehicles (USVs) have emerged as a major platform in maritime operations, capable of supporting a wide range of applications. USVs can help reduce labor costs, increase safety, save energy, and carry out difficult unmanned tasks in harsh maritime environments. With the rapid development of USVs, vision tasks such as detection and segmentation have become increasingly important. Datasets play an important role in driving the research and development of reliable vision algorithms for USVs, and a large number of recent studies have accordingly focused on releasing USV vision datasets. Alongside these datasets, a variety of deep learning techniques have been studied specifically for USVs. However, no systematic review yet covers both the datasets and the vision techniques to provide a comprehensive picture of the current state of USV vision, including its limitations and trends. In this study, we provide a comprehensive review of both USV datasets and deep learning techniques for vision tasks. Our review draws on 38 real-world USV vision datasets. Based on a thorough analysis of current datasets and deep learning techniques, we elaborate on several challenges and potential opportunities for research and development in USV vision.
