S3Simulator: A benchmarking Side Scan Sonar Simulator dataset for Underwater Image Analysis
Kamal Basha S, Athira Nambiar
TL;DR
This paper tackles the shortage of publicly available underwater sonar datasets by presenting S3Simulator, a synthetic side-scan sonar dataset generated via a five-stage pipeline that combines silhouette sourcing, SAM-based segmentation, SelfCAD 3D modeling, Gazebo-based rendering, and computational imaging to produce realistic sonar imagery. The authors demonstrate the utility of synthetic data for underwater image analysis by benchmarking classical ML and DL classifiers, showing that combining real and synthetic data improves performance (e.g., DenseNet121 reaching up to 96% accuracy). Key contributions include a publicly accessible dataset of 600 ship and 600 plane images, an integrated pipeline linking 3D modeling with physics-based sonar rendering, and validation across real and simulated domains with insights on shadows and nadir-zone realism. The work positions S3Simulator as a scalable, economical resource to accelerate AI development for marine exploration and surveillance, with promising avenues for future improvements such as broader object classes and diffusion-based augmentation.
Abstract
Acoustic sonar imaging systems are widely used for underwater surveillance in both civilian and military sectors. However, acquiring high-quality sonar datasets for training Artificial Intelligence (AI) models is hindered by limited data availability, financial constraints, and data confidentiality. To overcome these challenges, we propose a novel benchmark dataset of Simulated Side-Scan Sonar images, which we term the 'S3Simulator dataset'. Our dataset creation uses advanced simulation techniques to accurately replicate underwater conditions and produce diverse synthetic sonar imagery. In particular, the Segment Anything Model (SAM), a state-of-the-art AI segmentation tool, is used to isolate and segment object images, such as ships and planes, from real scenes. The Computer-Aided Design tool SelfCAD is then used to create 3D models of these objects, and the simulation software Gazebo is used to visualize them within realistic environments. Finally, a range of computational imaging techniques is applied to improve the quality of the data, making it suitable for AI-based analysis of sonar images. Extensive analyses are carried out on S3Simulator as well as real sonar datasets to validate the performance of AI models for underwater object classification. Our experimental results indicate that the S3Simulator dataset is a promising benchmark for research on underwater image analysis. The dataset is available at https://github.com/bashakamal/S3Simulator.
