Syn2Real Domain Generalization for Underwater Mine-like Object Detection Using Side-Scan Sonar

Aayush Agrawal, Aniruddh Sikdar, Rajini Makam, Suresh Sundaram, Suresh Kumar Besai, Mahesh Gopi

TL;DR

This work tackles the data scarcity challenge in underwater mine detection by introducing Syn2Real domain generalization, using side-scan sonar (SSS) images generated by diffusion models to augment real data. The authors compare DCGAN with two diffusion models (DDPM and DDIM), tune hyperparameters, and train a Mask R-CNN on seven mixed datasets of synthetic and real SSS images of conical and cylindrical mine-like objects (MLOs). They show that adding synthetic data substantially improves downstream segmentation performance (approximately a 60% gain in AP when synthetic data is combined with real data) and identify DDIM as particularly effective for domain generalization because of the variability it induces. The study also contributes a bespoke MLO SSS dataset and examines the trade-off between image quality metrics (FID/KID) and real-world generalization, with practical implications for deploying underwater mine detection systems.

Abstract

Underwater mine detection with deep learning is limited by the scarcity of real-world data. This scarcity leads to overfitting, where models perform well on training data but poorly on unseen data. This paper proposes a Syn2Real (Synthetic to Real) domain generalization approach using diffusion models to address this challenge. We demonstrate that noisy synthetic data generated by DDPM and DDIM models, even if not perfectly realistic, can effectively augment real-world samples for training. The residual noise in the final sampled images improves the model's ability to generalize to real-world data with inherent noise and high variation. The baseline Mask R-CNN model, when trained on a combination of synthetic and original training datasets, exhibited approximately a 60% increase in Average Precision (AP) compared to training solely on the original training data. This significant improvement highlights the potential of Syn2Real domain generalization for underwater mine detection tasks.
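
To make the idea of training on "a combination of synthetic and original training datasets" concrete, here is a minimal sketch of how such a mixed training list could be assembled. It is an illustration only, not the authors' pipeline: the function name `mix_datasets`, the `synthetic_per_real` ratio, and the file names are all hypothetical, and the paper's actual composition of its seven datasets is not reproduced here.

```python
import random


def mix_datasets(real_paths, synthetic_paths, synthetic_per_real=1.0, seed=0):
    """Return a shuffled list of (path, source) pairs for training.

    synthetic_per_real is a hypothetical knob (not from the paper): the
    number of synthetic samples drawn per real sample, capped by the
    number of synthetic samples available.
    """
    rng = random.Random(seed)  # local RNG so the mix is reproducible
    n_synth = min(len(synthetic_paths),
                  round(len(real_paths) * synthetic_per_real))
    chosen = rng.sample(synthetic_paths, n_synth)  # sample without replacement
    mixed = [(p, "real") for p in real_paths]
    mixed += [(p, "synthetic") for p in chosen]
    rng.shuffle(mixed)
    return mixed


# Hypothetical file names, for illustration only.
real = [f"real_{i}.png" for i in range(10)]
synthetic = [f"ddim_{i}.png" for i in range(50)]
train_list = mix_datasets(real, synthetic, synthetic_per_real=2.0, seed=1)
print(len(train_list))
```

Keeping the source tag alongside each path makes it easy to evaluate on real images only, which is what a Syn2Real comparison needs.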

Paper Structure

This paper contains 8 sections, 7 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1:
  • Figure 2:
  • Figure 3: Synthetically generated images of $M_1$ (conical mines): (a) DDIM, (b) DDPM, (c) GAN; and of $M_2$ (cylindrical mines): (d) DDIM, (e) DDPM, (f) GAN
  • Figure 4: Segmentation Results from Model Trained on Different Datasets