Underwater Acoustic Target Recognition based on Smoothness-inducing Regularization and Spectrogram-based Data Augmentation

Ji Xu; Yuan Xie; Wenchao Wang

TL;DR

The paper tackles underwater acoustic target recognition under limited data, where traditional data augmentation risks biasing models toward simulated distributions that deviate from real conditions. It introduces two strategies: smoothness-inducing regularization, which uses simulated signals only in a KL-divergence-based regularization term to smooth decision boundaries, and Local Masking and Replicating (LMR), a spectrogram-specific augmentation that creates mixed inputs to capture inter-class relationships. Implemented on a ResNet-18 with multi-head attention and evaluated on Shipsear, DeepShip, and a private DTIL dataset, the approach yields consistent improvements: LMR provides substantial gains in data-scarce settings, and combining it with the regularization gives the best performance. Visualizations (confusion matrices and CAM) support improved generalization and reduced local bias, and robustness analyses show enhanced resilience to perturbations. Limitations include limited gains in data-rich regimes and potential negative effects of LMR on some tasks beyond recognition. The work provides a practical benchmark and demonstrates that carefully designed regularization and spectrogram-based augmentation can achieve robust underwater target recognition with limited data.
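
As a rough illustration of the regularization idea summarized above, the sketch below computes a loss in which simulated signals contribute only through a KL-divergence term and never through the classification term. It is not the authors' implementation: the function names, the weight `alpha`, the direction of the KL term, and the use of plain NumPy logits in place of a real network are all assumptions made for this example.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def cross_entropy(logits, labels):
    """Mean cross-entropy for integer class labels."""
    p = softmax(logits)
    picked = p[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))


def kl_divergence(p, q, eps=1e-12):
    """Mean KL(p || q) over the batch."""
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))


def smooth_reg_loss(logits_real, logits_sim, labels, alpha=1.0):
    """Classification loss on real signals plus a smoothness penalty.

    logits_real: model outputs for real recordings, shape (batch, classes)
    logits_sim:  model outputs for the simulated (e.g. noisy) versions
    alpha:       hypothetical weight of the regularization term
    """
    # Only the real signals are compared against the ground-truth labels.
    ce = cross_entropy(logits_real, labels)
    # Simulated signals enter only here: their predictions are pulled toward
    # the predictions on the corresponding real signals.
    reg = kl_divergence(softmax(logits_real), softmax(logits_sim))
    return ce + alpha * reg


# Toy usage with made-up logits for a 3-class problem.
logits_real = np.array([[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]])
logits_sim = logits_real + np.array([0.0, 1.0, -1.0])
labels = np.array([0, 1])
print(smooth_reg_loss(logits_real, logits_sim, labels, alpha=0.5))
```

Because the labels never touch the simulated branch, an unrealistic simulation can at worst add a consistency penalty; it cannot teach the classifier a wrong label directly.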

Abstract

Underwater acoustic target recognition is a challenging task owing to the intricate underwater environments and limited data availability. Insufficient data can hinder the ability of recognition systems to support complex modeling, thus impeding their advancement. To improve the generalization capacity of recognition models, techniques such as data augmentation have been employed to simulate underwater signals and diversify data distribution. However, the complexity of underwater environments can cause the simulated signals to deviate from real scenarios, resulting in biased models that are misguided by non-true data. In this study, we propose two strategies to enhance the generalization ability of models in the case of limited data while avoiding the risk of performance degradation. First, as an alternative to traditional data augmentation, we utilize smoothness-inducing regularization, which only incorporates simulated signals in the regularization term. Additionally, we propose a specialized spectrogram-based data augmentation strategy, namely local masking and replicating (LMR), to capture inter-class relationships. Our experiments and visualization analysis demonstrate the superiority of our proposed strategies.
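
The abstract does not spell out how local masking and replicating operates, so the following is only a minimal sketch of one plausible reading: a rectangular time-frequency region of one spectrogram is masked and filled with the same region replicated from a second spectrogram, and the labels are mixed in proportion to the area kept from each. The function name, the `max_frac` parameter, the same-location patch choice, and the area-based label mixing are assumptions for illustration, not details taken from the paper.

```python
import numpy as np


def local_mask_and_replicate(spec_a, spec_b, label_a, label_b, rng, max_frac=0.5):
    """Mix two spectrograms by replacing a local patch of spec_a with spec_b.

    spec_a, spec_b:   arrays of identical shape (freq_bins, time_frames)
    label_a, label_b: one-hot label vectors
    rng:              numpy.random.Generator
    max_frac:         hypothetical upper bound on patch size per axis
    """
    n_freq, n_time = spec_a.shape
    # Sample the patch height and width (at least one bin / one frame).
    h = int(rng.integers(1, max(2, int(n_freq * max_frac) + 1)))
    w = int(rng.integers(1, max(2, int(n_time * max_frac) + 1)))
    # Sample the top-left corner so the patch stays inside the spectrogram.
    f0 = int(rng.integers(0, n_freq - h + 1))
    t0 = int(rng.integers(0, n_time - w + 1))

    mixed = spec_a.copy()
    # Mask the region in spec_a and replicate the matching region of spec_b.
    mixed[f0:f0 + h, t0:t0 + w] = spec_b[f0:f0 + h, t0:t0 + w]

    # Fraction of the mixed input that still comes from spec_a.
    lam = 1.0 - (h * w) / (n_freq * n_time)
    mixed_label = lam * label_a + (1.0 - lam) * label_b
    return mixed, mixed_label, lam


# Toy usage with random "spectrograms" of 64 frequency bins and 100 frames.
rng = np.random.default_rng(0)
spec_a = rng.random((64, 100))
spec_b = rng.random((64, 100))
label_a = np.array([1.0, 0.0, 0.0])
label_b = np.array([0.0, 0.0, 1.0])
mixed, mixed_label, lam = local_mask_and_replicate(spec_a, spec_b, label_a, label_b, rng)
print(mixed.shape, mixed_label, lam)
```

The soft label is what lets a mixed input carry information about two classes at once, which is one way an augmentation of this kind could expose inter-class relationships to the model.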

Paper Structure (19 sections, 6 equations, 7 figures, 11 tables)

This paper contains 19 sections, 6 equations, 7 figures, 11 tables.

Introduction
Methodology
Data Preprocessing and Feature Extraction
Backbone Model
Smoothness-inducing Regularization
Local Masking and Replicating
Optimization objective
Experiment Setup
Datasets
Division of data
Effective frequency bands
Parameters Setup
Results and analysis
Main results
Visualization
...and 4 more sections

Figures (7)

Figure 1: The pipeline of our preprocessing process, along with the generation of noisy signals and the extraction of acoustic features.
Figure 2: The structure of our backbone model: ResNet-18 with multi-head attention. "Conv" represents the convolutional layer, "BN" represents the two-dimensional batch normalization layer and "FC" represents the fully-connected layer.
Figure 3: Comparison of the data augmentation and our smoothness-inducing regularization. For brevity, we omit the backpropagation of loss.
Figure 4: An example for local masking and replicating (LMR). The samples in the figure are selected from two 30-second segments in Shipsear.
Figure 5: Confusion matrix heat map for two datasets. The three subplots represent: 1. baseline; 2. with smooth reg; 3. with smooth reg and LMR. "smooth reg" represents smoothness-inducing regularization for short.
...and 2 more figures
