Underwater Acoustic Target Recognition based on Smoothness-inducing Regularization and Spectrogram-based Data Augmentation

Ji Xu, Yuan Xie, Wenchao Wang

TL;DR

The paper tackles underwater acoustic target recognition under limited data, where traditional data augmentation risks biasing models toward simulated distributions that deviate from real conditions. It introduces two strategies: smoothness-inducing regularization, which uses simulated signals only in a KL-divergence-based regularization term to smooth decision boundaries, and local masking and replicating (LMR), a spectrogram-specific augmentation that creates mixed inputs to capture inter-class relationships. Implemented on a ResNet-18 with multi-head attention and evaluated on Shipsear, DeepShip, and a private DTIL dataset, the approach yields consistent improvements: LMR provides substantial gains in data-scarce settings, and combining it with the regularization gives the best performance. Visualizations (confusion matrices and CAM) indicate improved generalization and reduced local bias, and robustness analyses show greater resilience to perturbations. Limitations include small gains in data-rich regimes and possible negative effects of LMR on tasks other than recognition. The work provides a practical benchmark and shows that carefully designed regularization and spectrogram-based augmentation can make underwater target recognition more robust when data is limited.
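
The regularization idea summarized above can be illustrated with a minimal sketch. It assumes a per-sample loss made of cross-entropy on the real signal plus a weighted KL divergence between the model's predictions on the real signal and on its simulated counterpart. The function names, the weight `alpha`, and the direction of the KL term are illustrative assumptions, not details taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def cross_entropy(probs, label, eps=1e-12):
    """Negative log-likelihood of the true class index."""
    return -math.log(probs[label] + eps)

def smooth_reg_loss(clean_logits, simulated_logits, label, alpha=1.0):
    """Hypothetical loss: the simulated signal never receives a label.

    Only the real (clean) signal enters the supervised term; the simulated
    signal appears solely in the KL penalty. `alpha` is an assumed weight.
    """
    p_clean = softmax(clean_logits)
    p_sim = softmax(simulated_logits)
    supervised = cross_entropy(p_clean, label)
    penalty = kl_divergence(p_clean, p_sim)
    return supervised + alpha * penalty
```

When the model predicts the same distribution for the real and simulated inputs, the penalty vanishes and the loss reduces to plain cross-entropy; the further the two predictions drift apart, the larger the penalty. This is one way to read "smoothing the decision boundary" without ever training on simulated signals as labeled examples.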

Abstract

Underwater acoustic target recognition is a challenging task owing to the intricate underwater environments and limited data availability. Insufficient data can hinder the ability of recognition systems to support complex modeling, thus impeding their advancement. To improve the generalization capacity of recognition models, techniques such as data augmentation have been employed to simulate underwater signals and diversify data distribution. However, the complexity of underwater environments can cause the simulated signals to deviate from real scenarios, resulting in biased models that are misguided by non-true data. In this study, we propose two strategies to enhance the generalization ability of models in the case of limited data while avoiding the risk of performance degradation. First, as an alternative to traditional data augmentation, we utilize smoothness-inducing regularization, which only incorporates simulated signals in the regularization term. Additionally, we propose a specialized spectrogram-based data augmentation strategy, namely local masking and replicating (LMR), to capture inter-class relationships. Our experiments and visualization analysis demonstrate the superiority of our proposed strategies.
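
As a rough illustration of local masking and replicating, the sketch below overwrites a randomly placed time-frequency patch of one spectrogram with the patch at the same position in a second spectrogram. The abstract does not specify how patches are chosen or how labels are handled, so the patch placement, the area-proportional label mixing (borrowed from CutMix-style augmentation), and all names here are assumptions made for illustration only.

```python
import random

def lmr(spec_a, spec_b, label_a, label_b, patch_f, patch_t, rng):
    """Hypothetical LMR step on two equally sized spectrograms.

    spec_a, spec_b: lists of rows (frequency bins) of floats (time frames).
    label_a, label_b: one-hot (or soft) label vectors.
    patch_f, patch_t: patch height (bins) and width (frames), assumed fixed.
    """
    n_f, n_t = len(spec_a), len(spec_a[0])
    # Pick the top-left corner so the patch stays inside the spectrogram.
    f0 = rng.randrange(n_f - patch_f + 1)
    t0 = rng.randrange(n_t - patch_t + 1)

    # Copy A so the input is left untouched, then mask the local region
    # and replicate the corresponding region of B into it.
    mixed = [row[:] for row in spec_a]
    for f in range(f0, f0 + patch_f):
        mixed[f][t0:t0 + patch_t] = spec_b[f][t0:t0 + patch_t]

    # Assumed label rule: weight each label by the area it contributes.
    lam = 1.0 - (patch_f * patch_t) / (n_f * n_t)
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed, mixed_label
```

For example, with 4 x 8 spectrograms and a 2 x 4 patch, a quarter of the mixed input comes from the second sample, so under the assumed rule the mixed label is 0.75 of the first label plus 0.25 of the second.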

Paper Structure (19 sections, 6 equations, 7 figures, 11 tables)

Figures (7)

  • Figure 1: The pipeline of our preprocessing process, along with the generation of noisy signals and the extraction of acoustic features.
  • Figure 2: The structure of our backbone model: ResNet-18 with multi-head attention. "Conv" represents the convolutional layer, "BN" represents the two-dimensional batch normalization layer and "FC" represents the fully-connected layer.
  • Figure 3: Comparison of the data augmentation and our smoothness-inducing regularization. For brevity, we omit the backpropagation of loss.
  • Figure 4: An example for local masking and replicating (LMR). The samples in the figure are selected from two 30-second segments in Shipsear.
  • Figure 5: Confusion matrix heat maps for two datasets. The three subplots show (1) the baseline, (2) the model with smooth reg, and (3) the model with smooth reg and LMR. "smooth reg" is short for smoothness-inducing regularization.
  • ...and 2 more figures