Passive Underwater Acoustic Signal Separation based on Feature Decoupling Dual-path Network

Yucheng Liu, Longyu Jiang

TL;DR

The paper tackles the separation of passive underwater ship radiated noise by introducing Indiformer, a time-domain dual-path network that decouples mixed-signal features and uses a GL-Transformer to fuse local and global context. By transforming reshaped signal blocks into a space with more independent features and applying a dual-path processing paradigm, the method addresses the long-range dependencies and non-stationarity inherent in underwater acoustics. Evaluations on ShipsEar and DeepShip show improved signal-to-noise metrics (SNR, SegSNR, SISNRi) over several strong baselines, and ablations support the value of feature decoupling. The approach is applicable to passive sonar analysis and related maritime sensing tasks.
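To make the reported metrics concrete, the following is a minimal NumPy sketch of the widely used scale-invariant SNR and its improvement over the unprocessed mixture, assuming SISNRi here denotes the usual SI-SNR improvement. The function names `si_snr` and `si_snr_improvement` are illustrative and are not taken from the paper, whose evaluation code is not shown on this page.

```python
import numpy as np

def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two 1-D signals (standard definition)."""
    # Remove the mean so a constant offset does not affect the score.
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to obtain the scaled reference.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    s_target = scale * target
    e_noise = estimate - s_target
    ratio = (np.dot(s_target, s_target) + eps) / (np.dot(e_noise, e_noise) + eps)
    return 10.0 * np.log10(ratio)

def si_snr_improvement(estimate, target, mixture):
    """SI-SNR of the separated estimate minus SI-SNR of the raw mixture."""
    return si_snr(estimate, target) - si_snr(mixture, target)

# Synthetic example (not data from the paper): a tone plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
clean = np.sin(2 * np.pi * 50 * t)
noise = rng.standard_normal(t.shape)
mixture = clean + noise
estimate = clean + 0.1 * noise  # stand-in for a separator's output
gain_db = si_snr_improvement(estimate, clean, mixture)
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score unchanged, which is why this metric is often preferred over plain SNR when output gain is arbitrary.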

Abstract

Signal separation in the passive underwater acoustic domain has relied heavily on deep learning techniques to isolate ship radiated noise. However, the separation networks commonly used in this domain stem from speech separation applications and may not fully account for the unique aspects of underwater acoustics, such as the influence of different propagation media, signal frequencies and modulation characteristics. This gap highlights the need for tailored approaches that account for the specific characteristics of underwater sound propagation. This study introduces a novel temporal network designed to separate ship radiated noise by employing a dual-path model and a feature decoupling approach. The mixed signals' features are transformed into a space where they exhibit greater independence, with each dimension's significance decoupled. Subsequently, a fusion of local and global attention mechanisms is employed in the separation layer. Extensive comparisons with other prevalent network models demonstrate the effectiveness of this method on the ShipsEar and DeepShip datasets.
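The dual-path idea mentioned in the abstract can be illustrated with a short sketch of the chunking step that dual-path models generally rely on: a long feature sequence is cut into overlapping blocks so that one path can operate within a block (local context) and the other across blocks (global context). This is a generic illustration under stated assumptions, not the paper's implementation; the 50% overlap, the chunk length, and the names `segment` and `overlap_add` are all hypothetical choices made here.

```python
import numpy as np

def segment(features, chunk_len):
    """Split [N, L] features into 50%-overlapping chunks of shape [N, chunk_len, S]."""
    assert chunk_len % 2 == 0, "this sketch assumes an even chunk length"
    hop = chunk_len // 2
    length = features.shape[1]
    # Pad one hop at each end, plus enough at the end to reach a multiple of hop.
    pad_end = (-length) % hop
    padded = np.pad(features, ((0, 0), (hop, hop + pad_end)))
    num_chunks = padded.shape[1] // hop - 1
    chunks = [padded[:, i * hop : i * hop + chunk_len] for i in range(num_chunks)]
    return np.stack(chunks, axis=-1)

def overlap_add(chunks, length):
    """Invert segment(): merge [N, chunk_len, S] chunks back into [N, length]."""
    n, chunk_len, num_chunks = chunks.shape
    hop = chunk_len // 2
    out = np.zeros((n, (num_chunks + 1) * hop))
    for i in range(num_chunks):
        out[:, i * hop : i * hop + chunk_len] += chunks[:, :, i]
    # Every original sample is covered by exactly two chunks, so halve the sum.
    return out[:, hop : hop + length] / 2.0

# Toy input: 2 feature channels, 11 time steps (hypothetical sizes).
x = np.arange(22, dtype=float).reshape(2, 11)
blocks = segment(x, chunk_len=4)         # shape [2, 4, 7]
restored = overlap_add(blocks, length=11)
```

In a dual-path separator, intra-chunk processing would run along the second axis of `blocks` and inter-chunk processing along the third, with `overlap_add` restoring the sequence afterwards.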

Paper Structure

This paper contains 10 sections, 17 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Network Architecture Overview. Two types of mixed signals are distinguished by red and blue, and different features are represented by triangles and rectangles. Feature decoupling is performed first, and then signal separation is performed through the architecture based on the proposed Multi-scale Dual-path Transformer.
  • Figure 2: The computation strategy for Local and Global Attention in GL-Transformer.
  • Figure 3: The separation results visualized as time-domain waveforms. The top row shows the original audio. The second row shows the separation results obtained from the proposed model (Indiformer). The bottom row shows the absolute difference between each pure signal and its corresponding separated signal.
  • Figure 4: Ablation experiments on the feature decoupling part, evaluated after 30 epochs of training.