DeWinder: Single-Channel Wind Noise Reduction using Ultrasound Sensing

Kuang Yuan, Shuo Han, Swarun Kumar, Bhiksha Raj

TL;DR

This paper proposes DeWinder, a multi-modal deep-learning framework that fuses ultrasonic Doppler features with speech signals for wind noise reduction, and shows that it can significantly improve the noise reduction capabilities of state-of-the-art speech enhancement models.

Abstract

The quality of audio recordings in outdoor environments is often degraded by the presence of wind. Mitigating the impact of wind noise on the perceptual quality of single-channel speech remains a significant challenge because wind noise is non-stationary. Prior work in noise suppression treats wind noise as generic background noise without explicitly modeling its characteristics. In this paper, we leverage ultrasound as an auxiliary modality to explicitly sense the airflow and characterize the wind noise. We propose a multi-modal deep-learning framework that fuses ultrasonic Doppler features with speech signals for wind noise reduction. Our results show that DeWinder can significantly improve the noise reduction capabilities of state-of-the-art speech enhancement models.
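
To make the idea of an ultrasonic side channel more concrete, the following is a minimal sketch of how spectral features around an ultrasonic carrier might be extracted from a single recording. It is not the paper's feature extractor: the function names, the 20 kHz carrier, the 48 kHz sample rate, the band and guard widths, and the frame sizes are all assumptions chosen for illustration, and the phase-modulated tone is only a synthetic stand-in for a disturbed carrier, not a physical model of airflow.

```python
import numpy as np


def carrier_band_spectrogram(x, fs, carrier_hz, half_band_hz=500.0,
                             frame=2048, hop=512):
    """Magnitude STFT restricted to bins within half_band_hz of the carrier.

    Returns (mags, band_freqs) where mags has shape (num_frames, num_bins).
    All parameter defaults are illustrative assumptions.
    """
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    freqs = np.fft.rfftfreq(frame, d=1.0 / fs)
    band = np.abs(freqs - carrier_hz) <= half_band_hz
    mags = np.empty((n_frames, int(band.sum())))
    for i in range(n_frames):
        seg = x[i * hop: i * hop + frame] * window
        mags[i] = np.abs(np.fft.rfft(seg))[band]
    return mags, freqs[band]


def sideband_energy_ratio(mags, band_freqs, carrier_hz, guard_hz=100.0):
    """Fraction of in-band power lying more than guard_hz from the carrier."""
    power = mags ** 2
    side = np.abs(band_freqs - carrier_hz) > guard_hz
    return float(power[:, side].sum() / power.sum())


# Synthetic comparison (hypothetical values throughout).
fs = 48_000                      # assumed sample rate
carrier = 20_000.0               # assumed ultrasonic carrier frequency
t = np.arange(fs) / fs           # one second of samples

clean = np.sin(2 * np.pi * carrier * t)
# Phase modulation spreads energy into sidebands around the carrier.
disturbed = np.sin(2 * np.pi * carrier * t
                   + 0.5 * np.sin(2 * np.pi * 200.0 * t))

mags_clean, band_freqs = carrier_band_spectrogram(clean, fs, carrier)
mags_dist, _ = carrier_band_spectrogram(disturbed, fs, carrier)

ratio_clean = sideband_energy_ratio(mags_clean, band_freqs, carrier)
ratio_disturbed = sideband_energy_ratio(mags_dist, band_freqs, carrier)
```

In this toy setup the disturbed carrier carries a noticeably larger share of its in-band energy away from the carrier frequency than the clean tone does. A per-frame feature of this kind is the sort of auxiliary input a fusion network could consume alongside the speech signal, though the features and fusion actually used by DeWinder are described in the paper itself.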

Paper Structure

This paper contains 13 sections, 3 equations, 5 figures, and 1 table.

Figures (5)

  • Figure 1: DeWinder uses ultrasound to sense and reduce wind noise.
  • Figure 2: The airflow induces wind noise, while shaping the ultrasound transmission.
  • Figure 3: Modular design that can be adapted to existing speech enhancement models.
  • Figure 4: Fusion module for DEMUCS based on masking.
  • Figure 5: Performance improvement at different SNRs.