ARTT: Augmented Reverberant-Target Training for Unsupervised Monaural Speech Dereverberation

Siqi Song; Fulin Wu; Zhong-Qiu Wang

Abstract

Because neither clean reference signals nor spatial cues are available, unsupervised monaural speech dereverberation is a challenging, ill-posed inverse problem. To address it, we propose augmented reverberant-target training (ARTT), which consists of two stages. The first stage, reverberant-target training (RTT), further reverberates the observed reverberant mixture and then trains a deep neural network (DNN), via discriminative training, to recover the observed reverberant mixture from the further-reverberated input. Although the training target is itself reverberant, we find that the resulting DNN effectively reduces reverberation. The second stage applies an online self-distillation mechanism based on the mean-teacher algorithm to further improve dereverberation. Evaluation results show that ARTT achieves strong unsupervised dereverberation performance and significantly outperforms previous baselines.
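The two stages can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The filter model (Gaussian noise under an exponential decay envelope), the function names `synthetic_decay_filter`, `make_rtt_pair`, and `ema_update`, and all parameter values are assumptions made for illustration. The abstract does not specify how the synthetic filter is built, which DNN is used, or which loss is applied.

```python
import numpy as np


def synthetic_decay_filter(length, decay_time, sample_rate, rng):
    """Hypothetical synthetic filter: noise shaped by an exponential decay.

    Assumption: the amplitude envelope drops by 60 dB after `decay_time` seconds.
    """
    t = np.arange(length) / sample_rate
    envelope = 10.0 ** (-3.0 * t / decay_time)
    h = rng.standard_normal(length) * envelope
    h[0] = 1.0  # unit first tap, so the observed signal passes through unscaled
    return h


def make_rtt_pair(observed, h):
    """Stage I data: the input is the further-reverberated signal, the target is the observation."""
    further_reverberated = np.convolve(observed, h)[: len(observed)]
    return further_reverberated, observed


def ema_update(teacher, student, decay=0.999):
    """Stage II (mean teacher): teacher weights track an exponential moving average of the student."""
    return {k: decay * teacher[k] + (1.0 - decay) * student[k] for k in teacher}


# Usage with random placeholder data standing in for a reverberant recording.
rng = np.random.default_rng(0)
observed = rng.standard_normal(16000)
h_syn = synthetic_decay_filter(length=8000, decay_time=0.3, sample_rate=16000, rng=rng)
net_input, target = make_rtt_pair(observed, h_syn)

# Placeholder parameter dictionaries standing in for DNN weights.
teacher = {"w": np.zeros(4)}
student = {"w": np.ones(4)}
teacher = ema_update(teacher, student, decay=0.9)
```

In this sketch a model would be trained to map `net_input` back to `target`. In the second stage, the averaged teacher would supply the targets for self-distillation.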

Paper Structure (12 sections, 10 equations, 4 figures, 2 tables)

Introduction
ARTT
Stage I: Reverberant-Target Training
Stage II: Self-Distillation
Loss Function
Experimental Setup
Dataset and Evaluation Metrics
Method Configuration
Baselines
Evaluation Results
Conclusion
Generative AI Use Disclosure

Figures (4)

Figure 1: ARTT overview.
Figure 2: Example synthetic statistical RTF $h_\text{syn}$ for RTT.
Figure 3: Spectrogram visualization of the reverberant mixture, the clean reference signal (direct-path), and the outputs of ARTT Stage I and ARTT Stage II.
Figure 4: SI-SDR (dB) and PESQ comparison between ARTT Stage I and ARTT Stage II at different ranges of input SNR levels. Best viewed in color.
