Attention-Based Beamformer For Multi-Channel Speech Enhancement

Jinglin Bai; Hao Li; Xueliang Zhang; Fei Chen

TL;DR

This paper proposes an attention-based mechanism for estimating the speech and noise spatial covariance matrices (SCMs), which an MVDR beamformer then uses to produce the enhanced speech. Experiments show that the proposed method outperforms the baselines under various conditions while requiring less computation and fewer parameters.

Abstract

Minimum Variance Distortionless Response (MVDR) is a classical adaptive beamformer that theoretically ensures the distortionless transmission of signals from the target direction, which makes it popular in real applications. In practice, its noise reduction performance depends on how accurately the noise and speech spatial covariance matrices (SCMs) are estimated. Time-frequency masks are often used to compute these SCMs. However, most mask-based beamforming methods assume that the sources are stationary and ignore the case of moving sources, which leads to performance degradation. In this paper, we propose an attention-based mechanism to calculate the speech and noise SCMs and then apply MVDR to obtain the enhanced speech. To fully incorporate spatial information, the inplace convolution operator and a frequency-independent LSTM are applied to facilitate SCM estimation. The model is optimized in an end-to-end manner. Experiments demonstrate that the proposed method outperforms the baselines with reduced computation and fewer parameters under various conditions.
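
For orientation, the conventional mask-based pipeline that the abstract contrasts against can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's attention-based estimator: the function names (`masked_scm`, `mvdr_weights`, `apply_beamformer`), the reference-channel form of the MVDR solution, and the diagonal loading are all choices made here for the example, and the SCMs are averaged over the whole utterance, which is exactly the stationarity assumption the abstract criticizes.

```python
import numpy as np

def masked_scm(Y, mask, eps=1e-8):
    """Mask-weighted spatial covariance matrix per frequency bin.

    Y:    complex STFT of the array signal, shape (C, T, F)
    mask: real-valued time-frequency mask, shape (T, F)
    Returns an array of shape (F, C, C).
    """
    weighted = Y * mask[None, :, :]
    # Sum over time of m(t, f) * y(t, f) y(t, f)^H for every frequency bin.
    scm = np.einsum("ctf,dtf->fcd", weighted, Y.conj())
    return scm / (mask.sum(axis=0)[:, None, None] + eps)

def mvdr_weights(scm_s, scm_n, ref=0, diag_load=1e-6):
    """MVDR weights in the reference-channel form (an assumption here):
    w(f) = (Phi_n^{-1} Phi_s) u / trace(Phi_n^{-1} Phi_s),
    where u is the one-hot vector selecting channel `ref`.
    """
    C = scm_n.shape[-1]
    # Small diagonal loading keeps the noise SCM well conditioned.
    load = diag_load * np.trace(scm_n, axis1=-2, axis2=-1).real / C
    scm_n = scm_n + load[:, None, None] * np.eye(C)
    num = np.linalg.solve(scm_n, scm_s)            # Phi_n^{-1} Phi_s, (F, C, C)
    tr = np.trace(num, axis1=-2, axis2=-1)[:, None]
    return num[:, :, ref] / (tr + 1e-12)           # (F, C)

def apply_beamformer(w, Y):
    """Filter-and-sum: s_hat(t, f) = w(f)^H y(t, f). Returns shape (T, F)."""
    return np.einsum("fc,ctf->tf", w.conj(), Y)
```

Given a multi-channel STFT `Y` and speech and noise masks from any estimator, `apply_beamformer(mvdr_weights(masked_scm(Y, m_s), masked_scm(Y, m_n)), Y)` yields the enhanced spectrogram. When the speech SCM is rank one, these weights pass the target component unchanged as observed at the reference channel, which is the distortionless property the abstract refers to.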

Paper Structure (14 sections, 6 equations, 1 figure, 4 tables)

Introduction
Methodology
Problem definition
An overview of ABIC-MVDR
IGCRN backbone network (IBN)
Inplace Self-Attention Module (ISAM)
Minimum Variance Distortionless Response Beamformer
Experiments
Dataset and setting
Model configuration
Experiment results and analysis
Experimental results
Ablation experiments
Conclusion

Figures (1)

Figure 1: (a) An overview of ABIC-MVDR. (b) Structure of encoder and decoder. (c) Inplace self-attention module (ISAM).
