MC-SEMamba: A Simple Multi-channel Extension of SEMamba

Wen-Yuan Ting; Wenze Ren; Rong Chao; Hsin-Yi Lin; Yu Tsao; Fan-Gang Zeng

TL;DR

MC-SEMamba extends SEMamba to multi-channel speech enhancement with a minimal front-end expansion, enabling effective learning of spatial information from microphone arrays. Built on Mamba-based blocks and a MetricGAN+-driven objective, it yields competitive SE metrics on CHiME3 while keeping parameter growth modest. Experimental results show that adding microphones improves performance, with five microphones often delivering optimal PESQ and STOI gains, highlighting practical benefits for compact, high-quality multi-channel SE systems.

Abstract

Transformer-based models have become increasingly popular and have impacted speech-processing research owing to their exceptional performance in sequence modeling. Recently, a promising model architecture, Mamba, has emerged as a potential alternative to transformer-based models because of its efficient modeling of long sequences. In particular, models like SEMamba have demonstrated the effectiveness of the Mamba architecture in single-channel speech enhancement. This paper aims to adapt SEMamba for multi-channel applications with only a small increase in parameters. The resulting system, MC-SEMamba, achieved results on the CHiME3 dataset that were comparable or even superior to those of several previous baseline models. Additionally, we found that increasing the number of microphones from 1 to 6 improved the speech enhancement performance of MC-SEMamba.

Paper Structure (13 sections, 3 equations, 2 figures, 3 tables)

Introduction
Related Works
Mamba
SEMamba
MC-SEMamba
Experimental Results and Analysis
Dataset
Network configurations
Training Specifics
Experimental Analysis
Comparison With Previous Methods
Impact of Choice of Microphones
Conclusion and Future Work

Figures (2)

Figure 1: Diagram of S6
Figure 2: MC-SEMamba generator diagram. The architectural difference between SEMamba and MC-SEMamba is in $\boldsymbol{\mathit{g}}_\mathrm{CNN}$. Different blocks with the same color may have different types of parameters (e.g., kernel size). Operations such as tensor permutation are omitted for simplicity. The learnable sigmoid was proposed in MetricGAN+ (Fu et al., 2021).
