A Two-Stage Band-Split Mamba-2 Network For Music Separation

Jinglin Bai, Yuan Fang, Jiajie Wang, Xueliang Zhang

TL;DR

This paper applies Mamba-2 with a two-stage strategy, which introduces residual mapping based on the mask method, effectively compensating for the details absent in the mask and further improving separation performance.

Abstract

Music source separation (MSS) aims to separate mixed music into its distinct tracks, such as vocals, bass, and drums. MSS is considered a challenging audio separation task because of the complexity of music signals. RNN and Transformer architectures, although imperfect, are commonly used to model music sequences in MSS. Mamba-2 has recently demonstrated high efficiency in various sequence modeling tasks, but its advantages have not yet been investigated in MSS. This paper applies Mamba-2 with a two-stage strategy, which introduces residual mapping based on the mask method, effectively compensating for the details absent in the mask and further improving separation performance. Experiments confirm the superiority of bidirectional Mamba-2 and the effectiveness of the two-stage network in MSS. The source code is publicly accessible at https://github.com/baijinglin/TS-BSmamba2.
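
The idea of adding a residual on top of a mask-based estimate can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`bounded_ratio_mask`, `apply_mask`, `add_residual`), the bounded magnitude mask, and the oracle residual that stands in for a learned second stage are all assumptions made for illustration. It only shows why a bounded mask can leave details unrecovered and how an additive residual could restore them.

```python
# Toy complex "spectrograms" as nested lists (rows = frequency bins, columns = frames).
# All values and helper names below are hypothetical, chosen only for illustration.
mixture = [[1 + 1j, 2 + 0j], [0.5 - 0.5j, -1 + 2j]]
target = [[0.5 + 1j, 2.5 + 0j], [0.5 - 0.5j, -0.2 + 0.1j]]


def bounded_ratio_mask(target, mixture, eps=1e-8):
    """Real-valued magnitude-ratio mask clipped to [0, 1] (an assumed mask type)."""
    return [[min(1.0, abs(t) / (abs(x) + eps)) for t, x in zip(t_row, x_row)]
            for t_row, x_row in zip(target, mixture)]


def apply_mask(mixture, mask):
    """Stage 1: element-wise mask applied to the mixture."""
    return [[m * x for m, x in zip(m_row, x_row)]
            for m_row, x_row in zip(mask, mixture)]


def add_residual(estimate, residual):
    """Stage 2: add a residual term to the masked estimate."""
    return [[s + r for s, r in zip(s_row, r_row)]
            for s_row, r_row in zip(estimate, residual)]


def max_abs_error(a, b):
    """Largest element-wise magnitude difference between two spectrograms."""
    return max(abs(x - y) for a_row, b_row in zip(a, b) for x, y in zip(a_row, b_row))


mask = bounded_ratio_mask(target, mixture)
stage1 = apply_mask(mixture, mask)

# Oracle residual: in a real system a second network would have to predict this.
residual = [[t - s for t, s in zip(t_row, s_row)]
            for t_row, s_row in zip(target, stage1)]
final = add_residual(stage1, residual)

stage1_error = max_abs_error(stage1, target)
final_error = max_abs_error(final, target)
print(f"stage-1 error: {stage1_error:.3f}, error after residual: {final_error:.2e}")
```

Because the mask here is real-valued and bounded, it can only scale each mixture bin, so it cannot correct phase or amplify a bin beyond the mixture magnitude. The residual term has no such restriction, which is the intuition behind pairing the two.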

Paper Structure (12 sections, 4 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: (A) The overall pipeline of TS-BSMAMBA2. (B) The design of the band split module. (C) The BMAMBA2-DualNet, which is built on the BMAMBA2 structure. (D) The design of the band merge module, which plays a different role in each of the two stages. (E) The design of the fusion module, which integrates the features from the first and second stages. (F) Explanation of the symbols used in the overall pipeline.
  • Figure 2: Structure of the BMAMBA2 block
  • Figure 3: Spectrogram examples of the vocal track.