MoEMba: A Mamba-based Mixture of Experts for High-Density EMG-based Hand Gesture Recognition

Mehran Shabanpour; Kasra Rad; Sadaf Khademi; Arash Mohammadi

TL;DR

The paper tackles inter-session variability in HD-sEMG gesture recognition by introducing MoEMba, a Mamba-based Mixture of Experts (MoE) framework that combines Selective State-Space Models with wavelet feature modulation and channel attention. The approach captures long-range temporal dependencies and cross-channel interactions while remaining computationally efficient enough for real-time use. It achieves a balanced accuracy of 56.9% on CapgMyo DB-b and is robust to session shifts. Key contributions include the first application of Mamba to HD-sEMG gesture recognition, an adaptive MoE design with sparsity and balance constraints, and the integration of wavelet feature modulation (WTFM) to fuse time-domain and frequency-domain information. The results show competitive performance at lower complexity than transformer-based models, pointing to a practical path toward reliable HD-sEMG-driven HCI, prosthetics control, and neuromuscular applications. Future work may explore synthetic data augmentation and multi-modal integration.

Abstract

High-Density surface Electromyography (HD-sEMG) has emerged as a pivotal resource for Human-Computer Interaction (HCI), offering direct insight into muscle activity and motion intention. However, a significant challenge in practical implementations of HD-sEMG-based models is the low accuracy of inter-session and inter-subject classification. Variability between sessions can reach up to 40% due to the inherent temporal variability of HD-sEMG signals. Targeting this challenge, the paper introduces the MoEMba framework, a novel approach leveraging Selective State-Space Models (SSMs) to enhance HD-sEMG-based gesture recognition. The MoEMba framework captures temporal dependencies and cross-channel interactions through channel attention techniques. Furthermore, wavelet feature modulation is integrated to capture multi-scale temporal and spatial relations, improving signal representation. Experimental results on the CapgMyo HD-sEMG dataset demonstrate that MoEMba achieves a balanced accuracy of 56.9%, outperforming its state-of-the-art counterparts. The framework's robustness to session-to-session variability and its efficient handling of high-dimensional multivariate time series data highlight its potential for advancing HD-sEMG-powered HCI systems.
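The abstract reports balanced accuracy but does not spell out the formula. As a minimal sketch, assuming the standard definition (the unweighted mean of per-class recall), the metric can be computed as below. The function name `balanced_accuracy` and the toy labels are illustrative assumptions and are not taken from the paper or from CapgMyo.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed standard definition).

    Only classes that occur in y_true contribute to the average.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy, hypothetical gesture labels: class 0 dominates and class 1 is missed.
y_true = [0, 0, 0, 0, 1]
y_pred = [0, 0, 0, 0, 0]
print(balanced_accuracy(y_true, y_pred))  # 0.5, while plain accuracy is 0.8
```

Under this definition every gesture class carries equal weight, so a model cannot score well simply by favoring the most frequent gestures.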
