Revisiting Noise Resilience Strategies in Gesture Recognition: Short-Term Enhancement in Surface Electromyographic Signal Analysis
Weiyu Guo, Ziyue Qiao, Ying Sun, Hui Xiong
TL;DR
This work tackles the challenge of noise resilience in surface EMG gesture recognition by prioritizing short-term information. It introduces STEM, a learnable module that captures local sEMG variations via sliding-window attention, and STET, which fuses long-term context with short-term signals through decoupled decoders and a fusion mechanism. A self-supervised pretraining regime (EIPC) with per-sensor masking and an asymmetric optimization objective further strengthens robustness and discrimination. Evaluations on GRABMyo and Ninapro DB2 show STET achieving superior accuracy and noise robustness, reducing the impact of noise by over 20% relative to strong baselines and demonstrating generalization across gesture recognition and regression tasks with practical deployment potential.
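As a rough illustration of per-sensor masking, the sketch below masks random time spans independently on each sensor channel of an sEMG window. It is a minimal sketch under stated assumptions, not the EIPC procedure itself: the function name `mask_per_sensor`, the parameters `mask_ratio` and `span`, and the choice of zero-filling masked samples are all hypothetical and do not come from the summary above.

```python
import numpy as np

def mask_per_sensor(x, mask_ratio, span, rng):
    """Mask random time spans independently on each sensor channel.

    x          : (T, C) array, T time samples from C sensors
    mask_ratio : approximate fraction of each channel to mask (assumed parameter)
    span       : length in samples of each masked span (assumed parameter)
    rng        : numpy random Generator
    Returns the masked copy of x and the boolean mask (True = masked).
    """
    T, C = x.shape
    mask = np.zeros((T, C), dtype=bool)
    n_spans = int(round(mask_ratio * T / span))
    for c in range(C):
        # Each channel draws its own span positions, so sensors are masked
        # at different times. Spans may overlap, so the realized ratio can
        # be slightly below mask_ratio.
        starts = rng.integers(0, T - span + 1, size=n_spans)
        for s in starts:
            mask[s:s + span, c] = True
    masked = x.copy()
    masked[mask] = 0.0  # zero-fill is an assumption; a learned token is another option
    return masked, mask

rng = np.random.default_rng(0)
window = np.ones((200, 8))  # toy window: 200 samples, 8 sensors
masked, mask = mask_per_sensor(window, mask_ratio=0.3, span=10, rng=rng)
```

A pretraining objective could then ask the model to reconstruct the samples where `mask` is true from the unmasked remainder.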
Abstract
Gesture recognition based on surface electromyography (sEMG) has been gaining importance in many 3D interactive scenes. However, sEMG is easily corrupted by various forms of noise in real-world environments, which makes it difficult to provide stable long-term interaction through sEMG. Existing methods often struggle to enhance model noise resilience through various predefined data augmentation techniques. In this work, we revisit the problem from a short-term enhancement perspective, improving precision and robustness in common noisy scenarios through learnable denoising that uses the intrinsic pattern information of sEMG and sliding-window attention. We propose a Short Term Enhancement Module (STEM) that can be easily integrated with various models. STEM offers several benefits: 1) learnable denoising, enabling noise reduction without manual data augmentation; 2) scalability, being adaptable to various models; and 3) cost-effectiveness, achieving short-term enhancement through minimal weight-sharing in an efficient attention mechanism. In particular, we incorporate STEM into a transformer, creating the Short Term Enhanced Transformer (STET). Compared with the best competing approaches, the impact of noise on STET is reduced by more than 20%. We also report promising results on both classification and regression datasets and demonstrate that STEM generalizes across different gesture recognition tasks.
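To make the idea of sliding-window attention concrete, here is a minimal NumPy sketch in which each time step attends only to neighbors within a fixed window. It illustrates the general technique only and is not the STEM architecture: there are no learned query, key, or value projections and no weight sharing, and the name `sliding_window_attention` and the parameter `window` are assumptions introduced for illustration.

```python
import numpy as np

def sliding_window_attention(x, window):
    """Self-attention restricted to a local band around each time step.

    x      : (T, D) array of T time steps with D features
    window : each step attends to steps at distance <= window (assumed parameter)
    Returns the attended features (T, D) and the attention weights (T, T).
    """
    T, D = x.shape
    # Scaled dot-product scores; x serves as query, key, and value here,
    # whereas a trainable module would apply learned projections first.
    scores = x @ x.T / np.sqrt(D)
    idx = np.arange(T)
    band = np.abs(idx[:, None] - idx[None, :]) <= window
    scores = np.where(band, scores, -np.inf)  # block positions outside the window
    # Numerically stable softmax over each row; masked entries become exactly 0.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ x, weights

rng = np.random.default_rng(0)
features = rng.normal(size=(10, 4))  # toy sequence: 10 steps, 4 features
out, weights = sliding_window_attention(features, window=2)
```

Because every row of `weights` is zero outside the band, each output step depends only on its short-term neighborhood, which is the property a short-term enhancement branch relies on.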
