TrustEMG-Net: Using Representation-Masking Transformer with U-Net for Surface Electromyography Enhancement

Kuan-Chen Wang, Kai-Chun Liu, Ping-Cheng Yeh, Sheng-Yu Peng, Yu Tsao

TL;DR

This work tackles the problem of robustly denoising surface electromyography (sEMG) signals contaminated by diverse nonstationary noises. It introduces TrustEMG-Net, an end-to-end denoising autoencoder that fuses a U-Net with a Transformer encoder in a representation-masking (RM) framework to capture both local and global signal structure. On Ninapro DB2 with five contaminants and varying SNRs, TrustEMG-Net outperforms traditional IIR/TS/decomposition methods and several NN baselines on multiple signal-quality and feature-extraction metrics, and it generalizes well to unseen contaminant types. The approach offers a robust, generalizable denoising solution for sEMG applications in clinical and human-computer interaction domains, and the ablation results highlight the critical role of the U-Net structure and the RM Transformer in achieving this performance.

Abstract

Surface electromyography (sEMG) is a widely employed bio-signal that captures human muscle activity via electrodes placed on the skin. Several studies have proposed methods to remove sEMG contaminants, as non-invasive measurements render sEMG susceptible to various contaminants. However, these approaches often rely on heuristic-based optimization and are sensitive to the contaminant type. A more potent, robust, and generalized sEMG denoising approach should be developed for various healthcare and human-computer interaction applications. This paper proposes a novel neural network (NN)-based sEMG denoising method called TrustEMG-Net. It leverages the potent nonlinear mapping capability and data-driven nature of NNs. TrustEMG-Net adopts a denoising autoencoder structure by combining U-Net with a Transformer encoder using a representation-masking approach. The proposed approach is evaluated using the Ninapro sEMG database with five common contamination types and signal-to-noise ratio (SNR) conditions. Compared with existing sEMG denoising methods, TrustEMG-Net achieves exceptional performance across the five evaluation metrics, exhibiting a minimum improvement of 20%. Its superiority is consistent under various conditions, including SNRs ranging from -14 to 2 dB and five contaminant types. An ablation study further proves that the design of TrustEMG-Net contributes to its optimality, providing high-quality sEMG and serving as an effective, robust, and generalized denoising solution for sEMG applications.
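The abstract evaluates denoising at input SNRs from -14 to 2 dB and reports improvement relative to the noisy input. As a rough illustration of what those conditions mean, the sketch below scales a contaminant so that the mixture reaches a requested SNR and then measures the SNR gained after enhancement. It is a minimal sketch under stated assumptions: the function names (`snr_db`, `mix_at_snr`, `snr_improvement`) are hypothetical, the signals are synthetic Gaussian stand-ins rather than Ninapro recordings, and the SNR definition is the common power-ratio form, which may differ in detail from the paper's equations.

```python
import numpy as np

def snr_db(clean, noise):
    """SNR in dB: ratio of clean-signal power to noise power."""
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean + scaled noise has the requested SNR."""
    clean_power = np.sum(clean ** 2)
    noise_power = np.sum(noise ** 2)
    gain = np.sqrt(clean_power / (noise_power * 10.0 ** (target_snr_db / 10.0)))
    return clean + gain * noise

def snr_improvement(clean, noisy, enhanced):
    """SNR after enhancement minus SNR before enhancement, both in dB."""
    return snr_db(clean, enhanced - clean) - snr_db(clean, noisy - clean)

rng = np.random.default_rng(0)
clean = rng.standard_normal(2000)  # synthetic stand-in for a clean sEMG segment
noise = rng.standard_normal(2000)  # synthetic stand-in for a contaminant

# Contaminate at the lowest SNR condition mentioned in the abstract.
noisy = mix_at_snr(clean, noise, -14.0)
measured = snr_db(clean, noisy - clean)

# Hypothetical "denoiser" that removes exactly half of the residual amplitude,
# used only to exercise the improvement metric.
enhanced = clean + 0.5 * (noisy - clean)
improvement = snr_improvement(clean, noisy, enhanced)
```

Halving the residual amplitude corresponds to an improvement of 20·log10(2), roughly 6 dB, which is a convenient sanity check for the metric before it is applied to a real denoiser.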

Paper Structure (39 sections, 12 equations, 6 figures, 7 tables)

Figures (6)

  • Figure 1: The architecture of the proposed TrustEMG-Net.
  • Figure 2: Illustration of the (a) representation masking and (b) direct mapping approaches.
  • Figure 3: Performance under different SNR inputs measured using (a) SNR$_{imp}$, (b) RMSE, (c) PRD, (d) RMSE of the ARV, and (e) RMSE of the MF.
  • Figure 4: Waveforms of (a) noisy sEMG and enhanced sEMG using (b) TrustEMG-Net, (c) IIR filter, (d) EMD-based method, (e) CEEMDAN-based method, and (f) VMD-based method. The noisy sEMG segment was at SNR of -0.47 dB, extracted from the 2-s noisy sEMG corrupted with WGN at SNR -2 dB. TrustEMG-Net effectively removed contaminants and reconstructed the sEMG, yielding the highest SNR among all the methods.
  • Figure 5: (a) Effect of the RM approach in TrustEMG-Net. The latent representation is derived from input sEMG contaminated by PLI at an SNR of -14 dB. The difference between the latent representation before and after masking indicates that the mask primarily highlights certain feature dimensions of the representation. The mask preserves features by assigning weights close to one, such as (b) feature 939, and it suppresses features by assigning weights close to zero, such as (c) feature 949.
  • ...and 1 more figure
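Figure 2 contrasts representation masking with direct mapping, and Figure 5 notes that the mask keeps features with weights close to one and suppresses features with weights close to zero. The difference between the two approaches can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: `toy_bottleneck` is a hypothetical single linear map standing in for the Transformer encoder, the sigmoid used to bound the mask is an assumption, and the tensor sizes are arbitrary.

```python
import numpy as np

def sigmoid(x):
    """Squash values into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def toy_bottleneck(z, weight):
    """Hypothetical stand-in for the Transformer encoder: one linear map over features."""
    return z @ weight

def representation_masking(z, weight):
    """Predict a mask and multiply it element-wise with the latent representation."""
    mask = sigmoid(toy_bottleneck(z, weight))  # assumed sigmoid-bounded mask
    return mask * z, mask

def direct_mapping(z, weight):
    """Use the bottleneck output itself as the new latent representation."""
    return toy_bottleneck(z, weight)

rng = np.random.default_rng(0)
z = rng.standard_normal((8, 16))        # (time steps, feature dimensions), arbitrary sizes
weight = rng.standard_normal((16, 16))  # random weights; a real model would learn these

z_rm, mask = representation_masking(z, weight)
z_dm = direct_mapping(z, weight)
```

With masking, every output entry is the corresponding input entry scaled by a weight between zero and one, so the bottleneck can only attenuate features of the encoder's representation. With direct mapping, the bottleneck output replaces the representation outright and is not constrained by the input in this way.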