From Continuous sEMG Signals to Discrete Muscle State Tokens: A Robust and Interpretable Representation Framework

Yuepeng Chen, Kaili Zheng, Ji Wu, Zhuangzhuang Li, Ye Ma, Dongwei Liu, Chenyi Guo, Xiangling Fu

TL;DR

The paper highlights the effectiveness of tokenized sEMG representations as a compact, generalizable, and physiologically meaningful feature space for applications in rehabilitation, human-machine interaction, and motor function analysis.

Abstract

Surface electromyography (sEMG) signals exhibit substantial inter-subject variability and are highly susceptible to noise, posing challenges for robust and interpretable decoding. To address these limitations, we propose a discrete representation of sEMG signals based on a physiology-informed tokenization framework. The method employs a sliding window aligned with the minimal muscle contraction cycle to isolate individual muscle activation events. From each window, ten time-frequency features, including root mean square (RMS) and median frequency (MDF), are extracted, and K-means clustering is applied to group segments into representative muscle-state tokens. We also introduce a large-scale benchmark dataset, ActionEMG-43, comprising 43 diverse actions and sEMG recordings from 16 major muscle groups across the body. Based on this dataset, we conduct extensive evaluations to assess the inter-subject consistency, representation capacity, and interpretability of the proposed sEMG tokens. Our results show that the token representation exhibits high inter-subject consistency (Cohen's Kappa = 0.82 ± 0.09), indicating that the learned tokens capture consistent and subject-independent muscle activation patterns. In action recognition tasks, models using sEMG tokens achieve Top-1 accuracies of 75.5% with ViT and 67.9% with SVM, outperforming raw-signal baselines (72.8% and 64.4%, respectively), despite a 96% reduction in input dimensionality. In movement quality assessment, the tokens intuitively reveal patterns of muscle underactivation and compensatory activation, offering interpretable insights into neuromuscular control. Together, these findings highlight the effectiveness of tokenized sEMG representations as a compact, generalizable, and physiologically meaningful feature space for applications in rehabilitation, human-machine interaction, and motor function analysis.
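The tokenization pipeline described in the abstract (sliding window, per-window features, K-means codebook, nearest-centroid assignment) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes only two of the ten features (RMS and MDF), a plain Lloyd's K-means with z-score standardization, and hypothetical names and parameters (`sliding_windows`, `window_features`, `fit_codebook`, `tokenize`, `win`, `hop`, `fs`); the window length that matches the minimal muscle contraction cycle is not given in this excerpt and must be supplied by the caller.

```python
import numpy as np

def sliding_windows(signal, win, hop):
    """Split a 1-D signal into windows of `win` samples, advancing `hop` samples."""
    starts = range(0, len(signal) - win + 1, hop)
    return np.stack([signal[s:s + win] for s in starts])

def window_features(windows, fs):
    """Return an (n_windows, 2) array of [RMS, median frequency in Hz].

    Assumption: only 2 of the paper's 10 time-frequency features are shown.
    """
    rms = np.sqrt(np.mean(windows ** 2, axis=1))
    power = np.abs(np.fft.rfft(windows, axis=1)) ** 2
    freqs = np.fft.rfftfreq(windows.shape[1], d=1.0 / fs)
    cum = np.cumsum(power, axis=1)
    # MDF: first frequency bin where cumulative power reaches half the total.
    idx = np.array([np.searchsorted(c, c[-1] / 2.0) for c in cum])
    return np.column_stack([rms, freqs[idx]])

def fit_codebook(features, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means on z-scored features (standardization is an assumption).

    Returns (mean, std, centroids); the centroids act as the token codebook.
    """
    rng = np.random.default_rng(seed)
    mean = features.mean(axis=0)
    std = features.std(axis=0) + 1e-12  # guard against constant features
    z = (features - mean) / std
    centroids = z[rng.choice(len(z), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(z[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # an empty cluster keeps its old centroid
                centroids[j] = z[labels == j].mean(axis=0)
    return mean, std, centroids

def tokenize(features, codebook):
    """Map each window's feature vector to the index of its nearest centroid."""
    mean, std, centroids = codebook
    z = (features - mean) / std
    dists = np.linalg.norm(z[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)
```

In use, the codebook would be fitted once on training windows with `fit_codebook(window_features(sliding_windows(x, win, hop), fs), k)`, and new recordings would then be passed through `tokenize` with that fixed codebook, so each muscle channel becomes a sequence of integer tokens.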

Paper Structure (21 sections, 8 equations, 10 figures, 6 tables)


Figures (10)

  • Figure 1: Overview of the sEMG tokenization process. sEMG signals are segmented using a sliding window, and features of all segments are clustered via K-means to generate a codebook of representative muscle activation states. New sEMG data are mapped to discrete tokens by assigning each segment to the nearest cluster centroid in the pre-established codebook, resulting in a symbolic representation that enhances inter-subject consistency, compactness and interpretability for downstream tasks.
  • Figure 2: Muscle groups for sEMG signal collection in ActionEMG-43. sEMG signals were recorded from 16 major muscle groups across the upper limbs, lower limbs, and trunk using Delsys Trigno wireless sensors. This comprehensive coverage enables detailed analysis of whole-body motor coordination and muscle activation patterns across a wide range of actions.
  • Figure 3: Trends of SSE and PNMI with respect to the number of clusters $K$ (ranging from 2 to 25). Each curve shows the average over five-fold cross-validation, and the shaded area represents the standard deviation across folds.
  • Figure 4: SVM-based human action recognition model. Time-frequency features from raw sEMG signals or statistical descriptors from sEMG tokens are flattened into one-dimensional vectors and used as input for classification.
  • Figure 5: ViT-based human action recognition model. For the 16 muscles, the input sequence is either the raw sEMG signals or the corresponding sEMG token sequence. The sequence is divided into several temporal patches. Each patch is passed through a linear projection and a positional embedding layer, and a class token is prepended to the sequence. The resulting sequence is then processed by the transformer encoder. Raw sEMG signals are smoothed using a first-order low-pass Butterworth filter before segmentation.
  • ...and 5 more figures
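
The Figure 4 caption mentions that statistical descriptors from sEMG tokens are flattened into a one-dimensional vector for the SVM. The excerpt does not say which descriptors are used, so the sketch below assumes one plausible choice: a per-muscle normalized token-frequency histogram. The function name `token_histogram_features` and its parameters are hypothetical.

```python
import numpy as np

def token_histogram_features(tokens, n_tokens):
    """Flatten per-muscle token frequencies into a single 1-D feature vector.

    tokens   : integer array of shape (n_muscles, n_windows), values in [0, n_tokens)
    n_tokens : codebook size K
    Assumption: a frequency histogram stands in for the paper's descriptors.
    """
    tokens = np.asarray(tokens)
    # Count how often each token occurs for each muscle.
    counts = np.stack([np.bincount(row, minlength=n_tokens) for row in tokens])
    # Normalize by sequence length so recordings of different duration are comparable.
    freqs = counts / tokens.shape[1]
    return freqs.ravel()
```

For 16 muscles and a codebook of size K, this yields a vector of length 16·K per recording, which could be passed to any standard SVM implementation.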