VAE-Based Synthetic EMG Generation with Mix-Consistency Loss for Recognizing Unseen Motion Combinations
Itsuki Yazawa, Akira Furui
TL;DR
This work tackles EMG-based motion classification for unseen combined motions, where linear input-space mixing fails due to nonlinear neuromuscular phenomena. It introduces a variational autoencoder with a mix-consistency loss ($L_{\text{mix}}$) to structure a latent space in which combined motions lie between their basic components, enabling synthetic data generation via latent-space convex combinations and subsequent classifier training. The method employs a two-stage training framework and Dirichlet-distributed mixing with latent dimension $Z=6$, achieving about $78\%$ overall accuracy on unseen combinations, outperforming an input-space Mixup baseline (~$44\%$) and approaching fully-supervised performance (~$85\%$). The approach reduces data-collection burdens for multi-DOF EMG control and lays groundwork for cross-subject transfer learning, moving toward calibration-free prosthetic interfaces; however, a remaining gap to fully supervised performance and the need for subject-specific data in VAE training highlight areas for future work, including transfer learning across users.
Abstract
Electromyogram (EMG)-based motion classification using machine learning has been widely employed in applications such as prosthesis control. While previous studies have explored generating synthetic patterns of combined motions to reduce training data requirements, these methods assume that combined motions can be represented as linear combinations of basic motions. However, this assumption often fails due to complex neuromuscular phenomena such as muscle co-contraction, resulting in low-fidelity synthetic signals and degraded classification performance. To address this limitation, we propose a novel method that learns to synthesize combined motion patterns in a structured latent space. Specifically, we employ a variational autoencoder (VAE) to encode EMG signals into a low-dimensional representation and introduce a mix-consistency loss that structures the latent space such that combined motions are embedded between their constituent basic motions. Synthetic patterns are then generated within this structured latent space and used to train classifiers for recognizing unseen combined motions. We validated our approach through upper-limb motion classification experiments with eight healthy participants. The results demonstrate that our method outperforms input-space synthesis approaches, achieving approximately 30% improvement in accuracy.
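
The latent-space synthesis step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It only shows how latent codes of two basic motions might be convex-combined with Dirichlet-distributed weights to obtain codes for a combined motion. The function name `mix_latents`, the concentration parameter `alpha`, the motion names, and the random stand-in latent codes are all assumptions made for illustration; in practice the codes would come from a trained VAE encoder, and a trained decoder would map the mixed codes back to EMG patterns.

```python
import numpy as np


def mix_latents(z_a, z_b, alpha=1.0, rng=None):
    """Convex-combine paired latent codes with Dirichlet-distributed weights.

    z_a, z_b : arrays of shape (n, Z), latent codes of two basic motions.
    alpha    : hypothetical Dirichlet concentration (not given in the text).
    Returns the mixed codes (n, Z) and the sampled weights (n, 2).
    """
    rng = np.random.default_rng() if rng is None else rng
    z_a = np.asarray(z_a, dtype=float)
    z_b = np.asarray(z_b, dtype=float)
    # One weight pair per sample; each row is non-negative and sums to 1.
    w = rng.dirichlet([alpha, alpha], size=z_a.shape[0])
    z_mix = w[:, :1] * z_a + w[:, 1:] * z_b
    return z_mix, w


# Stand-in latent codes (random placeholders, not real encoder outputs).
rng = np.random.default_rng(0)
Z = 6  # latent dimension reported in the TL;DR
z_wrist = rng.normal(loc=-1.0, size=(4, Z))  # hypothetical basic motion A
z_grasp = rng.normal(loc=+1.0, size=(4, Z))  # hypothetical basic motion B

z_mix, w = mix_latents(z_wrist, z_grasp, rng=rng)
```

Because the weights are non-negative and sum to one, each mixed code lies on the line segment between its two source codes, which is the region a latent space structured by the mix-consistency loss is meant to reserve for the combined motion.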
