TinyMyo: a Tiny Foundation Model for Flexible EMG Signal Processing at the Edge
Matteo Fasulo, Giusy Spacone, Thorir Mar Ingolfsson, Yawei Li, Luca Benini, Andrea Cossettini
TL;DR
This work tackles the generalization and edge-deployability gap in EMG processing by introducing TinyMyo, a compact 3.6M-parameter Transformer encoder pre-trained with self-supervised masked reconstruction on diverse EMG datasets. The model supports multiple downstream tasks (gesture classification, kinematic regression, and speech-related tasks) through lightweight task-specific heads, achieving state-of-the-art or competitive results on NinaPro DB5, EPN-612, UCI-EMG, NinaPro DB8, and the Gaddy Silent Speech Dataset. Importantly, TinyMyo is demonstrated on an ultra-low-power GAP9 MCU with an average power envelope of $36.45\,\text{mW}$ and 12.2 s latency, highlighting practical edge deployment. The authors also open-source the pre-trained backbone and downstream architectures to accelerate future EMG research and standardize a foundation for diverse sensing configurations.
Abstract
Surface electromyography (EMG) is a non-invasive sensing modality used in several domains, including biomechanics, rehabilitation, prosthetic control, and emerging human-machine interaction paradigms. Despite decades of use, significant challenges remain in achieving robust generalization across subjects, recording systems, and acquisition protocols. To tackle these challenges, foundation models (FMs) are gaining traction for end-to-end applications based on EMG signals. Yet, existing EMG FMs remain limited to single downstream tasks and lack deployability on embedded platforms. In this work, we present TinyMyo, a lightweight FM based on a Transformer encoder architecture. The model is pre-trained in a self-supervised manner on publicly available datasets and achieves high reconstruction fidelity with only 3.6M parameters. With minimal task-specific head adaptations, the same backbone is used to tackle multiple downstream tasks, leveraging datasets acquired from diverse sensing locations and hardware platforms. We demonstrate generalization across hand gesture classification, hand kinematic regression, and speech production and recognition, with performance comparable to or surpassing the state of the art (SoA) and a model size below 5M parameters. We achieve SoA results compared to previous FM-based works on the NinaPro DB5 ($89.4\pm0.16\%$), UCI-EMG ($97.56\pm0.32\%$), and EPN-612 ($96.74\pm0.09\%$) datasets. We report, to the best of our knowledge, the first deployment of an EMG FM on an ultra-low-power microcontroller (GAP9), achieving an average power envelope of 36.45 mW. By open-sourcing the pre-trained backbone and the downstream task architectures (https://github.com/pulp-bio/BioFoundation), we aim to provide a flexible resource that can accelerate future research and serve as a common foundation for the EMG community.
