emg2qwerty: A Large Dataset with Baselines for Touch Typing using Surface Electromyography
Viswanath Sivakumar, Jeffrey Seely, Alan Du, Sean R Bittner, Adam Berenzweig, Anuoluwapo Bolarinwa, Alexandre Gramfort, Michael I Mandel
TL;DR
emg2qwerty presents the largest public wrist sEMG dataset for touch typing on a QWERTY keyboard, with ground-truth keystrokes across 108 users and 1,135 sessions, enabling the study of cross-subject and cross-session generalization. The authors adapt ASR-style baselines to transduce sEMG signals into keystrokes: a Time-Depth Separable ConvNet trained with CTC loss, augmented with SpecAugment, rotation-invariance modules, and a 6-gram language model. Personalization yields strong gains, reducing character error rate (CER) from over 50% to as low as 3.16% for some users. The work quantifies the domain shift challenges in sEMG typing and demonstrates that personalization and language modeling are necessary for practical usability, while providing reproducible baselines and code for community use. The dataset and baselines lay groundwork for private, high-bandwidth neuromotor interfaces in AR/VR, and invite future work in domain adaptation, self-supervised learning, and wearable-friendly edge inference. The resource lowers barriers to progress in both the ML and neuroscience communities by offering a scalable benchmark with well-defined tasks, metrics, and open-source tooling.
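Since the headline results are reported as CER, a minimal sketch of how such a metric is commonly computed may help: the Levenshtein edit distance between the predicted and reference keystroke sequences, normalized by the reference length and expressed as a percentage. The function names `edit_distance` and `character_error_rate` are illustrative assumptions, not identifiers from the emg2qwerty codebase, and the released evaluation code may differ in its details.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning ref into hyp."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # drop r from ref
            insertion = curr[j - 1] + 1  # add h to ref
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER in percent: edit distance normalized by reference length.
    (Hypothetical helper; assumes a non-empty reference.)"""
    if not ref:
        raise ValueError("reference must be non-empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One substituted key out of six typed characters.
    print(character_error_rate("typing", "typimg"))
```

Because the distance is normalized by the reference length, a hypothesis with many spurious insertions can score above 100%.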
Abstract
Surface electromyography (sEMG) non-invasively measures signals generated by muscle activity with sufficient sensitivity to detect individual spinal neurons and richness to identify dozens of gestures and their nuances. Wearable wrist-based sEMG sensors have the potential to offer low-friction, subtle, information-rich, always-available human-computer inputs. To this end, we introduce emg2qwerty, a large-scale dataset of non-invasive electromyographic signals recorded at the wrists while touch typing on a QWERTY keyboard, together with ground-truth annotations and reproducible baselines. With 1,135 sessions spanning 108 users and 346 hours of recording, this is the largest such public dataset to date. These data demonstrate non-trivial but well-defined hierarchical relationships, both in terms of the generative process, from neurons to muscles and muscle combinations, and in terms of domain shift across users and user sessions. Applying standard modeling techniques from the closely related field of Automatic Speech Recognition (ASR), we show strong baseline performance on predicting key-presses using sEMG signals alone. We believe the richness of this task and dataset will facilitate progress in several problems of interest to both the machine learning and neuroscientific communities. Dataset and code can be accessed at https://github.com/facebookresearch/emg2qwerty.
