Typing Reinvented: Towards Hands-Free Input via sEMG

Kunwoo Lee, Dhivya Sreedhar, Pushkar Saraf, Chaeeun Lee, Kateryna Shapovalenko

TL;DR

This work investigates hands-free typing from non-invasive surface electromyography (sEMG) signals for spatial computing and VR. It systematically replaces a convolutional baseline with attention-based encoders (Transformer and Conformer) trained with connectionist temporal classification (CTC) loss, evaluated under fully causal, online conditions, and augmented by language-model (LM) corrections. The Conformer achieves the best generic performance, with an online character error rate of $\mathrm{CER} = 20.34\%$ and an offline personalized $\mathrm{CER} = 10.10\%$, while LM-assisted decoding improves personalization (e.g., GPT-4 Turbo sentence-level correction yields $\mathrm{CER}_{\text{personalized, offline}} = 22.71\%$). The results demonstrate the feasibility of accurate, real-time, muscle-driven typing and point to latency reduction as the primary path toward deployment in wearable and spatial interfaces.
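To make the CTC step concrete, the following is a minimal sketch of how per-frame CTC outputs can be turned into text. It assumes greedy (best-path) decoding, which this summary does not confirm as the paper's method; the names `ctc_greedy_decode`, `BLANK`, and the toy vocabulary are hypothetical and chosen only for illustration.

```python
from itertools import groupby

# Assumption: index 0 is the CTC blank symbol (a common convention,
# not something stated in the summary above).
BLANK = 0


def ctc_greedy_decode(frame_ids, id_to_char, blank=BLANK):
    """Collapse a sequence of per-frame argmax label ids into a string.

    Standard CTC collapsing rule: first merge consecutive repeats,
    then drop blank symbols.
    """
    # Merge runs of identical labels (e.g. 1, 1, 1 -> 1).
    collapsed = [label for label, _ in groupby(frame_ids)]
    # Remove blanks and map the remaining ids to characters.
    return "".join(id_to_char[label] for label in collapsed if label != blank)


if __name__ == "__main__":
    # Hypothetical two-key vocabulary, for illustration only.
    vocab = {1: "h", 2: "i"}
    frames = [0, 1, 1, 0, 2, 2, 2, 0]
    print(ctc_greedy_decode(frames, vocab))
```

Because the rule only looks at frames already emitted, it can run incrementally, which fits the causal, online setting described above. A blank between two identical labels is what allows a doubled key press to survive the collapse.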

Abstract

We explore surface electromyography (sEMG) as a non-invasive input modality for mapping muscle activity to keyboard inputs, targeting immersive typing in next-generation human-computer interaction (HCI). This is especially relevant for spatial computing and virtual reality (VR), where traditional keyboards are impractical. Using attention-based architectures, we significantly outperform the existing convolutional baselines, reducing online generic character error rate (CER) from 24.98% to 20.34% and offline personalized CER from 10.86% to 10.10%, while remaining fully causal. We further incorporate a lightweight decoding pipeline with language-model-based correction, demonstrating the feasibility of accurate, real-time muscle-driven text input for future wearable and spatial interfaces.
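For readers unfamiliar with the metric, here is a minimal sketch of how a character error rate can be computed. It assumes the common definition (Levenshtein edit distance divided by reference length, as a percentage); the paper's exact evaluation code may differ, and the function names are hypothetical.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis costs i deletions
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete r
                curr[j - 1] + 1,         # insert h
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate in percent, relative to the reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Relative improvement in percent between two error rates."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    print(f"{cer('hello world', 'helo wrld'):.2f}")
    # CER values reported in the abstract above.
    print(f"{relative_reduction(24.98, 20.34):.1f}")
    print(f"{relative_reduction(10.86, 10.10):.1f}")
```

Applied to the reported numbers, the online generic improvement is a relative reduction of about 18.6%, and the offline personalized improvement is about 7.0%.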

Paper Structure

This paper contains 13 sections, 2 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Dataset overview (emg2qwerty)
  • Figure 2: Inter-channel correlation
  • Figure 3: Log spectrogram of the signal
  • Figure 4: Proposed models diagram
  • Figure 5: Proposed inference mechanism
  • ...and 1 more figures