Resource-Efficient Gesture Recognition through Convexified Attention
Daniel Schwartz, Dario Salvucci, Yusuf Osmanlioglu, Richard Vallett, Genevieve Dion, Ali Shokoufandeh
TL;DR
The paper addresses gesture recognition for resource-constrained wearable textiles by introducing a convexified attention mechanism embedded in a Convexified Neural Network. It replaces non-convex softmax attention with a Euclidean projection onto the probability simplex and uses convex losses together with nuclear-norm regularization, enabling end-to-end convex optimization with global convergence guarantees. The approach achieves 100% accuracy for tap and swipe gestures with only 120–360 parameters, runs with sub-millisecond inference latency on Arduino hardware, and requires under 7 KB of storage. Although it is demonstrated only on a single user's textile sensor data, the work shows potential for on-device, energy-efficient human–machine interfaces in wearables and sets the stage for broader validation and extension to richer gesture vocabularies.
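To make the simplex projection concrete, the following is a minimal sketch of the standard sort-based Euclidean projection onto the probability simplex. It is an illustration only: the function name `project_to_simplex` is hypothetical, and the summary above does not say which projection algorithm the authors run on the microcontroller.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1}.

    Sort-based algorithm, O(n log n). Illustrative sketch; not taken
    from the paper's implementation.
    """
    if not v:
        raise ValueError("input must be non-empty")
    u = sorted(v, reverse=True)          # entries in decreasing order
    cumsum = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumsum += u_k
        t = (cumsum - 1.0) / k           # candidate shift for the top-k entries
        if u_k - t > 0:                  # holds for a prefix of k; keep the last hit
            theta = t
    # Shift every entry by theta and clip negatives to zero.
    return [max(x - theta, 0.0) for x in v]


if __name__ == "__main__":
    weights = project_to_simplex([2.0, 0.0, -1.0])
    print(weights, sum(weights))
```

Unlike softmax, which always returns strictly positive weights, this projection can return exact zeros, so some features may receive no attention weight at all.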
Abstract
Wearable e-textile interfaces require gesture recognition capabilities but face severe constraints in power consumption, computational capacity, and form factor that make traditional deep learning impractical. While lightweight architectures like MobileNet improve efficiency, they still demand thousands of parameters, limiting deployment on textile-integrated platforms. We introduce a convexified attention mechanism for wearable applications that dynamically weights features while preserving convexity through nonexpansive simplex projection and convex loss functions. Unlike conventional attention mechanisms using non-convex softmax operations, our approach employs Euclidean projection onto the probability simplex combined with multi-class hinge loss, ensuring global convergence guarantees. Implemented on a textile-based capacitive sensor with four connection points, our approach achieves 100.00% accuracy on tap gestures and 100.00% on swipe gestures, consistent across 10-fold cross-validation and held-out test evaluation, while requiring only 120–360 parameters, a 97% reduction compared to conventional approaches. With sub-millisecond inference times (290–296 μs) and minimal storage requirements (<7 KB), our method enables gesture interfaces directly within e-textiles without external processing. Our evaluation, conducted in controlled laboratory conditions with a single-user dataset, demonstrates feasibility for basic gesture interactions. Real-world deployment would require validation across multiple users, environmental conditions, and more complex gesture vocabularies. These results demonstrate how convex optimization can enable efficient on-device machine learning for textile interfaces.
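As an illustration of the multi-class hinge loss named in the abstract, here is a minimal sketch of one common variant (the Crammer-Singer form). The abstract does not state which variant or margin the authors use, so the function name `multiclass_hinge_loss` and the default `margin=1.0` are assumptions made for this example.

```python
def multiclass_hinge_loss(scores, label, margin=1.0):
    """Crammer-Singer multi-class hinge loss for one example.

    scores: list of per-class scores; label: index of the true class.
    Assumed variant and margin; the abstract does not specify them.
    """
    if len(scores) < 2:
        raise ValueError("need at least two classes")
    correct = scores[label]
    # Highest score among the incorrect classes.
    worst_rival = max(s for j, s in enumerate(scores) if j != label)
    # Zero loss once the true class beats every rival by at least `margin`.
    return max(0.0, margin + worst_rival - correct)


if __name__ == "__main__":
    print(multiclass_hinge_loss([2.0, 0.5, -1.0], label=0))  # margin satisfied
    print(multiclass_hinge_loss([0.2, 0.5, -1.0], label=0))  # margin violated
```

This loss is a maximum of affine functions of the scores, so it is convex in the scores, which is the property the abstract relies on for its convergence guarantees.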
