Resource-Efficient Gesture Recognition through Convexified Attention

Daniel Schwartz, Dario Salvucci, Yusuf Osmanlioglu, Richard Vallett, Genevieve Dion, Ali Shokoufandeh

TL;DR

The paper addresses gesture recognition for resource-constrained wearable textiles by introducing a convexified attention mechanism embedded in a Convexified Neural Network. It replaces non-convex softmax attention with a Euclidean projection onto the probability simplex and uses convex losses together with nuclear-norm regularization, enabling end-to-end convex optimization for microcontroller-scale models. The approach achieves 100% accuracy for tap and swipe gestures with only 120–360 parameters, runs with sub-millisecond latency on Arduino hardware, and needs less than 7 KB of storage, all while providing global convergence guarantees. Although demonstrated only on a single user's textile sensor data, the work shows strong potential for on-device, energy-efficient human–machine interfaces in wearables and sets the stage for broader validation and extension to richer gesture vocabularies.

Abstract

Wearable e-textile interfaces require gesture recognition capabilities but face severe constraints in power consumption, computational capacity, and form factor that make traditional deep learning impractical. While lightweight architectures like MobileNet improve efficiency, they still demand thousands of parameters, limiting deployment on textile-integrated platforms. We introduce a convexified attention mechanism for wearable applications that dynamically weights features while preserving convexity through nonexpansive simplex projection and convex loss functions. Unlike conventional attention mechanisms using non-convex softmax operations, our approach employs Euclidean projection onto the probability simplex combined with multi-class hinge loss, ensuring global convergence guarantees. Implemented on a textile-based capacitive sensor with four connection points, our approach achieves 100.00% accuracy on tap gestures and 100.00% on swipe gestures, consistent across 10-fold cross-validation and held-out test evaluation, while requiring only 120–360 parameters, a 97% reduction compared to conventional approaches. With sub-millisecond inference times (290–296 μs) and minimal storage requirements (<7 KB), our method enables gesture interfaces directly within e-textiles without external processing. Our evaluation, conducted in controlled laboratory conditions with a single-user dataset, demonstrates feasibility for basic gesture interactions. Real-world deployment would require validation across multiple users, environmental conditions, and more complex gesture vocabularies. These results demonstrate how convex optimization can enable efficient on-device machine learning for textile interfaces.
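
The abstract's central operation, Euclidean projection onto the probability simplex, can be sketched in a few lines. The version below is the widely used sort-based method, written in plain Python as an illustration; it is an assumption that the paper's own algorithm follows the same steps, and the function name `project_to_simplex` is hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}.

    Sort-based sketch: find the shift theta such that clipping
    (v - theta) at zero yields entries that sum to one.
    """
    if not v:
        raise ValueError("v must be non-empty")
    u = sorted(v, reverse=True)  # entries in decreasing order
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        # The condition holds for k = 1 and then for a prefix of k values;
        # the last k that satisfies it fixes the shift.
        if u_k - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


if __name__ == "__main__":
    raw_scores = [0.9, -0.4, 0.3, 1.2]  # illustrative attention scores
    weights = project_to_simplex(raw_scores)
    print(weights, sum(weights))
```

Unlike softmax, the projection can assign exactly zero weight to low-scoring features, and it is nonexpansive: the distance between two projected vectors never exceeds the distance between the inputs.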

Paper Structure

This paper contains 37 sections, 5 theorems, 37 equations, 4 figures, 4 tables, and 3 algorithms.

Key Result

Theorem 1

Let $\Delta=\{x\in\mathbb{R}^P:\; x\ge 0,\ \mathbf{1}^\top x=1\}$ and let $\Pi_\Delta$ denote the Euclidean projection onto $\Delta$. Then:

Figures (4)

  • Figure 1: System architecture showing the complete pipeline from raw capacitive sensor input through classification output. Raw 4-channel electrode signals are first transformed via Random Fourier Features (RFF) to approximate RBF kernel mappings, enabling nonlinear pattern recognition within a linear framework. The convexified attention mechanism then computes class-specific weights via Euclidean projection onto the probability simplex (the paper's simplex projection algorithm), dynamically emphasizing temporally relevant patterns for each gesture class. Nuclear norm regularization promotes low-rank weight matrices, reducing effective parameters. The entire pipeline requires only 120–360 parameters and executes in under 300 μs on an Arduino Nano 33 BLE.
  • Figure 2: Capacitance signal patterns for tap and swipe gestures across four electrode channels.
  • Figure 3: CTS sensor configurations at different scales: 4"×4", 8"×4", and 8"×8" form factors.
  • Figure 4: Parameter efficiency comparison across models. Our Convexified Attention approach (leftmost) requires 30–100× fewer parameters than conventional efficiency-focused architectures, enabling deployment on ultra-low-power wearable platforms.
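
The Random Fourier Features step named in the Figure 1 caption can be illustrated with a small plain-Python sketch. It uses the standard construction for the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2); the helper name `make_rff`, the feature count, and the value of `gamma` are assumptions chosen for illustration, not values taken from the paper.

```python
import math
import random


def make_rff(input_dim, num_features, gamma, seed=0):
    """Return a map x -> z(x) with dot(z(x), z(y)) ~ exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    std = math.sqrt(2.0 * gamma)  # spectral density of this RBF kernel is N(0, 2*gamma*I)
    weights = [[rng.gauss(0.0, std) for _ in range(input_dim)]
               for _ in range(num_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    norm = math.sqrt(2.0 / num_features)

    def transform(x):
        # One cosine feature per random direction and phase offset.
        return [norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return transform


if __name__ == "__main__":
    phi = make_rff(input_dim=4, num_features=2000, gamma=0.5, seed=1)
    x = [0.1, 0.4, -0.2, 0.3]  # hypothetical 4-channel sensor reading
    y = [0.0, 0.5, 0.1, 0.2]
    approx = sum(a * b for a, b in zip(phi(x), phi(y)))
    exact = math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))
    print(approx, exact)
```

Because the random weights and offsets are fixed after sampling, the downstream model stays linear in its trainable parameters, which is what lets a nonlinear kernel fit inside a convex training problem.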

Theorems & Definitions (10)

  • Theorem 1: Projection onto the simplex: nonexpansiveness and convex distance
  • proof
  • Theorem 2: Convexity of multi-class hinge loss
  • proof
  • Theorem 3: Convexity of squared loss
  • proof
  • Theorem 4: Convexity of Nuclear Norm
  • proof
  • Theorem 5: End-to-end convexity under fixed or convexly-lifted attention
  • proof
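
As a rough illustration of the loss named in Theorem 2, the sketch below implements one common multi-class hinge loss (the Crammer–Singer form) and spot-checks the midpoint inequality that convexity implies. The exact formulation used in the paper is not shown in this summary, so this form, the function name `multiclass_hinge`, and the example scores are assumptions.

```python
def multiclass_hinge(scores, label, margin=1.0):
    """Crammer-Singer hinge: max(0, margin + best rival score - true-class score)."""
    if len(scores) < 2:
        raise ValueError("need at least two classes")
    rival = max(s for j, s in enumerate(scores) if j != label)
    return max(0.0, margin + rival - scores[label])


if __name__ == "__main__":
    a = [2.0, 0.5, 0.1]  # illustrative class scores, true class 0
    b = [0.2, 0.9, 0.4]
    mid = [(s + t) / 2.0 for s, t in zip(a, b)]
    # Convexity in the scores: loss at the midpoint is at most the average loss.
    lhs = multiclass_hinge(mid, 0)
    rhs = 0.5 * multiclass_hinge(a, 0) + 0.5 * multiclass_hinge(b, 0)
    print(lhs, rhs, lhs <= rhs)
```

The loss is a maximum of functions that are affine in the scores, so it stays convex when the scores are themselves a linear function of the trainable weights.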