
CaptAinGlove: Capacitive and Inertial Fusion-Based Glove for Real-Time on Edge Hand Gesture Recognition for Drone Control

Hymalai Bello, Sungho Suh, Daniel Geißler, Lala Ray, Bo Zhou, Paul Lukowicz

TL;DR

CaptAinGlove addresses privacy concerns and power constraints in gesture-based drone control by fusing textile capacitive sensing with a wrist IMU in a two-stage edge ML pipeline. It uses lightweight CNNs and TensorFlow Lite for Microcontrollers for real-time on-device inference, reaching an F1-score of 80% offline across nine classes and 67% in real-time experiments with a single user, with a memory footprint of about 2 MB and total power of about 1.15 W. The work presents a wearable, privacy-preserving alternative to camera-based systems and shows that hierarchical fusion can reduce power consumption, which is relevant to industrial and consumer drone applications. It lays the groundwork for multi-user validation, latency optimization, and extension to other hand-based HRI tasks.

Abstract

We present CaptAinGlove, a textile-based, low-power (1.15 W), privacy-conscious, real-time on-the-edge (RTE) glove with a tiny memory footprint (2 MB), designed to recognize hand gestures used for drone control. We employ lightweight convolutional neural networks as the backbone models and hierarchical multimodal fusion to reduce power consumption and improve accuracy. The system yields an F1-score of 80% in the offline evaluation of nine classes: eight hand gesture commands and null activity. In the RTE evaluation, we obtained an F1-score of 67% (one user).
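
The power argument for hierarchical fusion can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which modality runs first or how the stages are connected, so the sketch assumes a cheap always-on gate that decides whether a costlier classifier runs at all. The names `hierarchical_predict`, `toy_gate`, and `toy_classifier` are hypothetical stand-ins for the real models.

```python
from typing import Callable, Sequence, Tuple

NULL_CLASS = 0  # the null activity is class 0 in the paper's label list


def hierarchical_predict(
    window: Sequence[float],
    gate: Callable[[Sequence[float]], bool],
    classifier: Callable[[Sequence[float]], int],
) -> Tuple[int, bool]:
    """Return (label, classifier_was_run) for one sensor window.

    Assumption: the gate is a cheap first stage and the classifier is a
    costlier second stage that is skipped when the gate reports no activity.
    """
    if not gate(window):
        return NULL_CLASS, False
    return classifier(window), True


# Toy stand-ins for the two stages (hypothetical, for illustration only).
def toy_gate(window: Sequence[float]) -> bool:
    # Flag activity when the signal range within the window is large.
    return max(window) - min(window) > 0.5


def toy_classifier(window: Sequence[float]) -> int:
    # Rising signal -> class 1, falling signal -> class 2.
    return 1 if window[-1] > window[0] else 2


windows = [[0.0, 0.1, 0.0], [0.0, 0.4, 1.0], [1.0, 0.5, 0.0]]
results = [hierarchical_predict(w, toy_gate, toy_classifier) for w in windows]
classifier_runs = sum(ran for _, ran in results)
```

In this toy run the second stage is skipped for the idle window, which is the mechanism by which a hierarchical design can save power when null activity dominates.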

Paper Structure (6 sections, 3 figures, 1 table)

Figures (3)

  • Figure 1: CaptAinGlove prototype, showing the capacitive channels and IMU positions on the sports glove (A). Hardware block diagram showing the sensing connections to the main board (Portenta H7) and PC (B). Hand gestures for drone control dictionary [Kiselov_2021] (C).
  • Figure 2: RTE implementation for hand gesture recognition (A). Results of the offline capacitive model, with classes Null (0), Up (1), Down (2), Back (3), Forward (4), Land (5), Stop (6), Left (7), Right (8); F1-score = 80% (B). Real-time on-the-edge results of the capacitive model; F1-score = 67% (C).
  • Figure 3: Example of smoothing temporal windows for continuous recognition [ContinuosRecognition].
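
The idea behind smoothing temporal windows for continuous recognition (Figure 3) can be sketched as a sliding majority vote over per-window predictions. The exact smoothing scheme used in the paper is not given here, so the vote, the window length of 3, and the tie-breaking rule are assumptions; `smooth_predictions` is a hypothetical helper. The label map follows the class list in the Figure 2 caption.

```python
from collections import Counter, deque
from typing import Iterable, List

# Class indices as listed in the Figure 2 caption.
GESTURES = {
    0: "Null", 1: "Up", 2: "Down", 3: "Back", 4: "Forward",
    5: "Land", 6: "Stop", 7: "Left", 8: "Right",
}


def smooth_predictions(labels: Iterable[int], window: int = 3) -> List[int]:
    """Replace each prediction with the majority label of the last `window` ones.

    Assumption: ties are broken in favor of the most recent tied label.
    """
    recent = deque(maxlen=window)
    smoothed = []
    for label in labels:
        recent.append(label)
        counts = Counter(recent)
        top = max(counts.values())
        # Walk backwards so the most recent label wins a tie.
        winner = next(l for l in reversed(recent) if counts[l] == top)
        smoothed.append(winner)
    return smoothed


raw = [0, 0, 1, 1, 5, 1, 1, 0, 0]  # one spurious "Land" inside an "Up" gesture
smoothed = smooth_predictions(raw, window=3)
names = [GESTURES[i] for i in smoothed]
```

With these inputs the isolated "Land" prediction is voted out, at the cost of a delay of about one window before a gesture's start and end are reflected in the output. A longer window suppresses more noise but adds more latency.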