Ring-a-Pose: A Ring for Continuous Hand Pose Tracking

Tianhong Catherine Yu, Guilin Hu, Ruidong Zhang, Hyunchul Lim, Saif Mahmud, Chi-Jung Lee, Ke Li, Devansh Agarwal, Shuyang Nie, Jinseok Oh, François Guimbretière, Cheng Zhang

TL;DR

Ring-a-Pose enables the future of smart rings to track and recognize hand poses using relatively low-power acoustic sensing.

Abstract

We present Ring-a-Pose, a single untethered ring that tracks continuous 3D hand poses. Located in the center of the hand, the ring emits an inaudible acoustic signal that each hand pose reflects differently. Unlike multi-ring or glove systems, Ring-a-Pose is minimally obtrusive on the hand. Unlike wrist-worn systems, it is not affected by clothing that may cover the sensor. In a series of three user studies with a total of 30 participants, we evaluate Ring-a-Pose's performance on pose tracking and micro-finger gesture recognition. Without collecting any training data from a user, Ring-a-Pose tracks continuous hand poses with a joint error of 14.1 mm. The joint error decreases to 10.3 mm for fine-tuned user-dependent models. Ring-a-Pose recognizes 7-class micro-gestures with 90.60% and 99.27% accuracy for user-independent and user-dependent models, respectively. Furthermore, the ring exhibits promising performance when worn on any finger. Ring-a-Pose enables the future of smart rings to track and recognize hand poses using relatively low-power acoustic sensing.

Paper Structure (64 sections, 4 equations, 13 figures, 6 tables)

Figures (13)

  • Figure 1: (a) The untethered Ring-a-Pose prototype used in the user studies. (b) The Ring-a-Pose prototype electronics without the case. (c) A physical mockup of future Ring-a-Pose. We replaced the battery with an arc battery and removed the MCU. (d) Details of Ring-a-Pose's PCBs.
  • Figure 2: Echo Frame Calculation. The cross-correlation (orange line) between the transmitted FMCW signal (blue) and the received signal (green) is mapped from the time domain to the distance domain as an echo frame. Ring-a-Pose crops 54 pixels, equivalent to 18.52 cm, of the echo frame to analyze hand poses. The black lines in the 3D visualization overlaid on the hand mark 3 cm distance increments. The color of each sphere denotes the summation of reflection strengths from that radius. Note that the spectrograms on the left are for visualization only; the signals are captured and processed in the time domain.
  • Figure 3: Original and Differential Echo Profiles for a sequence of hand poses. The black lines in the 3D echo frame visualization overlaid on the hand mark 3 cm distance increments. The color of each sphere denotes the summation of reflection strengths at travel path lengths corresponding to the radius of the sphere.
  • Figure 4: Encoder-decoder Architecture for Hand Pose Tracking and Gesture Classification. Example visualized inputs have the differential echo profile channel in the front and the original echo profile channel in the back.
  • Figure 5: The twenty terminal hand poses evaluated in our hand pose tracking study. Poses are labeled blue or green based on whether the hand geometries occlude the sensor when the ring is worn on the middle finger. The four rows show (1) reference images displayed during the user study, (2) example MediaPipe ground truths of a participant, visualized using MANO, (3) example predictions using the fine-tuned model, and (4) example predictions using the user-independent model. (3) and (4) share the same timestamps as (2).
  • ...and 8 more figures
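
The echo frame and echo profile described in the Figure 2 and Figure 3 captions can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The sampling rate (50 kHz), the speed of sound (343 m/s), and the chirp band used in the usage note are assumptions chosen for illustration; together, the first two happen to reproduce the caption's figure of 54 pixels for about 18.52 cm when the round-trip path is halved. Computing the differential profile as a frame-to-frame difference is likewise an assumption, and all function names are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND_M_S = 343.0  # assumption: not stated in the text above
SAMPLE_RATE_HZ = 50_000     # assumption: not stated in the text above
CROP_PIXELS = 54            # from the Figure 2 caption


def make_chirp(n_samples, f_start_hz, f_end_hz, sample_rate_hz=SAMPLE_RATE_HZ):
    """Linear frequency sweep (one FMCW period); the band is a free parameter."""
    t = np.arange(n_samples) / sample_rate_hz
    duration_s = n_samples / sample_rate_hz
    sweep_rate = (f_end_hz - f_start_hz) / duration_s
    return np.sin(2.0 * np.pi * (f_start_hz * t + 0.5 * sweep_rate * t ** 2))


def echo_frame(tx, rx, crop=CROP_PIXELS):
    """Cross-correlate received and transmitted signals, keep lags 0..crop-1.

    Each lag ("pixel") is one sample of round-trip delay, so the index of a
    peak is proportional to the length of the reflection path.
    """
    corr = np.correlate(rx, tx, mode="full")
    zero_lag = len(tx) - 1  # index of lag 0 in the "full" output
    return np.abs(corr[zero_lag:zero_lag + crop])


def pixels_to_distance_cm(pixels, sample_rate_hz=SAMPLE_RATE_HZ,
                          speed_m_s=SPEED_OF_SOUND_M_S):
    """Convert a lag in samples to a one-way distance in centimetres."""
    round_trip_m = pixels * speed_m_s / sample_rate_hz
    return 100.0 * round_trip_m / 2.0


def echo_profile(frames):
    """Stack echo frames over time: rows are distance, columns are time."""
    return np.stack(frames, axis=1)


def differential_profile(profile):
    """Frame-to-frame difference (assumed definition): static reflections
    cancel, so changes caused by hand motion stand out."""
    return np.diff(profile, axis=1)
```

As a usage example, a synthetic echo delayed by 20 samples, built as `rx = np.zeros(600); rx[20:] = 0.5 * tx[:-20]` from `tx = make_chirp(600, 18_000.0, 21_000.0)`, produces an echo frame whose strongest pixel is index 20. Stacking identical frames gives a differential profile of all zeros, which is the expected behavior for a hand that is not moving.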