TA-GNN: Physics Inspired Time-Agnostic Graph Neural Network for Finger Motion Prediction

Tinghui Li, Pamuditha Somarathne, Zhanna Sarsenbayeva, Anusha Withana

TL;DR

TA-GNN tackles finger motion prediction by integrating physics-informed kinematics with a time-agnostic learning framework. The architecture comprises a kinematic feature extractor that estimates angular velocity $\omega$ and angular acceleration $\alpha$, a physics-based encoder that applies a kinematic expansion to predict the displacement $\delta\theta(t)$, and a graph-based decoder that captures inter-joint topology to output the final finger rotations $\hat{\theta}(t)$. The model is trained to predict multiple horizons simultaneously, $t \in \{40, 80, \dots, 400\}$ ms, enabling continuous, horizon-independent inference without retraining for each prediction time. Evaluations on the VRHands dataset and the Re:InterHand dataset demonstrate substantial improvements over baselines and highlight the approach’s potential for sensor-free predictive interactions and enhanced rendering quality in VR.

Abstract

Continuous prediction of finger joint movement using historical joint positions/rotations is vital in a multitude of applications, especially those related to virtual reality, computer graphics, robotics, and rehabilitation. However, finger motions are highly articulated with multiple degrees of freedom, making them significantly harder to model and predict. To address this challenge, we propose a physics-inspired time-agnostic graph neural network (TA-GNN) to accurately predict human finger motions. The proposed encoder comprises a kinematic feature extractor that generates filtered velocity and acceleration and a physics-based encoder that follows linear kinematics. The model is designed to be prediction-time-agnostic so that it can seamlessly provide continuous predictions. The graph-based decoder, which learns the topological motion between finger joints, is designed to address the higher degree of articulation of fingers. We show our model's superior performance in a virtual reality context. This novel approach enhances finger tracking without additional sensors, enabling predictive interactions such as haptic re-targeting and improving predictive rendering quality.
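To make the kinematic idea concrete, the following is a minimal sketch of a purely physics-based extrapolation, not the authors' implementation. It assumes a second-order expansion $\delta\theta(t) = \omega t + \tfrac{1}{2}\alpha t^2$, with $\omega$ and $\alpha$ estimated by backward finite differences over the last three samples. The function name `kinematic_extrapolation`, the 20 ms sampling interval, and the toy data are hypothetical choices made for illustration only.

```python
import numpy as np


def kinematic_extrapolation(theta_hist, t_s_ms, horizons_ms):
    """Extrapolate joint rotations with a second-order kinematic expansion.

    theta_hist  : array of shape (T, J), T >= 3 past rotations (radians) for J joints
    t_s_ms      : sampling interval in milliseconds (assumed uniform)
    horizons_ms : iterable of H prediction horizons in milliseconds
    returns     : array of shape (H, J) with predicted rotations
    """
    theta_hist = np.asarray(theta_hist, dtype=float)
    t_s = t_s_ms / 1000.0

    # Backward finite differences (an assumption of this sketch):
    # first difference for angular velocity, second difference for acceleration.
    omega = (theta_hist[-1] - theta_hist[-2]) / t_s
    alpha = (theta_hist[-1] - 2.0 * theta_hist[-2] + theta_hist[-3]) / t_s**2

    # Horizons as a column vector so they broadcast against the J joints.
    t = np.asarray(horizons_ms, dtype=float)[:, None] / 1000.0

    # delta_theta(t) = omega * t + 0.5 * alpha * t^2, evaluated for every horizon.
    delta_theta = omega * t + 0.5 * alpha * t**2
    return theta_hist[-1] + delta_theta


# Horizons from the TL;DR: 40, 80, ..., 400 ms.
HORIZONS_MS = np.arange(40, 401, 40)

# Toy history: two joints rotating at constant velocity, sampled every 20 ms (hypothetical).
history = np.array([[0.00, 0.10],
                    [0.04, 0.11],
                    [0.08, 0.12]])
predictions = kinematic_extrapolation(history, 20.0, HORIZONS_MS)
```

Because the horizon $t$ enters only as an argument of the expansion, the same function can be evaluated at any prediction time, which is one way to read "time-agnostic". Finite differences are only approximations of the true derivatives, and this sketch has no learned components. TA-GNN, as described above, uses filtered velocity and acceleration and adds a graph-based decoder on top of the kinematic prediction.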

Paper Structure

This paper contains 33 sections, 8 equations, 5 figures, 10 tables.

Figures (5)

  • Figure 1: The joint rotation prediction process. The gray joints show the previous actual position of the finger. The red joints show the prediction ($\hat{\theta}(t_0 + t)$) for $t$ milliseconds into the future from a given time $t_0$, while the black joints show the actual position of the finger ($\theta(t_0 + t)$) in the future.
  • Figure 2: The derivative error that the LSTM layers in the kinematic feature extractor are used to correct. $V_{approx}$ (red arrow) represents the approximate value and $V_{true}$ (blue arrow) represents the actual value. $t_s$ is the sampling interval.
  • Figure 3: The setup for collecting the VRHands dataset with a Meta Quest 2 headset: a) the right-side view of the real scene, b) the top view of the real scene, and c) the real-time Unity interface during data collection.
  • Figure 4: The qualitative results for different models. The time interval is 40 ms. The activities are randomly selected from the movement. The participant (a) made fists, (b) flexed their fingers, and (c) held an object.
  • Figure 5: The qualitative results for different models. The activities are randomly selected from the movement. The participant (a) flexed their fingers; (b) made a pose; (c) made a fist; and (d) made a pose.