Towards Human Haptic Gesture Interpretation for Robotic Systems

Bibit Bianchini, Prateek Verma, Kenneth Salisbury

TL;DR

The paper addresses the lack of a standard for interpreting human touch in physical human-robot interaction (pHRI). It proposes a four-gesture dictionary (Tap, Touch, Grab, Slip) and collects a force dataset from the wrist force-torque sensor of a UR5e arm to benchmark multiple feature sets and classifiers. It compares six feature representations, including two autoencoder-based bottlenecks and three manual feature sets, across three classifier families, and concludes that neural networks trained on raw force data achieve the best test accuracy (81%). These results are competitive with prior literature when gesture classes are mapped appropriately. The work suggests that a simple, common force-torque sensor can rival more complex tactile sensors for gesture interpretation, and it provides a reproducible benchmark and actionable guidance for future pHRI research. Overall, the approach offers a practical, data-efficient path toward robust haptic gesture understanding in robotic systems and points to robustness and active exploration in dynamic environments as directions for future work.

Abstract

Physical human-robot interactions (pHRI) are less efficient and communicative than human-human interactions, and a key reason is the lack of an informative sense of touch in robotic systems. Interpreting human touch gestures is a nuanced, challenging task with extreme gaps between human and robot capability. Among prior works that demonstrate human touch recognition capability, differences in sensors, gesture classes, feature sets, and classification algorithms yield a conglomerate of non-transferable results and a glaring lack of a standard. To address this gap, this work presents 1) four proposed touch gesture classes that cover an important subset of the gesture characteristics identified in the literature, 2) the collection of an extensive force dataset on a common pHRI robotic arm with only its internal wrist force-torque sensor, and 3) an exhaustive performance comparison of combinations of feature sets and classification algorithms on this dataset. We demonstrate high classification accuracies among our proposed gesture definitions on a test set, emphasizing that neural network classifiers on the raw data outperform other combinations of feature sets and algorithms. The accompanying video is here: https://youtu.be/gJPVImNKU68

Paper Structure

This paper contains 28 sections, 1 equation, 8 figures, 4 tables.

Figures (8)

  • Figure 1: This work aims to build a competent gestural interpretation layer, as illustrated above in relation to low-level force-torque sensor data and to high-level behavioral system decisions. The above exemplifies the utility of this layer for the purpose of a robot-to-human object hand-off.
  • Figure 2: An example of a user initiating a grab gesture to the robotic arm's end effector during the data collection experiments. Note that this visual depicts the scenario in which the end effector is removed and contact occurs at the robotic arm's distal plate. Half of the dataset involved contact with the robot through an installed end effector.
  • Figure 3: Six feature sets and three classification algorithms were tested, for a total of 11 models. The above shows the feature sets' relationships, relative dimensionality, and algorithm(s) employed on them.
  • Figure 4: Groupings of each gesture type of similar duration as they appear through force sensor readings. All x and y axis scales are shared.
  • Figure 5: Force data (top) produces a spectrogram (center) that displays more energetic frequency content at low frequencies during touch events than during periods of no human contact. This is captured by a linear approximation for a time slice of the frequency content. Two example slice profiles are illustrated (bottom), with their corresponding time stamps marked in the upper two plots.
  • ...and 3 more figures
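
The Figure 5 caption describes computing a spectrogram of the force signal and summarizing each time slice of its frequency content with a linear approximation. The following is a minimal sketch of that idea, not the authors' implementation. The 500 Hz sampling rate, the window and hop lengths, the decibel scaling, the use of a degree-1 least-squares fit, the synthetic force trace, and the function names `spectrogram` and `slice_line_fits` are all assumptions made for illustration.

```python
import numpy as np


def spectrogram(signal, fs, window=64, hop=32):
    """Short-time Fourier power (in dB) of a 1-D signal.

    Returns (freqs, times, power_db), with power_db shaped (n_freqs, n_frames).
    Window and hop lengths are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    taper = np.hanning(window)
    starts = np.arange(0, len(signal) - window + 1, hop)
    # One tapered frame per row.
    frames = np.stack([signal[s:s + window] * taper for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Small constant avoids log10(0); transpose so columns are time slices.
    power_db = 10.0 * np.log10(power + 1e-12).T
    freqs = np.fft.rfftfreq(window, d=1.0 / fs)
    times = (starts + window / 2) / fs
    return freqs, times, power_db


def slice_line_fits(freqs, power_db):
    """Least-squares line power_db[:, k] ~ slope[k] * f + intercept[k] per slice."""
    # np.polyfit fits every column of a 2-D y in a single call.
    slope, intercept = np.polyfit(freqs, power_db, deg=1)
    return slope, intercept


# Synthetic stand-in for one force channel (hypothetical data, not the dataset):
# sensor noise everywhere, plus a slow 1 s "touch" bump between 1.5 s and 2.5 s.
fs = 500.0  # assumed sampling rate in Hz
t = np.arange(0.0, 4.0, 1.0 / fs)
rng = np.random.default_rng(0)
noise = 0.05 * rng.standard_normal(t.size)
bump = 5.0 * np.sin(np.pi * (t - 1.5)) ** 2
force = noise + np.where((t > 1.5) & (t < 2.5), bump, 0.0)

freqs, times, power_db = spectrogram(force, fs)
slope, intercept = slice_line_fits(freqs, power_db)

touch = (times > 1.7) & (times < 2.3)  # slices inside the contact
quiet = times < 1.0                    # slices with no contact
print("mean intercept (touch, quiet):", intercept[touch].mean(), intercept[quiet].mean())
print("mean slope (touch, quiet):", slope[touch].mean(), slope[quiet].mean())
```

In this sketch each time slice is reduced to two numbers. Slices taken during contact should show a higher intercept and a more negative slope than slices without contact, because the extra energy sits at low frequencies. Whether the paper fits the line in decibels or in linear power is not stated in the caption, so that choice is an assumption here.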