Simultaneous Estimation of Manipulation Skill and Hand Grasp Force from Forearm Ultrasound Images

Keshav Bimbraw, Srikar Nekkanti, Daniel B. Tiller, Mihir Deshmukh, Berk Calli, Robert D. Howe, Haichong K. Zhang

TL;DR

This work presents a CNN-based framework to simultaneously classify manipulation skills and estimate hand grasp force from forearm B-mode ultrasound, enabling real-time teleoperation and learning-from-demonstration. The model achieves a cross-validated skill-classification accuracy of $94.9\% \pm 10.2\%$ and a force RMSE of $0.51\pm 0.19\,\mathrm{N}$ across seven subjects and five tasks, with inference times around $7$ ms per task. Grad-CAM-based interpretability reveals activation of key forearm muscles (e.g., FDP, FPL, FDS) and highlights subject- and task-dependent artifacts, informing robustness considerations. The results demonstrate the feasibility of ultrasound-based sensing for trustworthy human–machine interfacing and provide a foundation for improved human-robot skill transfer and telemanipulation in diverse environments, while outlining data-diversity and real-world validation directions for future work.

Abstract

Accurate estimation of human hand configuration and the forces the hand exerts is critical for effective teleoperation and skill transfer in robotic manipulation. A deeper understanding of human interactions with objects can further enhance teleoperation performance. To address this need, researchers have explored methods to capture and translate human manipulation skills and applied forces to robotic systems. Among these, biosignal-based approaches, particularly those using forearm ultrasound data, have shown significant potential for estimating hand movements and finger forces. In this study, we present a method for simultaneously estimating manipulation skills and applied hand force using forearm ultrasound data. Data collected from seven participants were used to train deep learning models for classifying manipulation skills and estimating grasp force. Our models achieved an average classification accuracy of $94.87\% \pm 10.16\%$ for manipulation skills and an average root mean square error (RMSE) of $0.51 \pm 0.19\,\mathrm{N}$ for force estimation, as evaluated using five-fold cross-validation. These results highlight the effectiveness of forearm ultrasound in advancing human-machine interfacing and robotic teleoperation for complex manipulation tasks. This work enables new and effective possibilities for human-robot skill transfer and tele-manipulation, bridging the gap between human dexterity and robotic control.
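The two reported metrics, classification accuracy and force RMSE summarized as mean plus or minus standard deviation over cross-validation folds, can be illustrated with a minimal sketch. The function names, the toy per-fold values, and the use of the population standard deviation are assumptions made for illustration; the abstract does not state how the spread was computed, and none of the numbers below come from the paper.

```python
import math
import statistics


def rmse(predicted, measured):
    """Root mean square error between two equal-length sequences (same units)."""
    if len(predicted) != len(measured):
        raise ValueError("sequences must have the same length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


def accuracy(predicted_labels, true_labels):
    """Fraction of frames whose predicted skill label matches the true label."""
    hits = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return hits / len(true_labels)


def mean_and_std(values):
    """Mean and population standard deviation of per-fold scores (assumed choice)."""
    return statistics.mean(values), statistics.pstdev(values)


# Hypothetical per-fold predictions, for illustration only (not the paper's data).
fold_rmse = [
    rmse([1.0, 2.0, 3.0], [1.5, 2.0, 2.5]),
    rmse([0.0, 4.0], [0.0, 3.0]),
]
fold_acc = [accuracy([0, 1, 2, 2], [0, 1, 2, 3]), accuracy([4, 4], [4, 4])]

rmse_mean, rmse_std = mean_and_std(fold_rmse)
acc_mean, acc_std = mean_and_std(fold_acc)
print(f"RMSE: {rmse_mean:.2f} +/- {rmse_std:.2f} N")
print(f"Accuracy: {100 * acc_mean:.1f}% +/- {100 * acc_std:.1f}%")
```

In a five-fold setting, each list would hold five scores, one per held-out fold, before the mean and spread are taken.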

Paper Structure

This paper contains 42 sections, 14 equations, 10 figures, 8 tables.

Figures (10)

  • Figure 1: The pipeline for simultaneous manipulation skill classification and continuous force estimation using ultrasound data and deep learning models. (a) A subject performing manipulation skill 1 and applying force to a ball. (b) Forearm ultrasound image highlighting muscle activity. (c) Real-time force and skill estimation compared to ground truth. (d) Pipeline architecture showing the integrated skill classification and force estimation for robotic control.
  • Figure 2: (Top row) Manipulation skill prototypes and objects: (a) Push to horizontal - ball, (b) Push to vertical - cylindrical can, (c) Slide to edge - plate, (d) Flip - thin cuboid, (e) Simple pick (Push grasp) - mug. (Bottom row) Robot executing the corresponding manipulation skills (f - j).
  • Figure 3: Different muscles and their activation: (a) Flexor digitorum superficialis (FDS), flexor carpi ulnaris (FCU), flexor digitorum profundus (FDP) and flexor pollicis longus (FPL). The changes in muscle activation are visualized using Grad-CAM for manipulation skill 2 in (b) and (c). The bright areas correspond to the regions of the image that contribute to the CNN's continuous force estimation.
  • Figure 4: (a) Hardware setup for data acquisition: System for acquiring ultrasound and force data. B-mode ultrasound data are acquired using a Sonostar probe, and 3 FlexiForce sensors are used for the thumb, index and middle fingers. An Arduino Uno is used to interface the force sensors with the system, and the data from the probe is transmitted over Wi-Fi. (b) Convolutional Neural Network (CNN) architecture for skill classification and force estimation: The CNN architecture is the same for both tasks except for the final fully connected layer. For the skill classification task, the fully connected layer has a softmax activation and 5 output parameters. For the force estimation task, the fully connected layer has a linear activation and 1 output parameter.
  • Figure 5: Comparison of fold-wise and subject-wise classification results.
  • ...and 5 more figures
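
The Figure 4(b) caption describes two output layers that differ only in activation and width: a softmax layer with 5 outputs for skill classification and a linear layer with 1 output for force estimation. The sketch below illustrates that difference only. The convolutional feature extractor is not reproduced, the 3-dimensional feature vector and all weights are hypothetical, and applying both heads to the same feature vector is an assumption made to keep the example short.

```python
import math


def linear(features, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def skill_head(features, weights, bias):
    """Classification head: 5 softmax probabilities, one per manipulation skill."""
    return softmax(linear(features, weights, bias))


def force_head(features, weights, bias):
    """Regression head: a single linear output, read as grasp force."""
    return linear(features, weights, bias)[0]


# Hypothetical feature vector and weights, for illustration only.
features = [0.2, -0.1, 0.4]
skill_w = [[0.1 * (i + j) for j in range(3)] for i in range(5)]
skill_b = [0.0] * 5
force_w = [[1.0, 2.0, 0.5]]
force_b = [0.1]

probs = skill_head(features, skill_w, skill_b)
force = force_head(features, force_w, force_b)
predicted_skill = probs.index(max(probs))
print(predicted_skill, [round(p, 3) for p in probs], round(force, 3))
```

The softmax output sums to one and is read as a distribution over the five skills, while the linear output is an unbounded scalar suited to regression.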