Efficient and Safe Contact-rich pHRI via Subtask Detection and Motion Estimation using Deep Learning

Pouya P. Niaz, Engin Erzin, Cagatay Basdogan

TL;DR

The paper addresses efficient and safe contact-rich pHRI with a two-layer ML framework that detects the current subtask and estimates motion progress, then uses both to adapt the admittance controller in real time. It combines a subtask detector (LSTM) with a motion estimator (1D-CNN) to modulate damping across the Idle, Tool-Attachment, Driving, and Contact phases, using the admittance model $Y(s)=\frac{V_{ref}(s)}{F_{int}(s)}=\frac{1}{m s + b}$ with $m=50\,\text{kg}$ and phase-dependent $b$ values. Reported results show subtask detection accuracy of about 84% and motion-estimation $R^2$ of about 0.95–0.96. The C3 controller (subtask detection plus motion estimation) achieves up to 57% lower human effort during Driving and 53% lower oscillation amplitude at Contact, in both the virtual-environment (VE) experiments and physical drilling. The Sim2Real validation shows comparable performance in the real world, supporting the approach's practical value for safer and more efficient collaborative manufacturing. Future work points to unsupervised segmentation and RL-driven damping policies to further optimize phase-specific control.
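To make the role of the damping $b$ concrete, the admittance model $Y(s)=\frac{1}{ms+b}$ corresponds to the time-domain equation $m\dot{v}+bv=F$, whose steady-state velocity under a constant force is $F/b$. The sketch below integrates that equation with a forward-Euler step. Only $m=50\,\text{kg}$ comes from the summary; the function name, the time step, the 10 N force, and the two damping values are illustrative assumptions, not values from the paper.

```python
def simulate_admittance(force, b, m=50.0, dt=0.001, steps=20000):
    """Forward-Euler integration of m*dv/dt + b*v = force.

    m = 50 kg is taken from the summary; dt, steps, and the
    damping values used below are illustrative assumptions.
    Returns the reference velocity after steps*dt seconds.
    """
    v = 0.0
    for _ in range(steps):
        # dv/dt = (F - b*v) / m
        v += dt * (force - b * v) / m
    return v


# Same hypothetical 10 N push under two hypothetical damping values.
v_low_damping = simulate_admittance(10.0, b=20.0)    # approaches 10/20 m/s
v_high_damping = simulate_admittance(10.0, b=200.0)  # approaches 10/200 m/s
```

For the same applied force, lower damping yields a higher reference velocity (less effort per unit of motion), while higher damping slows the response, which is the trade-off the phase-dependent $b$ is meant to manage.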

Abstract

This paper proposes an adaptive admittance controller for improving efficiency and safety in physical human-robot interaction (pHRI) tasks in small-batch manufacturing that involve contact with stiff environments, such as drilling, polishing, cutting, etc. We aim to minimize human effort and task completion time while maximizing precision and stability during the contact of the machine tool attached to the robot's end-effector with the workpiece. To this end, a two-layered learning-based human intention recognition mechanism is proposed, utilizing only the kinematic and kinetic data from the robot and two force sensors. A "subtask detector" recognizes the human intent by estimating which phase of the task is being performed, e.g., Idle, Tool-Attachment, Driving, and Contact. Simultaneously, a "motion estimator" continuously quantifies intent more precisely during the Driving to predict when Contact will begin. The controller is adapted online according to the subtask while allowing early adaptation before the Contact to maximize precision and safety and prevent potential instabilities. Three sets of pHRI experiments were performed with multiple subjects under various conditions. Spring compression experiments were performed in virtual environments to train the data-driven models and validate the proposed adaptive system, and drilling experiments were performed in the physical world to test the proposed methods' efficacy in real-life scenarios. Experimental results show subtask classification accuracy of 84% and motion estimation $R^2$ score of 0.96. Furthermore, 57% lower human effort was achieved during Driving as well as 53% lower oscillation amplitude at Contact as a result of the proposed system.

Paper Structure (32 sections, 5 equations, 16 figures, 3 tables)


Figures (16)

  • Figure 1: Hardware setup in the virtual spring compression task (top) and the real drilling task (bottom).
  • Figure 2: The closed-loop control architecture used in our study. All variables are 3-dimensional vectors, corresponding to 3 translational degrees of freedom in the Cartesian space.
  • Figure 3: Time progress $\tau$, trajectory progress $\lambda$ and admittance damping $b$ vs. time $t$ across all subtasks. Damping is adapted at $t_d$ when Driving begins, and at $t_a$ when the estimated progress reaches the adaptation threshold, i.e., $\tau=\tau^*$, or $\lambda=\lambda^*$, just before Contact occurs at $t_c$. The trial ends when the drilling/spring compression reaches its desired depth of 4 [mm] at $t_f$.
  • Figure 4: Deep Learning architectures used in this study: (a) Subtask Detector LSTM, (b) Motion Estimator CNN. Numbers inside parentheses are hidden layer sizes for Linear layers, hidden sizes for LSTM layers, and the number of filters for Convolution layers, followed by filter sizes.
  • Figure 5: Snapshot of the visual feedback displayed to the subjects for the spring compression experiments performed in VEs; (a) What the subjects see on the monitor screen; (b) Another view to aid the imagination.
  • ...and 11 more figures
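The adaptation timing described in the Figure 3 caption (one change when Driving begins, another when the estimated progress reaches the threshold just before Contact) can be sketched as a small selection rule. This is a minimal illustration, not the authors' controller: the names `Subtask`, `select_damping`, `B_LOW`, and `B_HIGH`, the numeric damping values, the default threshold of 0.8, and the choice to use a single high value for every non-Driving phase are all assumptions. It also assumes damping is lowered during Driving and raised before Contact, which matches the stated goals of lower effort while driving and fewer oscillations at contact.

```python
from enum import Enum


class Subtask(Enum):
    IDLE = 0
    TOOL_ATTACHMENT = 1
    DRIVING = 2
    CONTACT = 3


# Hypothetical damping values in N*s/m (not taken from the paper).
B_LOW = 50.0
B_HIGH = 400.0


def select_damping(subtask, progress, threshold=0.8):
    """Pick a damping value from the detected subtask and estimated progress.

    progress stands in for the caption's tau or lambda, and threshold for
    tau* or lambda*; the default of 0.8 is an assumption.
    """
    # Low damping only while driving and still short of the threshold.
    if subtask is Subtask.DRIVING and progress < threshold:
        return B_LOW
    # Otherwise use high damping, including the early switch before Contact.
    return B_HIGH
```

Under this rule the switch to high damping happens while the detected subtask is still Driving, so the controller is already in its stiffer setting when Contact is detected.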