Feasibility-aware Imitation Learning from Observations through a Hand-mounted Demonstration Interface

Kei Takahashi, Hikaru Sasaki, Takamitsu Matsubara

TL;DR

This work introduces FABCO, a feasibility-aware framework for imitation learning from observation (ILfO). It uses a hand-mounted demonstration interface together with the robot's pre-trained forward and inverse dynamics models to assess and visualize the feasibility of each demonstration. Real-time feasibility feedback guides demonstrators toward motions the robot can execute, and the policy is trained with feasibility-based weighting to improve data efficiency and robustness. In a pipette insertion task with four participants, visual feasibility feedback and feasibility-weighted learning substantially improve task success rates, with NASA-TLX indicating manageable workload increases. The approach offers a practical way to reduce covariate shift in ILfO by coupling demonstrator guidance with feasibility-driven policy optimization.

Abstract

Imitation learning through a demonstration interface is expected to learn policies for robot automation from intuitive human demonstrations. However, due to the differences in human and robot movement characteristics, a human expert might unintentionally demonstrate an action that the robot cannot execute. We propose feasibility-aware behavior cloning from observation (FABCO). In the FABCO framework, the feasibility of each demonstration is assessed using the robot's pre-trained forward and inverse dynamics models. This feasibility information is provided as visual feedback to the demonstrators, encouraging them to refine their demonstrations. During policy learning, estimated feasibility serves as a weight for the demonstration data, improving both the data efficiency and the robustness of the learned policy. We experimentally validated FABCO's effectiveness by applying it to a pipette insertion task involving a pipette and a vial. Four participants assessed the impact of the feasibility feedback and the weighted policy learning in FABCO. Additionally, we used the NASA Task Load Index (NASA-TLX) to evaluate the workload induced by demonstrations with visual feedback.
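
The two mechanisms described in the abstract, scoring feasibility with dynamics models and weighting demonstration data by that score, can be illustrated with a minimal Python sketch. The abstract does not give the scoring formula, so everything here is an assumption made for illustration: the exponential score, the `beta` parameter, the toy velocity-limited robot, and all function names are hypothetical and are not the paper's definitions or implementation.

```python
import numpy as np

V_MAX = 0.1  # hypothetical per-step displacement limit of the toy robot


def toy_inverse_model(states, next_states):
    """Stand-in for a pre-trained inverse dynamics model (assumption):
    returns the clipped displacement the toy robot would command."""
    return np.clip(next_states - states, -V_MAX, V_MAX)


def toy_forward_model(states, actions):
    """Stand-in for a pre-trained forward dynamics model (assumption):
    the toy robot simply moves by the commanded displacement."""
    return states + actions


def feasibility(states, next_states, inverse_model, forward_model, beta=10.0):
    """Assumed score in (0, 1]: 1 when the robot can reproduce the
    demonstrated transition, decaying with the reproduction error."""
    actions = inverse_model(states, next_states)
    predicted = forward_model(states, actions)
    error = np.sum((predicted - next_states) ** 2, axis=-1)
    return np.exp(-beta * error)


def weighted_bc_loss(policy_actions, target_actions, weights):
    """Behavior-cloning squared error, averaged with feasibility weights."""
    per_sample = np.sum((policy_actions - target_actions) ** 2, axis=-1)
    return np.sum(weights * per_sample) / np.sum(weights)


# Two demonstrated transitions: a small step and one beyond the robot's limit.
states = np.zeros((2, 2))
next_states = np.array([[0.05, 0.0], [0.5, 0.0]])

w = feasibility(states, next_states, toy_inverse_model, toy_forward_model)

# In learning from observation, actions are not recorded, so the targets
# here are the actions inferred by the inverse model (an assumption).
targets = toy_inverse_model(states, next_states)
policy_actions = np.zeros_like(targets)  # placeholder policy output

loss = weighted_bc_loss(policy_actions, targets, w)
unweighted = weighted_bc_loss(policy_actions, targets, np.ones(2))
print(w, loss, unweighted)
```

In this toy setup the transition that exceeds the velocity limit receives a lower weight, so it contributes less to the loss than it would under uniform weighting. The same score could in principle drive the color-coded visual feedback shown to the demonstrator.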

Paper Structure

This paper contains 23 sections, 5 equations, 12 figures, 2 tables, 1 algorithm.

Figures (12)

  • Figure 1: Overview of FABCO: The demonstrator performs demonstrations using the interface and receives visual feedback on the feasibility of the robot's execution. This encourages the demonstrator to provide demonstrations with higher feasibility. Each collected demonstration trajectory is weighted by its feasibility to learn a robust policy.
  • Figure 2:
  • Figure 3:
  • Figure 5: Visualization of feasibility feedback: The demonstration trajectory is color-coded by feasibility and displayed with automatic rotation to encourage improvement in demonstrations.
  • Figure 6: Pipette insertion task environment
  • ...and 7 more figures