Easy-IIL: Reducing Human Operational Burden in Interactive Imitation Learning via Assistant Experts

Chengjie Zhang, Chao Tang, Wenlong Dong, Dehao Huang, Aoxiang Gu, Hong Zhang

Abstract

Interactive Imitation Learning (IIL) typically relies on extensive human involvement for both offline demonstration and online interaction. Prior work primarily focuses on reducing human effort in passive monitoring rather than active operation. Interestingly, structured model-based imitation approaches achieve comparable performance with significantly fewer demonstrations than end-to-end imitation learning policies in the low-data regime. However, these methods are typically surpassed by end-to-end policies as the amount of data increases. Leveraging this insight, we propose Easy-IIL, a framework that utilizes off-the-shelf model-based imitation methods as an assistant expert to replace active human operation for the majority of data collection. The human expert only provides a single demonstration to initialize the assistant expert and intervenes in critical states where the task is approaching failure. Furthermore, Easy-IIL can maintain IIL performance by preserving both offline and online data quality. Extensive simulation and real-world experiments demonstrate that Easy-IIL significantly reduces human operational burden while maintaining performance comparable to mainstream IIL baselines. User studies further confirm that Easy-IIL reduces subjective workload on the human expert. Project page: https://sites.google.com/view/easy-iil

Paper Structure (27 sections, 9 equations, 5 figures, 6 tables, 2 algorithms)

Figures (5)

  • Figure 1: (a) illustrates the standard IIL interaction mechanism, where a human expert is involved in both offline and online stages. (b) depicts Easy-IIL, which delegates most offline and online data collection and interactions to an assistant expert, substantially reducing active human operational effort.
  • Figure 2: Data collection strategy in Easy-IIL for (a) the first offline demonstration and (b) the remaining offline demonstrations ("Offline" mode) and online interaction data ("Online" mode). Here, $a^H=\pi^H(s)$, $a^R=\pi^R(s)$, $a^A=\pi^A(o)$, and $a^N=\pi^N(s)$, where $s$ and $o$ denote the state and the observation, respectively.
  • Figure 3: Four task scenarios. Basketball in Hoop (top left), Take Chicken in Saucepan (top right), Hang the Cup (bottom left), Put Duck in Cooker (bottom right).
  • Figure 4: Main experiment results evaluating Succ. Rate (a) and Intv. Rate (b) across four tasks for Easy-IIL, HG-DAgger, IWR, and Sirius. Easy-IIL is configured with $H=8$ and $\sigma=0.3$.
  • Figure 5: Ablation study results evaluating (a) Succ. Rate and (b) Intv. Rate across two simulated tasks for the online strategy of Easy-IIL. Easy-IIL is configured with $H=8$ and $\sigma=0.3$.