Leveraging Haptic Feedback to Improve Data Quality and Quantity for Deep Imitation Learning Models

Catie Cuan; Allison Okamura; Mohi Khansari

TL;DR

The paper investigates whether real-time haptic feedback to human demonstrators during teleoperation improves data quality and quantity for deep visual imitation learning and whether these improvements translate to better autonomous policies. Using a two-phase study on latch-door opening, it shows that haptic feedback increases data throughput and the proportion of high-quality demonstrations, and that policies trained on haptic data achieve an overall 11% performance gain on real doors, with larger gains on more challenging left-swing tasks. The approach combines real-world data collection with phase-appropriate data curation, a ResNet-18 visual encoder, and simulation-to-real evaluation via RetinaGAN, without altering the model architecture. The findings highlight the practical value of low-cost haptic augmentation for improving imitation-learning pipelines in real-world robotic manipulation tasks.

Abstract

Learning from demonstration is a proven technique to teach robots new skills. Data quality and quantity play a critical role in the performance of models trained using data collected from human demonstrations. In this paper, we enhance an existing teleoperation data collection system with real-time haptic feedback to the human demonstrators; we observe improvements in the collected data throughput and in the performance of autonomous policies using models trained with the data. Our experimental testbed was a mobile manipulator robot that opened doors with latch handles. Evaluation of teleoperated data collection on eight real conference room doors found that adding haptic feedback improved data throughput by 6%. We additionally used the collected data to train six image-based deep imitation learning models, three with haptic feedback and three without it. These models were used to implement autonomous door-opening with the same type of robot used during data collection. A policy from an imitation learning model trained with data collected while the human demonstrators received haptic feedback performed on average 11% better than its counterpart trained with data collected without haptic feedback, indicating that haptic feedback provided during data collection resulted in improved autonomous policies.

Paper Structure (13 sections, 7 figures, 1 algorithm)

Introduction
Prior Work
Methods
Phase 1: Studying haptic feedback during demonstrator performance
Real-world data collection
Data curation and model training
Phase 2: Robot performance evaluation
Model description and checkpoint selection
Policy testing on real doors
Results
Phase 1: Impact of haptic feedback on demonstrator performance
Phase 2: Impact of Haptic and Non-Haptic datasets on autonomous robot performance
Discussion and Conclusion

Figures (7)

Figure 1: We measured how haptic feedback altered task performance in terms of data throughput and data quality in a latch door opening task, and evaluated the effect of the resultant datasets on the autonomous performance of learned policies. Left: A demonstrator teleoperated a prototype mobile manipulator robot. Right: Close-up progression.
Figure 2: Our complete workflow including data collection, curation, training, and policy testing. This paper contributes to the first, second, and fourth steps by introducing haptic feedback to the teleoperator, curating the collected data, and evaluating different policies in the real world. For the third step, we followed the same approach as in [khansari2022practical]. For the models in the policy testing phase, matching colors indicate that they were tested on the same set of real-world doors.
Figure 3: Nine demonstrators represented 100% of the original data in the "Full data" bars. Three demonstrators were removed due to non-compliance in the "Compliant participants only" bars. Failures were removed from the remaining six demonstrators' data to create the "Successful examples only" subset. The remaining data were further curated by a duration cutoff and removal of undesirable behavior, resulting in the final "Curated successful examples" subset. At left, a larger percentage of "Curated successful examples" remained in the Haptic condition, showing the improvement in data quantity. At right, demonstrators took less time to complete the task when receiving haptic feedback in every subset except "Curated successful examples".
Figure 4: A checkpoint consisted of the exported trained weights of the model at a given training step. Checkpoints were first evaluated in simulation (left), and then the top three checkpoints were used for real-world model evaluation (right).
Figure 5: Results of the NASA TLX survey conducted in Phase 1 (data collection).
...and 2 more figures
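
The curation stages described in the Figure 3 caption and the top-three checkpoint selection described in the Figure 4 caption can be illustrated with a minimal Python sketch. This is not the authors' code. The `Episode` fields, the function names `curate` and `top_k_checkpoints`, the 60-second duration cutoff, and all example values are hypothetical and chosen only to show the order of the filtering steps.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Episode:
    """One teleoperated demonstration (all fields are hypothetical)."""
    demonstrator_id: str
    compliant: bool      # demonstrator followed the study protocol
    success: bool        # the door was opened
    duration_s: float    # episode length in seconds
    undesirable: bool    # flagged for undesirable behavior during review


def curate(episodes: List[Episode], max_duration_s: float) -> List[Episode]:
    """Apply the four filtering stages in the order given in Figure 3."""
    kept = [e for e in episodes if e.compliant]                  # compliant participants only
    kept = [e for e in kept if e.success]                        # successful examples only
    kept = [e for e in kept if e.duration_s <= max_duration_s]   # duration cutoff
    kept = [e for e in kept if not e.undesirable]                # undesirable behavior removal
    return kept


def top_k_checkpoints(sim_success: Dict[int, float], k: int = 3) -> List[int]:
    """Return the k training steps with the highest simulated success rate."""
    ranked = sorted(sim_success, key=lambda step: sim_success[step], reverse=True)
    return ranked[:k]


# Made-up example data, not results from the paper.
episodes = [
    Episode("d1", True, True, 40.0, False),   # kept
    Episode("d1", True, False, 35.0, False),  # dropped: failure
    Episode("d2", False, True, 30.0, False),  # dropped: non-compliant demonstrator
    Episode("d3", True, True, 95.0, False),   # dropped: exceeds duration cutoff
    Episode("d3", True, True, 42.0, True),    # dropped: undesirable behavior
    Episode("d4", True, True, 50.0, False),   # kept
]
curated = curate(episodes, max_duration_s=60.0)  # 60 s cutoff is an assumption
print(f"{len(curated)} of {len(episodes)} episodes remain after curation")

# Hypothetical simulated success rates keyed by training step.
sim_success = {1000: 0.62, 2000: 0.71, 3000: 0.68, 4000: 0.74, 5000: 0.70}
print(top_k_checkpoints(sim_success))  # -> [4000, 2000, 5000]
```

Running the same `curate` call separately on the Haptic and Non-Haptic episode sets and comparing the fraction that remains would give the kind of comparison shown in the left panel of Figure 3.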
