ExACT: An End-to-End Autonomous Excavator System Using Action Chunking With Transformers
Liangliang Chen, Shiyu Jin, Haoyu Wang, Liangjun Zhang
TL;DR
ExACT tackles the challenge of end-to-end autonomous excavation by learning a controller that maps raw LiDAR, camera, and joint-position observations directly to valve commands using Action Chunking with Transformers (ACT). The system combines imitation learning with temporal ensembling and a conditional variational autoencoder to generate smooth action sequences, learned from a small set of human demonstrations and validated in a simulator built from real-world data. Results show successful execution of reach, dig_dump, and dig_dump_return tasks in the simulator, with limitations in high-frequency valve dynamics for digging that suggest the need for more demonstrations or higher control bandwidth. This work constitutes a first demonstration of end-to-end imitation-learning-based autonomous excavation, offering a path toward real-world deployment and enhanced safety in construction and mining settings.
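The temporal ensembling mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the exponential weighting scheme commonly used with ACT, where several overlapping action chunks each contain a prediction for the current timestep and the oldest prediction receives the largest weight. The function name `temporal_ensemble` and the decay parameter `m` are hypothetical choices for this example.

```python
import math

def temporal_ensemble(predictions, m=0.1):
    """Blend several predictions of the action for one timestep.

    predictions: list of action vectors (lists of floats), ordered oldest first.
    m: assumed decay rate; a larger m discounts newer predictions more strongly,
       and m = 0 reduces to a plain average.
    """
    # Weight i-th prediction by exp(-m * i), so index 0 (oldest) weighs most.
    weights = [math.exp(-m * i) for i in range(len(predictions))]
    total = sum(weights)
    dim = len(predictions[0])
    # Weighted average, computed independently for each action dimension.
    return [
        sum(w * p[d] for w, p in zip(weights, predictions)) / total
        for d in range(dim)
    ]

# Two overlapping chunks each propose a 2-D valve command for the same step.
blended = temporal_ensemble([[0.0, 2.0], [2.0, 4.0]], m=0.1)
```

Averaging overlapping chunks in this way trades a small amount of responsiveness for smoother commands, which is the motivation for ensembling given in the summary.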
Abstract
Excavators are crucial for diverse tasks such as construction and mining, and autonomous excavator systems can enhance safety and efficiency, address labor shortages, and improve human working conditions. Unlike existing modularized approaches, this paper introduces ExACT, an end-to-end autonomous excavator system that processes raw LiDAR, camera data, and joint positions to control excavator valves directly. Built on the Action Chunking with Transformers (ACT) architecture, ExACT uses imitation learning to map observations from multi-modal sensors to sequences of actions. In our experiments, we build a simulator from captured real-world data that models the relationship between excavator valve states and joint velocities. Trained on only a few human-operated demonstration trajectories, ExACT completes different excavation tasks, including reaching, digging, and dumping, in validations with the simulator. To the best of our knowledge, ExACT is the first step toward building an end-to-end autonomous excavator system via imitation learning with a minimal set of human demonstrations. A video of this work is available at https://youtu.be/NmzR_Rf-aEk.
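To make the idea of a data-driven simulator concrete, the sketch below fits a mapping from valve command to joint velocity and then integrates it to roll out a joint trajectory. It is only an illustration under strong assumptions: the abstract does not specify the model form, so a single proportional gain per joint, fitted by least squares, is assumed here, and the names `fit_gain` and `simulate` are hypothetical.

```python
def fit_gain(valve_cmds, joint_vels):
    """Least-squares fit of velocity = gain * valve_cmd for one joint.

    Assumes a purely proportional relation with no deadband or delay,
    which is a simplification chosen for this example.
    """
    num = sum(u * v for u, v in zip(valve_cmds, joint_vels))
    den = sum(u * u for u in valve_cmds)
    if den == 0.0:
        raise ValueError("valve commands are all zero; gain is not identifiable")
    return num / den

def simulate(q0, valve_cmds, gain, dt):
    """Roll out joint positions by Euler-integrating the fitted velocity."""
    q = q0
    trajectory = [q]
    for u in valve_cmds:
        q += gain * u * dt  # position update from the modeled joint velocity
        trajectory.append(q)
    return trajectory

# Made-up logged data for one joint: commands and measured velocities.
gain = fit_gain([1.0, 2.0, -1.0], [0.5, 1.0, -0.5])
positions = simulate(0.0, [1.0, 1.0], gain, dt=0.1)
```

A real hydraulic system would likely need a richer model (for example deadbands, delays, or load dependence); the proportional form is used here only to show how logged valve and velocity data could drive a closed-loop rollout.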
