MEGA-DAgger: Imitation Learning with Multiple Imperfect Experts
Xiatao Sun, Shuo Yang, Mingyan Zhou, Kunpeng Liu, Rahul Mangharam
TL;DR
This work addresses imitation learning when multiple imperfect experts are available, a scenario common in safety-critical autonomous systems. It introduces MEGA-DAgger, a three-component extension of HG-DAgger that (i) uses a Control Barrier Function (CBF)-based data filter to prune unsafe demonstrations, (ii) selects a dominant expert per iteration, and (iii) resolves label conflicts across experts through cosine-similarity-based matching and a combined score $\omega_t$ that balances safety and progress. In autonomous racing experiments on the F1TENTH platform, MEGA-DAgger learns a better-than-expert policy that outperforms both the individual experts and HG-DAgger in overtakes and collision avoidance, and it remains effective in real-world hardware experiments. The approach offers a practical, data-efficient path to leveraging diverse experts in multi-agent, safety-critical domains and can be adapted to general autonomous systems with task-specific safety and progress metrics.
Abstract
Imitation learning has been widely applied to various autonomous systems thanks to recent developments in interactive algorithms that address the covariate shift and compounding errors induced by traditional approaches such as behavior cloning. However, existing interactive imitation learning methods assume access to one perfect expert, whereas in reality it is more likely that multiple imperfect experts are available. In this paper, we propose MEGA-DAgger, a new DAgger variant that is suitable for interactive learning with multiple imperfect experts. First, unsafe demonstrations are filtered out while aggregating the training data, so that imperfect demonstrations have little influence on training the novice policy. Next, experts are evaluated and compared on scenario-specific metrics to resolve conflicting labels among experts. Through experiments in autonomous racing scenarios, we demonstrate that the policy learned using MEGA-DAgger can outperform both the experts and policies learned using state-of-the-art interactive imitation learning algorithms such as Human-Gated DAgger. The supplementary video can be found at https://youtu.be/wPCht31MHrw.
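
The filter-then-resolve aggregation described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record fields (`obs`, `action`, `safety`, `progress`), the rule that a negative safety value marks a demonstration as unsafe, the similarity threshold, and the weighted-sum form of the combined score are all assumptions made for illustration.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two observation vectors (0.0 if either is all zeros)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def combined_score(safety, progress, alpha=0.5):
    """Hypothetical stand-in for the combined score: a weighted sum of safety and progress."""
    return alpha * safety + (1.0 - alpha) * progress


def aggregate(dataset, sample, sim_threshold=0.99, alpha=0.5):
    """Return a new dataset after offering one labeled sample to it.

    Each record is a dict with keys 'obs', 'action', 'safety', 'progress'.
    Assumption: 'safety' is a precomputed barrier-function-like value and
    negative values mean the demonstration is unsafe.
    """
    # Step 1: data filter. Unsafe demonstrations are never aggregated.
    if sample["safety"] < 0.0:
        return list(dataset)

    new_score = combined_score(sample["safety"], sample["progress"], alpha)

    # Step 2: conflict resolution. If a stored record has a nearly identical
    # observation, keep whichever label has the higher combined score.
    for i, old in enumerate(dataset):
        if cosine_similarity(old["obs"], sample["obs"]) >= sim_threshold:
            old_score = combined_score(old["safety"], old["progress"], alpha)
            if new_score > old_score:
                return dataset[:i] + [sample] + dataset[i + 1:]
            return list(dataset)

    # No conflict: the sample is simply appended.
    return list(dataset) + [sample]
```

In this sketch only the first sufficiently similar record is compared against the new sample; a fuller treatment might compare against all matches or use a nearest-neighbor index, which is a design choice left open here.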
