IMLE Policy: Fast and Sample Efficient Visuomotor Policy Learning via Implicit Maximum Likelihood Estimation
Krishan Rana, Robert Lee, David Pershouse, Niko Suenderhauf
TL;DR
IMLE Policy addresses the challenge of learning expressive, multi-modal visuomotor policies from limited demonstrations while enabling fast, single-step inference. By extending conditional RS-IMLE to behaviour cloning, it covers the modes of the demonstrated action distribution without collapsing to a single one, and it uses rejection sampling to stay data-efficient. Across eight simulation benchmarks and two real-world tasks, it needs 38% less data on average to match baseline performance and generates actions 97.3% faster than Diffusion Policy, while remaining competitive in richly multi-modal settings. The work demonstrates that a simple generator-based architecture coupled with temporal consistency can deliver practical, robust robotic policies suitable for real-time deployment, and it opens avenues for RL fine-tuning and MPC applications.
Abstract
Recent advances in imitation learning, particularly using generative modelling techniques like diffusion, have enabled policies to capture complex multi-modal action distributions. However, these methods often require large datasets and multiple inference steps for action generation, posing challenges in robotics, where the cost of data collection is high and computational resources are limited. To address this, we introduce IMLE Policy, a novel behaviour cloning approach based on Implicit Maximum Likelihood Estimation (IMLE). IMLE Policy excels in low-data regimes, effectively learning from minimal demonstrations and requiring 38% less data on average to match the performance of baseline methods in learning complex multi-modal behaviours. Its simple generator-based architecture enables single-step action generation, improving inference speed by 97.3% compared to Diffusion Policy, while outperforming single-step Flow Matching. We validate our approach across diverse manipulation tasks in simulated and real-world environments, showcasing its ability to capture complex behaviours under data constraints. Videos and code are provided on our project page: https://imle-policy.github.io/.
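
The nearest-candidate objective with rejection sampling and the single-step generation described above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the names `rs_imle_loss` and `toy_generator`, the rejection radius `eps`, the linear generator, and the fallback used when every candidate is rejected are all assumptions made for illustration. The sketch assumes that, for one observation, several candidate actions are generated from different latent samples, candidates already within `eps` of the demonstrated action are discarded, and the nearest remaining candidate is pulled towards the demonstration.

```python
import numpy as np


def rs_imle_loss(candidates, target, eps):
    """Select the candidate to train on for one demonstration (illustrative).

    candidates: (m, d) actions generated from m latent samples for one observation
    target:     (d,) demonstrated action
    eps:        rejection radius (hypothetical hyperparameter)
    Returns the index of the selected candidate and its squared distance.
    """
    dists = np.linalg.norm(candidates - target, axis=1)
    keep = dists >= eps                # reject candidates already within eps of the target
    if not keep.any():                 # assumption: if all are rejected, keep all of them
        keep = np.ones_like(keep)
    idx = np.flatnonzero(keep)[np.argmin(dists[keep])]
    return int(idx), float(dists[idx] ** 2)


def toy_generator(obs, z, W_obs, W_z):
    # Hypothetical linear generator: a single forward pass maps
    # (observation, latent sample) to an action, with no iterative denoising.
    return obs @ W_obs + z @ W_z


candidates = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
target = np.array([0.1, 0.0])
idx, loss = rs_imle_loss(candidates, target, eps=0.5)
```

In this toy case the first candidate lies within `eps` of the target and is rejected, so the second candidate is selected even though it is not the closest one overall. At inference time only `toy_generator` would be called, once per action, with a freshly sampled latent `z`.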
