Imitation Learning with Limited Actions via Diffusion Planners and Deep Koopman Controllers
Jianxin Bi, Kelvin Lim, Kaiqi Chen, Yifei Huang, Harold Soh
TL;DR
Problem: data-efficient imitation learning from state observations under limited action labels. Approach: KOAP combines a diffusion planner for future-state planning with a Deep Koopman Operator that lifts the dynamics via observables $g_\theta(x)$ into a linear latent space governed by a learned matrix $\mathcal{K}_\theta$ and a latent-action predictor $f_\theta$, then maps latent actions to real actions with a linear decoder $d_\phi$. Key contributions: regularized latent-action learning via linear forward dynamics, effective action prediction with minimal $\mathcal{D}_a$ supervision, and strong performance on the D3IL benchmark plus a real-robot scooping case. Significance: enables scalable imitation from observation by reducing the need for action labeling while supporting continuous-action policies in robotics.
Abstract
Recent advances in diffusion-based robot policies have demonstrated significant potential in imitating multi-modal behaviors. However, these approaches typically require large quantities of demonstration data paired with corresponding robot action labels, creating a substantial data collection burden. In this work, we propose a plan-then-control framework aimed at improving the action-data efficiency of inverse dynamics controllers by leveraging observational demonstration data. Specifically, we adopt a Deep Koopman Operator framework to model the dynamical system and utilize observation-only trajectories to learn a latent action representation. This latent representation can then be effectively mapped to real high-dimensional continuous actions using a linear action decoder, requiring minimal action-labeled data. Through experiments on simulated robot manipulation tasks and a real robot experiment with multi-modal expert demonstrations, we demonstrate that our approach significantly enhances action-data efficiency and achieves high task success rates with limited action data.
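To make the idea of latent actions with a linear decoder concrete, the following is a minimal sketch on a toy problem, not the paper's implementation. Everything in it is an assumption made for illustration: the toy linear system (`A`, `B`), the identity `lift` standing in for a learned observable, the least-squares fit of `K` in place of a trained Koopman matrix, and the residual-based `latent_action` standing in for a learned latent-action predictor. It only shows the data split the abstract describes: many observation-only transitions shape the latent action, and a small labeled set fits the linear decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy system x_{t+1} = A x_t + B a_t (illustration only).
A = 0.9 * np.eye(4)
B = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])

def lift(x):
    """Stand-in for a learned observable; identity because the toy system is already linear."""
    return x

def sample_transitions(n):
    """Draw n random (state, action, next state) triples from the toy system."""
    x = rng.normal(size=(n, 4))
    a = rng.normal(size=(n, 2))
    x_next = x @ A.T + a @ B.T
    return x, a, x_next

# 1) Observation-only data (actions discarded): fit a linear latent
#    dynamics matrix K by least squares so that z_next ~= z @ K.T.
x_obs, _, x_next_obs = sample_transitions(2000)
K = np.linalg.lstsq(lift(x_obs), lift(x_next_obs), rcond=None)[0].T

def latent_action(x, x_next):
    """Assumed latent action: the part of the latent step that K does not explain."""
    return lift(x_next) - lift(x) @ K.T

# 2) Small action-labeled set: fit a linear decoder D so that a ~= u @ D.
x_lab, a_lab, x_next_lab = sample_transitions(20)
D = np.linalg.lstsq(latent_action(x_lab, x_next_lab), a_lab, rcond=None)[0]

# 3) Held-out check: decode actions from state transitions alone.
x_te, a_te, x_next_te = sample_transitions(500)
a_pred = latent_action(x_te, x_next_te) @ D
mse = float(np.mean((a_pred - a_te) ** 2))
print(f"held-out action MSE: {mse:.4f}")
```

In this sketch only 20 labeled transitions are used to fit the decoder, while the latent dynamics are estimated from 2000 unlabeled ones. A real system would replace the identity lift and least-squares fits with learned networks and would obtain the next state from a planner rather than from ground truth.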
