Distilling Contact Planning for Fast Trajectory Optimization in Robot Air Hockey
Julius Jankowski, Ante Marić, Puze Liu, Davide Tateo, Jan Peters, Sylvain Calinon
TL;DR
This work addresses fast, contact-rich planning in dynamic robot tasks by combining a learned stochastic contact model, a distilled implicit policy, and a model-predictive controller. The puck dynamics are modeled as a mixture of linear-Gaussian modes (Floating, Puck-Wall, Puck-Mallet) learned from data; a piecewise-linear Kalman filter builds on this model for online state estimation and goal-probability assessment. Shot planning is cast as a chance-constrained stochastic optimal control problem, with the action space reduced via a shooting angle. An energy-based model, built from offline data and queried online by warm-started sampling, enables real-time, multimodal decision making. The approach outperforms control-based and learning-based baselines in both simulated and real-world air hockey, transfers robustly from simulation to the real system, and allows its behavior to be adjusted through objective weights and constraints. Limitations include the need for low-dimensional task spaces and a reliance on priors; suggested future work covers higher-dimensional spaces and additional priors for data-efficient learning.
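To make the mode-switched prediction and goal-probability ideas more concrete, the following is a minimal sketch, not the authors' implementation. It propagates a Gaussian belief over the puck's lateral position and velocity through two hand-picked linear-Gaussian modes and then evaluates the probability that the puck lies within a goal opening. All matrices, the time step, the wall position, and the function names (`select_mode`, `predict`, `goal_probability`, `satisfies_chance_constraint`) are assumptions made for illustration; in the paper the mode parameters are learned from data, and the filter also includes a measurement update that is omitted here. Selecting the mode from the belief mean is a further simplification.

```python
import math

DT = 0.01      # time step in seconds (assumed)
WALL_Y = 0.5   # lateral wall position in meters (assumed)

# Hypothetical mode parameters; the paper learns these from data.
MODES = {
    "floating": {"A": [[1.0, DT], [0.0, 0.999]],
                 "Q": [[1e-8, 0.0], [0.0, 1e-6]]},
    "puck_wall": {"A": [[1.0, 0.0], [0.0, -0.8]],
                  "Q": [[1e-6, 0.0], [0.0, 1e-3]]},
}


def matmul(a, b):
    """Multiply two small matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def select_mode(mean):
    """Pick the active mode from the belief mean (a simplification)."""
    y, vy = mean
    if abs(y) >= WALL_Y and y * vy > 0.0:
        return "puck_wall"
    return "floating"


def predict(mean, cov):
    """One Kalman prediction step: mean' = A mean, cov' = A cov A^T + Q."""
    params = MODES[select_mode(mean)]
    A, Q = params["A"], params["Q"]
    new_mean = [A[0][0] * mean[0] + A[0][1] * mean[1],
                A[1][0] * mean[0] + A[1][1] * mean[1]]
    apat = matmul(matmul(A, cov), transpose(A))
    new_cov = [[apat[i][j] + Q[i][j] for j in range(2)] for i in range(2)]
    return new_mean, new_cov


def goal_probability(mean_y, var_y, goal_half_width):
    """Probability that a Gaussian lateral position lies inside the goal."""
    std = math.sqrt(var_y)

    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mean_y) / (std * math.sqrt(2.0))))

    return cdf(goal_half_width) - cdf(-goal_half_width)


def satisfies_chance_constraint(probability, epsilon):
    """Chance constraint of the form P(goal) >= 1 - epsilon."""
    return probability >= 1.0 - epsilon


if __name__ == "__main__":
    mean, cov = [0.0, 2.0], [[1e-4, 0.0], [0.0, 1e-2]]
    for _ in range(60):
        mean, cov = predict(mean, cov)
    p = goal_probability(mean[0], cov[0][0], goal_half_width=0.125)
    print(mean, p, satisfies_chance_constraint(p, epsilon=0.1))
```

In this sketch the wall mode flips and damps the lateral velocity while injecting more process noise than the floating mode, which reflects the intuition that contacts increase uncertainty about the puck state.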
Abstract
Robot control through contact is challenging as it requires reasoning over long horizons and discontinuous system dynamics. Highly dynamic tasks such as Air Hockey additionally require agile behavior, making the corresponding optimal control problems intractable for planning in real time. Learning-based approaches address this issue by shifting computationally expensive reasoning through contacts to an offline learning phase. However, learning low-level motor policies subject to kinematic and dynamic constraints can be challenging if operating in proximity to such constraints is desired. This paper explores the combination of distilling a stochastic optimal control policy for high-level contact planning and online model-predictive control for low-level constrained motion planning. Our system learns to balance shooting accuracy and resulting puck speed by leveraging bank shots and the robot's kinematic structure. We show that the proposed framework outperforms purely control-based and purely learning-based techniques in both simulated and real-world games of Robot Air Hockey.
