Learning Diverse Robot Striking Motions with Diffusion Models and Kinematically Constrained Gradient Guidance
Kin Man Lee, Sean Ye, Qingyu Xiao, Zixuan Wu, Zulfiqar Zaidi, David B. D'Ambrosio, Pannag R. Sanketi, Matthew Gombolay
TL;DR
This work addresses three weaknesses of robot learning for agile striking tasks: sample inefficiency, difficulty learning from varied demonstrations, and the lack of a natural way to incorporate constraints. It introduces Kinematic Constraint Gradient Guidance (KCGG), an offline, diffusion-based imitation-learning approach. KCGG computes gradients through both the robot arm's forward kinematics $F(q)$ and the diffusion model to steer samples toward satisfying task constraints while keeping them within the training data distribution, which enables multimodal, high-speed motions from limited demonstrations. In simulated air hockey and real table tennis, both time-critical tasks, KCGG outperforms imitation learning baselines, with a 25.4% increase in block rate and a 17.3% increase in success rate, respectively. Because it is trained offline, the approach does not require a high-fidelity simulator, and it may apply to other dynamic manipulation tasks.
Abstract
Advances in robot learning have enabled robots to generate skills for a variety of tasks. Yet, robot learning is typically sample inefficient, struggles to learn from data sources exhibiting varied behaviors, and does not naturally incorporate constraints. Addressing these shortcomings is critical for fast, agile tasks such as playing table tennis. Modern techniques for learning from demonstration improve sample efficiency and scale to diverse data, but are rarely evaluated on agile tasks. In the case of reinforcement learning, achieving good performance requires training on high-fidelity simulators. To overcome these limitations, we develop a novel diffusion modeling approach that is offline, constraint-guided, and expressive of diverse agile behaviors. The key to our approach is a kinematic constraint gradient guidance (KCGG) technique that computes gradients through both the forward kinematics of the robot arm and the diffusion model to direct the sampling process. KCGG minimizes the cost of violating constraints while simultaneously keeping the sampled trajectory within the distribution of the training data. We demonstrate the effectiveness of our approach for time-critical robotic tasks by evaluating KCGG in two challenging domains: simulated air hockey and real table tennis. In simulated air hockey, we achieve a 25.4% increase in block rate, while in table tennis, we achieve a 17.3% increase in success rate compared to imitation learning baselines.
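To make the idea of gradient guidance through forward kinematics concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical two-link planar arm with unit link lengths, a hand-written Jacobian, and a placeholder `toy_denoise` function that stands in for a learned diffusion model by pulling samples toward a fixed "data mean". The sketch differentiates only through the forward kinematics; KCGG as described above also computes gradients through the diffusion model, which is omitted here. All names, constants, and the squared-distance cost are illustrative assumptions.

```python
import math

L1, L2 = 1.0, 1.0  # link lengths of a hypothetical 2-link planar arm


def forward_kinematics(q):
    """End-effector position F(q) for joint angles q = (q1, q2)."""
    q1, q2 = q
    x = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    y = L1 * math.sin(q1) + L2 * math.sin(q1 + q2)
    return (x, y)


def constraint_cost(q, target):
    """Squared distance between the end effector and a target position."""
    x, y = forward_kinematics(q)
    return (x - target[0]) ** 2 + (y - target[1]) ** 2


def constraint_gradient(q, target):
    """Gradient of the cost w.r.t. q: 2 * J(q)^T (F(q) - target)."""
    q1, q2 = q
    x, y = forward_kinematics(q)
    ex, ey = x - target[0], y - target[1]
    # Analytic Jacobian J(q) of the planar arm.
    j11 = -L1 * math.sin(q1) - L2 * math.sin(q1 + q2)
    j12 = -L2 * math.sin(q1 + q2)
    j21 = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    j22 = L2 * math.cos(q1 + q2)
    return (2 * (j11 * ex + j21 * ey), 2 * (j12 * ex + j22 * ey))


def toy_denoise(q, data_mean, pull=0.2):
    """Placeholder for a learned denoiser: move q toward the data mean."""
    return tuple(qi + pull * (mi - qi) for qi, mi in zip(q, data_mean))


def guided_sample(q_init, data_mean, target, steps=50, guidance_scale=0.05):
    """Alternate a denoising step with a constraint-gradient step."""
    q = q_init
    for _ in range(steps):
        q = toy_denoise(q, data_mean)           # stay close to the data
        g = constraint_gradient(q, target)      # gradient through F(q)
        q = tuple(qi - guidance_scale * gi for qi, gi in zip(q, g))
    return q


data_mean = (0.3, 0.8)   # assumed centre of the demonstration data
target = (0.5, 1.5)      # assumed end-effector constraint
q_init = (1.5, -0.5)     # arbitrary starting sample

unguided = guided_sample(q_init, data_mean, target, guidance_scale=0.0)
guided = guided_sample(q_init, data_mean, target, guidance_scale=0.05)

print("unguided cost:", constraint_cost(unguided, target))
print("guided cost:  ", constraint_cost(guided, target))
```

With `guidance_scale=0.0` the sample simply collapses onto the data mean, while a positive scale trades off closeness to the data against the constraint cost. In this toy setting the scale plays the role of a guidance weight; how it is chosen in practice is not specified in the text above.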
