GRITS: A Spillage-Aware Guided Diffusion Policy for Robot Food Scooping Tasks
Yen-Ling Tai, Yi-Ru Yang, Kuan-Ting Yu, Yu-Wei Chao, Yi-Ting Chen
TL;DR
GRITS addresses spillage in robotic food scooping by integrating a spillage predictor into a guided diffusion policy, enabling test-time trajectory refinement with a differentiable safety objective. The spillage predictor is trained in simulation over four primitive shapes with varied properties, and operates on segmented point clouds processed by a DP3 encoder to bridge the sim-to-real gap. Inference uses gradient-based guidance with an objective $J$ defined from $P_{\text{spillage}}$, together with a guidance weight $\rho$ and a guidance schedule chosen to balance safety and task success. Real-world experiments across six training foods and ten unseen categories achieve 82% task success and 4% spillage, reducing spillage by over 40% relative to baselines without guidance, demonstrating robust generalization and practical viability.
Abstract
Robotic food scooping is a critical manipulation skill for food preparation and service robots. However, existing robot learning algorithms, especially learn-from-demonstration methods, still struggle to handle diverse and dynamic food states, which often results in spillage and reduced reliability. In this work, we introduce GRITS: A Spillage-Aware Guided Diffusion Policy for Robot Food Scooping Tasks. This framework leverages a guided diffusion policy to minimize food spillage during scooping and to ensure reliable transfer of food items from the initial location to the target location. Specifically, we design a spillage predictor that estimates the probability of spillage given the current observation and an action rollout. The predictor is trained on a simulated dataset of food spillage scenarios, constructed from four primitive shapes (spheres, cubes, cones, and cylinders) with varied physical properties such as mass, friction, and particle size. At inference time, the predictor serves as a differentiable guidance signal, steering the diffusion sampling process toward safer trajectories while preserving task success. We validate GRITS on a real-world robotic food scooping platform. GRITS is trained on six food categories and evaluated on ten unseen categories with different shapes and quantities. GRITS achieves an 82% task success rate and a 4% spillage rate, reducing spillage by over 40% compared to baselines without guidance, thereby demonstrating its effectiveness.
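To make the idea of predictor-guided sampling concrete, the following is a minimal toy sketch and not the GRITS implementation. It assumes the objective is simply $J = P_{\text{spillage}}$, replaces the learned spillage predictor with a hypothetical logistic function of the action rollout, and replaces the learned denoiser with a placeholder step that pulls the sample toward a demonstration trajectory. All function names, shapes, and constants (`spillage_prob`, `toy_denoise_step`, `pull`, the value of `rho`) are illustrative assumptions; the observation input and any guidance schedule are omitted.

```python
import numpy as np


def spillage_prob(actions, w, b):
    """Toy logistic stand-in for a spillage predictor (hypothetical form)."""
    z = float(np.sum(w * actions) + b)
    return 1.0 / (1.0 + np.exp(-z))


def spillage_grad(actions, w, b):
    """Analytic gradient of the toy objective J = P_spillage w.r.t. actions."""
    p = spillage_prob(actions, w, b)
    return p * (1.0 - p) * w


def toy_denoise_step(actions, demo, pull=0.3):
    """Placeholder for one learned denoising step: move toward a demo trajectory."""
    return actions + pull * (demo - actions)


def guided_sample(demo, w, b, rho, n_steps=20, seed=0):
    """Alternate a denoising step with a gradient step that lowers J."""
    rng = np.random.default_rng(seed)
    actions = rng.standard_normal(demo.shape)  # start from Gaussian noise
    for _ in range(n_steps):
        actions = toy_denoise_step(actions, demo)
        # Guidance: step against the gradient of J, scaled by the weight rho.
        actions = actions - rho * spillage_grad(actions, w, b)
    return actions


# Hypothetical setup: a horizon of 8 steps with 3-D actions.
demo = np.full((8, 3), 0.5)
w = np.full((8, 3), 0.2)
b = -1.0

unguided = guided_sample(demo, w, b, rho=0.0)  # rho = 0 disables guidance
guided = guided_sample(demo, w, b, rho=0.5)
p_unguided = spillage_prob(unguided, w, b)
p_guided = spillage_prob(guided, w, b)
print(p_unguided, p_guided)
```

In this toy setting, `rho=0.0` recovers the demonstration trajectory, while a positive `rho` shifts the sample away from it in the direction that lowers the predicted spillage probability. This illustrates why the guidance weight has to balance safety against staying close to the behavior the policy learned.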
