SCOOP'D: Learning Mixed-Liquid-Solid Scooping via Sim2Real Generative Policy
Kuanning Wang, Yongchong Gu, Yuqian Fu, Zeyu Shangguan, Sicheng He, Xiangyang Xue, Yanwei Fu, Daniel Seita
TL;DR
Robotic scooping of mixtures containing liquids and solids is challenging due to complex tool-object interactions and deformable dynamics. The authors propose SCOOP'D, a Sim2Real framework that uses demonstrations generated in OmniGibson with privileged state information to train two diffusion-policy models: $f_\phi$ for the pre-scoop pose and $\pi_\theta$ for the scooping motions. These are complemented by a geometry network $g_\psi$ and perception modules for real-time object pose estimation. A large synthetic dataset, SimScoop, contains 6,480 demonstrations and enables zero-shot real-world deployment that generalizes across objects, liquids, occlusions, and containers, validated over 465 trials. This approach offers scalable, safe, and broadly applicable robotic scooping for assistive, cooking, and environmental-cleanup tasks without requiring real-world fine-tuning.
Abstract
Scooping items with tools such as spoons and ladles is common in daily life, with applications ranging from assistive feeding to retrieving items from environmental disaster sites. However, developing a general and autonomous robotic scooping policy is challenging since it requires reasoning about complex tool-object interactions. Furthermore, scooping often involves manipulating deformable objects, such as granular media or liquids, which is difficult because of their infinite-dimensional configuration spaces and complex dynamics. We propose a method, SCOOP'D, which collects scooping demonstrations in the OmniGibson simulator (built on NVIDIA Omniverse) using algorithmic procedures that rely on privileged state information. We then train diffusion-based generative policies to imitate these demonstrations from observational input. We directly apply the learned policy in diverse real-world scenarios, testing its performance on various item quantities, item characteristics, and container types. In zero-shot deployment, our method demonstrates promising results across 465 trials in diverse scenarios, including objects of different difficulty levels that we categorize as "Level 1" and "Level 2." SCOOP'D outperforms all baselines and ablations, suggesting that this is a promising approach to acquiring robotic scooping skills. The project page is at https://scoopdiff.github.io/.
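
To make the phrase "generative policies via diffusion" concrete, the following is a minimal sketch of generic DDPM-style ancestral sampling of an action sequence conditioned on an observation. It is not the SCOOP'D implementation: the noise schedule, the function names, the action shape, and the straight-line target are all assumptions chosen for illustration, and the trained noise-prediction network is replaced by an analytic stand-in that is exact only for a toy distribution with a single target trajectory.

```python
import numpy as np

def make_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linear noise schedule (assumed values): betas, alphas, cumulative products."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def straight_line_target(obs, horizon):
    """Hypothetical 'expert' trajectory: waypoints from the origin to obs."""
    fractions = np.linspace(0.0, 1.0, horizon)[:, None]
    return fractions * obs[None, :]

def make_point_mass_eps_model(target_fn, alpha_bars):
    """Stand-in for a trained noise predictor.

    If the data distribution is a point mass at mu = target_fn(obs), then
    x_t = sqrt(abar_t) * mu + sqrt(1 - abar_t) * eps, so the noise can be
    recovered in closed form. A real policy would use a neural network here.
    """
    def eps_model(x, t, obs):
        mu = target_fn(obs)
        return (x - np.sqrt(alpha_bars[t]) * mu) / np.sqrt(1.0 - alpha_bars[t])
    return eps_model

def sample_actions(eps_model, obs, shape, schedule, seed=0):
    """DDPM ancestral sampling: start from Gaussian noise and denoise step by step."""
    betas, alphas, alpha_bars = schedule
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)
    for t in reversed(range(len(betas))):
        eps = eps_model(x, t, obs)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(shape)
        else:
            x = mean  # no noise is added at the final step
    return x

horizon, action_dim = 8, 3
obs = np.array([0.3, -0.1, 0.05])  # hypothetical observed object position
schedule = make_schedule(50)
eps_model = make_point_mass_eps_model(
    lambda o: straight_line_target(o, horizon), schedule[2]
)
actions = sample_actions(eps_model, obs, (horizon, action_dim), schedule)
```

With this analytic stand-in, the sampled `actions` coincide with the straight-line target up to floating-point error, which is a convenient sanity check on the sampling loop before a learned noise predictor is substituted.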
