PEGASUS: Physically Enhanced Gaussian Splatting Simulation System for 6DoF Object Pose Dataset Generation
Lukas Meyer, Floris Erich, Yusuke Yoshiyasu, Marc Stamminger, Noriaki Ando, Yukiyasu Domae
TL;DR
PEGASUS addresses the realism gap in synthetic 6DoF pose data by combining 3D Gaussian Splatting reconstructions of environments and objects with physics-based placement to generate diverse static and dynamic scenes. It renders RGB images, depth maps, semantic masks, and precise 6DoF poses in BOP format, enabling pose-estimation networks such as DOPE to be trained on synthetic data and transferred to real imagery. The authors introduce the Ramen dataset and PEGASET to demonstrate scalability with scanned environments and objects, and show that networks trained on PEGASUS data achieve synthetic-to-real transfer in robotic grasping tasks with a UR5. Overall, PEGASUS provides a modular framework for domain-specific dataset generation that can be extended with additional environments, diffusion-based augmentations, and LiDAR-informed 3DGS to further close the reality gap.
Abstract
We introduce the Physically Enhanced Gaussian Splatting Simulation System (PEGASUS) for 6DoF object pose dataset generation, a versatile dataset generator based on 3D Gaussian Splatting. Environment and object representations can be obtained easily by capturing images with commodity cameras and reconstructing them with Gaussian Splatting. PEGASUS allows the composition of new scenes by merging the underlying Gaussian Splatting point cloud of an environment with those of one or more objects. A physics engine simulates natural object placement within a scene through interaction between meshes extracted for the objects and the environment. Consequently, a large number of new scenes, static or dynamic, can be created by combining different environments and objects. By rendering scenes from various perspectives, diverse data points such as RGB images, depth maps, semantic masks, and 6DoF object poses can be extracted. Our study demonstrates that training on data generated by PEGASUS enables pose estimation networks to transfer successfully from synthetic data to real-world data. Moreover, we introduce the Ramen dataset, comprising 30 Japanese cup noodle items. This dataset includes spherical scans that capture images from both object hemispheres, together with the Gaussian Splatting reconstructions, making the items compatible with PEGASUS.
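The scene-composition step described above can be illustrated with a minimal sketch. It is not the PEGASUS implementation: the function names (`pose_matrix`, `transform_centers`, `compose_scene`) are hypothetical, the object pose is assumed to come from some physics simulation as a rotation matrix and a translation, and only the Gaussian centers are handled. A full merge would also have to rotate each Gaussian's covariance and view-dependent colour terms.

```python
import numpy as np

def pose_matrix(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

def transform_centers(centers, T):
    """Apply a rigid transform to an (N, 3) array of Gaussian centers."""
    return centers @ T[:3, :3].T + T[:3, 3]

def compose_scene(env_centers, obj_centers, T_world_obj):
    """Merge environment centers with object centers placed at a 6DoF pose.

    Returns the merged (N, 3) centers and a per-point label array
    (0 = environment, 1 = object), which could seed a semantic mask.
    """
    placed = transform_centers(obj_centers, T_world_obj)
    merged = np.vstack([env_centers, placed])
    labels = np.concatenate([
        np.zeros(len(env_centers), dtype=int),
        np.ones(len(placed), dtype=int),
    ])
    return merged, labels

# Hypothetical pose: 90 degree rotation about z, then a translation.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
T_world_obj = pose_matrix(R, np.array([1.0, 2.0, 3.0]))

env = np.zeros((2, 3))             # stand-in environment centers
obj = np.array([[1.0, 0.0, 0.0]])  # stand-in object center in object frame
merged, labels = compose_scene(env, obj, T_world_obj)
```

Because the same transform `T_world_obj` places the object and defines its pose in the world frame, the ground-truth 6DoF annotation is available by construction, with no separate labeling step.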
