Cut-and-Splat: Leveraging Gaussian Splatting for Synthetic Data Generation

Bram Vanherle; Brent Zoomers; Jeroen Put; Frank Van Reeth; Nick Michiels

TL;DR

This work tackles the domain-gap challenge in synthetic data for instance segmentation by introducing Cut-and-Splat, a pipeline that uses Gaussian Splatting to render foreground objects from a short video into contextually plausible backgrounds without textured 3D models. Foreground extraction, depth-guided placement on plausible background surfaces, and lighting augmentations produce realistic training images with corresponding bounding boxes and masks. Empirical evaluation on the IBSYD dataset shows that Cut-and-Splat outperforms Cut-and-Paste and diffusion-based data generation, with ablations confirming the importance of smart placement and appearance augmentation. The method provides a practical, automated route to high-quality, object-specific synthetic data, reducing annotation cost and enabling domain-specific training, while future work could address relighting, transparency, and multi-pose scenarios.

Abstract

Generating synthetic images is a useful method for cheaply obtaining labeled data for training computer vision models. However, it requires accurate 3D models of the relevant objects, and the resulting images often show a realism gap because lighting effects and camera artifacts are hard to simulate. We propose using the novel view synthesis method called Gaussian Splatting to address these challenges. We have developed a synthetic data pipeline for generating high-quality, context-aware instance segmentation training data for specific objects. This process is fully automated, requiring only a video of the target object. We train a Gaussian Splatting model of the target object and automatically extract the object from the video. Leveraging Gaussian Splatting, we then render the object onto a random background image, using monocular depth estimation to place the object in a believable pose. We introduce a novel dataset to validate our approach and show superior performance over other data generation approaches, such as Cut-and-Paste and diffusion model-based generation.
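
To make the final compositing step concrete, here is a minimal sketch of how a rendered object could be blended onto a background while the instance mask and bounding box are derived from the same alpha channel. This is not the authors' implementation: the function name `composite_with_labels`, its parameters, and the alpha threshold of 0.5 are assumptions for illustration. The sketch also assumes the object render (RGBA) and its placement are already given, so the Gaussian Splatting render and the depth-guided pose selection are not shown.

```python
import numpy as np


def composite_with_labels(background, render_rgba, top_left, alpha_threshold=0.5):
    """Alpha-blend an RGBA object render onto a background and derive labels.

    background:      (H, W, 3) float array with values in [0, 1]
    render_rgba:     (h, w, 4) float array with values in [0, 1]
    top_left:        (row, col) of the render's top-left corner in the background
    alpha_threshold: hypothetical cutoff deciding which pixels count as object
    """
    H, W = background.shape[:2]
    h, w = render_rgba.shape[:2]
    r0, c0 = top_left
    if r0 < 0 or c0 < 0 or r0 + h > H or c0 + w > W:
        raise ValueError("render must lie fully inside the background")

    rgb = render_rgba[..., :3]
    alpha = render_rgba[..., 3:4]  # keep the last axis so it broadcasts over RGB

    # Standard "over" blend: foreground weighted by alpha, background by 1 - alpha.
    image = background.copy()
    region = image[r0:r0 + h, c0:c0 + w]
    image[r0:r0 + h, c0:c0 + w] = alpha * rgb + (1.0 - alpha) * region

    # The instance mask comes straight from the render's alpha channel.
    mask = np.zeros((H, W), dtype=bool)
    mask[r0:r0 + h, c0:c0 + w] = alpha[..., 0] > alpha_threshold

    # Tight bounding box (x_min, y_min, x_max, y_max), inclusive pixel indices.
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        bbox = None
    else:
        bbox = (int(cols.min()), int(rows.min()), int(cols.max()), int(rows.max()))
    return image, mask, bbox
```

Because the labels are computed from the alpha channel of the render itself, the mask and box stay pixel-aligned with the composited object, which is what makes this style of synthetic data free to annotate.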
