ComGS: Efficient 3D Object-Scene Composition via Surface Octahedral Probes

Jian Gao; Mengqi Yuan; Yifei Zeng; Chang Zeng; Zhihao Li; Zhenyu Chen; Weichao Qiu; Xiao-Xiao Long; Hao Zhu; Xun Cao; Yao Yao

TL;DR

This paper tackles realistic 3D object-scene composition in Gaussian Splatting by splitting the problem into relightable object reconstruction and scene lighting estimation. It introduces Surface Octahedral Probes (SOPs), which store indirect lighting and occlusion and are queried by interpolation, so shading does not need per-iteration ray tracing. Lighting estimation is reduced to completing the environment map at the object's placement site: a 360° capture of the reconstructed radiance field at that location is completed by a fine-tuned diffusion model, which yields coherent shadows in complex scenes. The resulting framework, ComGS, renders at around 28 FPS and requires about 36 seconds for editing. Validated on SynCom and real-world captures, it produces more harmonious and visually realistic results than prior approaches, and SOPs give at least a 2x speedup in reconstruction.

Abstract

Gaussian Splatting (GS) enables immersive rendering, but realistic 3D object-scene composition remains challenging. Baked appearance and shadow information in GS radiance fields cause inconsistencies when combining objects and scenes. Addressing this requires relightable object reconstruction and scene lighting estimation. For relightable object reconstruction, existing Gaussian-based inverse rendering methods often rely on ray tracing, leading to low efficiency. We introduce Surface Octahedral Probes (SOPs), which store lighting and occlusion information and allow efficient 3D querying via interpolation, avoiding expensive ray tracing. SOPs provide at least a 2x speedup in reconstruction and enable real-time shadow computation in Gaussian scenes. For lighting estimation, existing Gaussian-based inverse rendering methods struggle to model intricate light transport and often fail in complex scenes, while learning-based methods predict lighting from a single image and are viewpoint-sensitive. We observe that 3D object-scene composition primarily concerns the object's appearance and nearby shadows. Thus, we simplify the challenging task of full scene lighting estimation by focusing on the environment lighting at the object's placement location. Specifically, we capture a 360° view of the reconstructed radiance field of the scene at that location and fine-tune a diffusion model to complete the lighting. Building on these advances, we propose ComGS, a novel 3D object-scene composition framework. Our method achieves high-quality, real-time rendering at around 28 FPS, produces visually harmonious results with vivid shadows, and requires only 36 seconds for editing. Code and dataset are available at https://nju-3dv.github.io/projects/ComGS/.
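
To make the probe idea concrete, here is a minimal sketch of the standard octahedral mapping, which flattens a unit direction onto a square so that directional data can be stored in a small 2D texture, followed by an interpolated lookup across several probes. This is not the authors' implementation: the function names, the scalar texels, the nearest-texel lookup, and the inverse-distance blend are all assumptions made for illustration, and the abstract does not specify the interpolation scheme ComGS uses.

```python
import math

def _sign(v):
    # Treat 0 as positive so the fold below stays well-defined on the axes.
    return 1.0 if v >= 0.0 else -1.0

def oct_encode(d):
    """Map a nonzero direction (x, y, z) to octahedral coords (u, v) in [-1, 1]^2."""
    x, y, z = d
    s = abs(x) + abs(y) + abs(z)          # L1 norm projects onto the octahedron
    u, v = x / s, y / s
    if z < 0.0:                            # fold the lower hemisphere outward
        u, v = (1.0 - abs(v)) * _sign(u), (1.0 - abs(u)) * _sign(v)
    return u, v

def oct_decode(u, v):
    """Inverse of oct_encode: recover a unit direction from (u, v)."""
    z = 1.0 - abs(u) - abs(v)
    if z < 0.0:                            # undo the fold
        u, v = (1.0 - abs(v)) * _sign(u), (1.0 - abs(u)) * _sign(v)
    n = math.sqrt(u * u + v * v + z * z)
    return (u / n, v / n, z / n)

def texel_index(d, res):
    """Nearest texel (row, col) of a res x res probe texture for direction d."""
    u, v = oct_encode(d)
    col = min(res - 1, int((u * 0.5 + 0.5) * res))
    row = min(res - 1, int((v * 0.5 + 0.5) * res))
    return row, col

def query_probes(probes, point, direction, res, eps=1e-6):
    """Blend one texel across probes with inverse-distance weights (assumed scheme).

    `probes` is a list of (position, texels) pairs, where `texels` is a
    res x res nested list of scalars (hypothetical layout).
    """
    row, col = texel_index(direction, res)
    total_w, total = 0.0, 0.0
    for position, texels in probes:
        w = 1.0 / (math.dist(point, position) + eps)
        total_w += w
        total += w * texels[row][col]
    return total / total_w
```

The point of the sketch is the cost profile: a query is a fixed number of texture reads and a weighted sum, with no ray traversal through the scene, which is the property the abstract attributes to SOPs.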
