Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting
Nan Wang, Yuantao Chen, Lixing Xiao, Weiqing Xiao, Bohan Li, Zhaoxi Chen, Chongjie Ye, Shaocong Xu, Saining Zhang, Ziyang Yan, Pierre Merriaux, Lei Lei, Tianfan Xue, Hao Zhao
TL;DR
This work tackles photometric inconsistency in driving-scene neural rendering (NeRF and Gaussian Splatting) by unifying global appearance codes with pixel-wise bilateral grids in a single three-level, multi-scale bilateral grid. A photometric enhancement operator $\mathcal{E}(\cdot)$ is optimized jointly with a Gaussian Splatting-based scene graph that models static background, sky, and dynamic elements, with deformable agents handled by a dedicated deformation network. The core contribution is the coarse-to-fine, scale-aware transformation $\bar{A}$ learned through the three-level grid, which enables patch-wise corrections while preserving global consistency. Experiments on Waymo, NuScenes, Argoverse, and PandaSet show substantial geometric improvements (lower Chamfer Distance, RMSE, and depth error) together with competitive appearance quality, which matters for autonomous driving, where accurate geometry supports obstacle avoidance and control.
Abstract
Neural rendering techniques, including NeRF and Gaussian Splatting (GS), rely on photometric consistency to produce high-quality reconstructions. However, in real-world scenarios, it is challenging to guarantee perfect photometric consistency in acquired images. Appearance codes have been widely used to address this issue, but their modeling capability is limited, as a single code is applied to the entire image. Recently, the bilateral grid was introduced to perform pixel-wise color mapping, but it is difficult to optimize and constrain effectively. In this paper, we propose a novel multi-scale bilateral grid that unifies appearance codes and bilateral grids. We demonstrate that this approach significantly improves geometric accuracy in dynamic, decoupled autonomous driving scene reconstruction, outperforming both appearance codes and bilateral grids. This is crucial for autonomous driving, where accurate geometry is important for obstacle avoidance and control. Our method shows strong results across four datasets: Waymo, NuScenes, Argoverse, and PandaSet. We further demonstrate that the improvement in geometry is driven by the multi-scale bilateral grid, which effectively reduces floaters caused by photometric inconsistency.
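To make the relationship between the two tools concrete, the following NumPy sketch treats an appearance code as the coarsest level of a bilateral grid: a 1x1x1 grid holds one color transform for the whole image, while finer grids hold one transform per spatial and luminance cell. This is an illustration under stated assumptions, not the authors' implementation. The 3x4 affine parameterization, the grid resolutions, the nearest-cell lookup (in place of trilinear interpolation), the sequential coarse-to-fine composition, and the helper names `identity_grid`, `slice_grid`, `apply_affine`, and `enhance` are all hypothetical choices made for brevity.

```python
import numpy as np

def identity_grid(gh, gw, gd):
    """Grid of 3x4 affine color transforms, initialised to identity."""
    grid = np.zeros((gh, gw, gd, 3, 4))
    grid[..., :3, :3] = np.eye(3)
    return grid

def slice_grid(grid, image):
    """Pick one 3x4 affine per pixel (nearest cell; an assumption for brevity)."""
    h, w, _ = image.shape
    gh, gw, gd = grid.shape[:3]
    # Luminance serves as the guidance (third grid axis).
    luma = image @ np.array([0.299, 0.587, 0.114])
    ys = np.minimum((np.arange(h) * gh) // h, gh - 1)
    xs = np.minimum((np.arange(w) * gw) // w, gw - 1)
    zs = np.clip((luma * gd).astype(int), 0, gd - 1)
    return grid[ys[:, None], xs[None, :], zs]  # shape (H, W, 3, 4)

def apply_affine(affine, image):
    """Apply a per-pixel 3x4 affine to RGB values in homogeneous form."""
    ones = np.ones_like(image[..., :1])
    homog = np.concatenate([image, ones], axis=-1)  # shape (H, W, 4)
    return np.einsum("hwij,hwj->hwi", affine, homog)

def enhance(image, grids):
    """Apply the grids in order, coarse to fine (assumed composition)."""
    out = image
    for grid in grids:
        out = apply_affine(slice_grid(grid, out), out)
    return out

rng = np.random.default_rng(0)
image = rng.random((8, 12, 3))

# Coarsest level: a single cell, i.e. one global transform per image,
# which plays the role of an appearance code. Here it is a global gain.
coarse = identity_grid(1, 1, 1)
coarse[..., :3, :3] *= 1.2

# Finer levels: left at identity here; in training they would learn
# patch-wise residual corrections.
mid = identity_grid(2, 2, 2)
fine = identity_grid(4, 4, 4)

out = enhance(image, [coarse, mid, fine])
```

With the finer levels at identity, the result is exactly the global gain, which is the appearance-code behavior. Changing a single cell of `fine` alters only the pixels that fall into that cell, which is the patch-wise behavior that a single per-image code cannot express.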
