Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting

Nan Wang, Yuantao Chen, Lixing Xiao, Weiqing Xiao, Bohan Li, Zhaoxi Chen, Chongjie Ye, Shaocong Xu, Saining Zhang, Ziyang Yan, Pierre Merriaux, Lei Lei, Tianfan Xue, Hao Zhao

TL;DR

This work tackles photometric inconsistency in driving-scene neural rendering (NeRF/GS) by unifying global appearance codes with pixel-wise bilateral grids in a three-level multi-scale architecture. It introduces a joint optimization framework with a photometric enhancement operator $\mathcal{E}(\cdot)$ and a Gaussian Splatting-based scene graph that handles static, sky, and dynamic elements, with deformable agents modeled by a dedicated deformation network. The core contribution is a coarse-to-fine, scale-aware transformation $\bar{A}$ learned through the three-level bilateral grid, which enables patch-wise corrections while preserving global consistency. Experiments on Waymo, NuScenes, Argoverse, and PandaSet show substantial geometric improvements (lower Chamfer Distance, RMSE, and depth error) and competitive appearance quality, both of which matter for autonomous driving perception and planning.

Abstract

Neural rendering techniques, including NeRF and Gaussian Splatting (GS), rely on photometric consistency to produce high-quality reconstructions. However, in real-world scenarios, it is challenging to guarantee perfect photometric consistency in acquired images. Appearance codes have been widely used to address this issue, but their modeling capability is limited, as a single code is applied to the entire image. Recently, the bilateral grid was introduced to perform pixel-wise color mapping, but it is difficult to optimize and constrain effectively. In this paper, we propose a novel multi-scale bilateral grid that unifies appearance codes and bilateral grids. We demonstrate that this approach significantly improves geometric accuracy in dynamic, decoupled autonomous driving scene reconstruction, outperforming both appearance codes and bilateral grids. This is crucial for autonomous driving, where accurate geometry is important for obstacle avoidance and control. Our method shows strong results across four datasets: Waymo, NuScenes, Argoverse, and PandaSet. We further demonstrate that the improvement in geometry is driven by the multi-scale bilateral grid, which effectively reduces floaters caused by photometric inconsistency.

Paper Structure

This paper contains 33 sections, 16 equations, 11 figures, 19 tables.

Figures (11)

  • Figure 1: Unifying appearance codes and bilateral grids. (a) Appearance codes rely on global affine transformations and therefore have limited modeling capability. (b) Bilateral grids perform pixel-wise transformations, which improves photometric consistency, but they are challenging to optimize. (c) The proposed multi-scale bilateral grid unifies both paradigms, enabling patch-wise transformations.
  • Figure 2: Overview of our method. We unify appearance codes with multi-scale bilateral grids. A coarse rendering from the Gaussian scene graph is refined by multi-scale bilateral grids to model per-pixel color with a luminance-guided slice-and-fuse pipeline.
  • Figure 3: Qualitative comparison across datasets. Our method versus baselines on Waymo, NuScenes, Argoverse, and PandaSet.
  • Figure 4: Photometric consistency improves geometry. Our framework (c) outperforms appearance codes (a) and single bilateral grids (b) by addressing optimization challenges and enhancing geometric modeling. This yields lower Chamfer Distance and fewer floaters in dynamic, decoupled driving scenes. Yellow indicates high LiDAR error; purple indicates low LiDAR error.
  • Figure 5: Ablation visualizations. Effects of circle regularization and adaptive total variation losses.
  • ...and 6 more figures
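
The captions of Figures 1 and 2 describe a coarse rendering refined by bilateral grids at several scales, guided by luminance. The following is a minimal NumPy sketch of that idea, not the authors' implementation. Several details are assumptions made for illustration: the grid resolutions, the nearest-cell lookup (bilateral grids are usually sliced with trilinear interpolation), the Rec. 601 luminance weights, and the fusing of levels by applying them one after another from coarse to fine. The function names `identity_grid`, `slice_and_apply`, and `multiscale_apply` are hypothetical.

```python
import numpy as np

def identity_grid(gh, gw, gd):
    """Grid of 3x4 affine color transforms, initialised to the identity map."""
    grid = np.zeros((gh, gw, gd, 3, 4), dtype=np.float64)
    grid[..., :3, :3] = np.eye(3)
    return grid

def luminance(img):
    """Per-pixel guidance value; Rec. 601 weights are an assumption here."""
    return img @ np.array([0.299, 0.587, 0.114])

def slice_and_apply(grid, img):
    """Look up one affine transform per pixel and apply it to that pixel.

    The lookup is indexed by pixel position (y, x) and by luminance.
    Nearest-cell lookup is a simplification of trilinear slicing.
    """
    gh, gw, gd = grid.shape[:3]
    h, w, _ = img.shape
    ys = np.minimum((np.arange(h) * gh) // h, gh - 1)            # (H,)
    xs = np.minimum((np.arange(w) * gw) // w, gw - 1)            # (W,)
    zs = np.clip((luminance(img) * gd).astype(int), 0, gd - 1)   # (H, W)
    affine = grid[ys[:, None], xs[None, :], zs]                  # (H, W, 3, 4)
    homog = np.concatenate([img, np.ones((h, w, 1))], axis=-1)   # (H, W, 4)
    return np.einsum("hwij,hwj->hwi", affine, homog)

def multiscale_apply(grids, img):
    """Apply grids from coarse to fine (sequential fusing is an assumption)."""
    out = img
    for grid in grids:
        out = slice_and_apply(grid, out)
    return out

# Hypothetical three-level setup: a 1x1x1 grid acts as a single global affine
# transform (like an appearance code); finer grids give patch-wise corrections.
rng = np.random.default_rng(0)
image = rng.random((8, 12, 3))
levels = [identity_grid(1, 1, 1), identity_grid(2, 2, 2), identity_grid(4, 4, 4)]
corrected = multiscale_apply(levels, image)
```

With a 1x1x1 grid, every pixel receives the same 3x4 matrix, so the coarsest level behaves like the global affine transformation of panel (a) in Figure 1. Finer grids let the transform vary by image region and brightness, which is the patch-wise behavior of panel (c).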