XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis
Hao Li, Chenming Wu, Ming Yuan, Yan Zhang, Chen Zhao, Chunyu Song, Haocheng Feng, Errui Ding, Dingwen Zhang, Jingdong Wang
TL;DR
XLD introduces a cross-lane driving dataset and benchmark to evaluate novel view synthesis under closed-loop autonomous driving conditions, addressing the gap where existing NVS benchmarks only interpolate within training trajectories. Built in CARLA, the dataset provides six scenes with cross-trajectory testing offsets of $0\mathrm{m}$ to $4\mathrm{m}$, three RGB cameras, and LiDAR, enabling front-only and multi-camera evaluation in diverse weather. Benchmark results show that NeRF-based methods generally outperform 3D Gaussian Splatting in cross-lane scenarios: EmerNeRF and UC-NeRF deliver robust cross-lane performance, while Gaussian baselines can overfit to the training trajectory and degrade as offsets grow. Multi-camera training improves cross-lane fidelity, and precise geometry is key to realistic cross-lane rendering. PVG's self-decomposition improves background rendering but can still falter at larger offsets, highlighting the ongoing need for geometry-aware, cross-trajectory NVS methods. The work provides a valuable platform for advancing NVS toward realistic closed-loop autonomous driving simulation and motivates future real-world cross-lane ground-truth datasets to validate these synthetic benchmarks.
Abstract
Comprehensive testing of autonomous systems through simulation is essential to ensure the safety of autonomous driving vehicles. This requires generating safety-critical scenarios that extend beyond the limitations of real-world data collection, as many such scenarios are rarely encountered on public roads. However, most existing novel view synthesis (NVS) methods are evaluated by sporadically sampling image frames from the training data and comparing the rendered images with ground-truth images. This evaluation protocol falls short of the actual requirements of closed-loop simulation: the true application demands rendering novel views that extend beyond the original trajectory (such as cross-lane views), which are challenging to capture in the real world. To address this gap, this paper presents a synthetic dataset for novel driving view synthesis evaluation, designed specifically for autonomous driving simulation. This unique dataset includes testing images captured by deviating from the training trajectory by $1$-$4$ meters. It comprises six sequences covering various times of day and weather conditions; each sequence contains $450$ training images, $120$ testing images, and the corresponding camera poses and intrinsic parameters. Leveraging this novel dataset, we establish the first realistic benchmark for evaluating existing NVS approaches under front-only and multi-camera settings. The experimental findings reveal a significant gap in current approaches: they are not yet able to fulfill the demanding prerequisites of cross-lane or closed-loop simulation.
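The core idea of the evaluation protocol, rendering from test poses displaced laterally from the training trajectory, can be illustrated with a small sketch. This is not the paper's actual data-generation code; the function name `lateral_offset_pose` and the OpenCV-style camera convention (x right, y down, z forward) are illustrative assumptions about how a $4\times 4$ camera-to-world pose could be shifted sideways by the benchmark's $1$-$4$ meter offsets.

```python
import numpy as np

def lateral_offset_pose(c2w: np.ndarray, offset_m: float) -> np.ndarray:
    """Shift a 4x4 camera-to-world pose sideways by `offset_m` meters.

    Assumes OpenCV-style camera axes (x right, y down, z forward), so the
    lateral direction is the camera's x-axis expressed in world coordinates,
    i.e. the first column of the rotation block. This is an illustrative
    sketch, not the dataset's actual generation code.
    """
    shifted = c2w.copy()
    right_in_world = c2w[:3, 0]          # camera x-axis in the world frame
    shifted[:3, 3] += offset_m * right_in_world
    return shifted

# Identity pose at the origin: the camera x-axis coincides with world x.
pose = np.eye(4)
for d in (1.0, 2.0, 4.0):                # example cross-lane offsets (meters)
    test_pose = lateral_offset_pose(pose, d)
    print(test_pose[:3, 3])              # translation moves along world x
```

Because only the translation changes, the rotation (and hence the viewing direction) of each test camera matches its training counterpart; the difficulty comes entirely from the unseen lateral viewpoint.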
