FreeVS: Generative View Synthesis on Free Driving Trajectory
Qitai Wang, Lue Fan, Yuqi Wang, Yuntao Chen, Zhaoxiang Zhang
TL;DR
This paper tackles the limitation of reconstruction-based novel view synthesis in driving scenes, where views outside the ego trajectory lack training data. It introduces FreeVS, a fully generative diffusion-based framework that uses pseudo-image priors derived from colored LiDAR projections to encode appearance, geometry, and pose, enabling 3D-coherent view synthesis on free trajectories and missing views. A viewpoint transformation simulation augments training for unseen viewpoint shifts, and two driving-specific benchmarks—novel camera synthesis and novel trajectory synthesis—assess performance beyond recorded paths. Experiments on the Waymo Open Dataset show FreeVS outperforms prior NVS methods across both traditional novel-frame synthesis and the new benchmarks, while eschewing costly per-sequence reconstruction and offering strong 3D consistency.
Abstract
Existing reconstruction-based novel view synthesis methods for driving scenes focus on synthesizing camera views along the recorded trajectory of the ego vehicle. Their rendering quality degrades severely at viewpoints off the recorded trajectory, where camera rays are untrained. We propose FreeVS, a novel fully generative approach that can synthesize camera views on free new trajectories in real driving scenes. To keep the generated results 3D-consistent with the real scene and accurate in viewpoint pose, we propose a pseudo-image representation of view priors to control the generation process. Viewpoint transformation simulation is applied to pseudo-images to simulate camera movement in each direction. Once trained, FreeVS can be applied to any validation sequence without a reconstruction process and can synthesize views on novel trajectories. Moreover, we propose two new challenging benchmarks tailored to driving scenes, novel camera synthesis and novel trajectory synthesis, which emphasize the freedom of viewpoints. Given that no ground truth images are available on novel trajectories, we also propose to use 3D perception models to evaluate the consistency of images synthesized on novel trajectories. Experiments on the Waymo Open Dataset show that FreeVS has a strong image synthesis performance on both the recorded trajectories and novel trajectories. Project Page: https://freevs24.github.io/
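
To make the pseudo-image idea concrete, the following is a minimal sketch of how colored 3D points could be rasterized into a sparse image for a given camera pose, and how a shifted pose yields a pseudo-image for a viewpoint off the recorded trajectory. It is an illustration under stated assumptions, not the authors' implementation: it assumes a simple pinhole camera (x right, y down, z forward), a 4x4 world-to-camera matrix, and a nearest-point-wins depth test. The names `transform_point`, `shift_camera`, and `render_pseudo_image` are hypothetical.

```python
# Hypothetical sketch: rasterize colored 3D points into a sparse pseudo-image.
# Assumes a pinhole camera with x right, y down, z forward (not from the paper).

def transform_point(world_to_cam, point):
    """Apply a 4x4 row-major world-to-camera matrix to a 3D point."""
    x, y, z = point
    return tuple(
        row[0] * x + row[1] * y + row[2] * z + row[3]
        for row in world_to_cam[:3]
    )


def shift_camera(world_to_cam, dx=0.0, dy=0.0, dz=0.0):
    """Return the pose of a camera moved by (dx, dy, dz) in its own frame.

    Moving the camera by d makes every point appear displaced by -d.
    """
    shifted = [list(row) for row in world_to_cam]
    for i, d in enumerate((dx, dy, dz)):
        shifted[i][3] -= d
    return shifted


def render_pseudo_image(points, colors, world_to_cam,
                        fx, fy, cx, cy, width, height):
    """Project colored points; the nearest point wins at each pixel."""
    image = [[(0, 0, 0)] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]
    for point, color in zip(points, colors):
        xc, yc, zc = transform_point(world_to_cam, point)
        if zc <= 0:
            continue  # point is behind the camera
        u = int(round(fx * xc / zc + cx))
        v = int(round(fy * yc / zc + cy))
        if 0 <= u < width and 0 <= v < height and zc < depth[v][u]:
            depth[v][u] = zc
            image[v][u] = color
    return image


identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
points = [(0.0, 0.0, 5.0), (0.0, 0.0, 10.0)]   # two points on the optical axis
colors = [(255, 0, 0), (0, 0, 255)]            # red (near), blue (far)
intrinsics = dict(fx=100.0, fy=100.0, cx=32.0, cy=24.0, width=64, height=48)

# Pseudo-image at the recorded pose: the far point is hidden by the near one.
recorded = render_pseudo_image(points, colors, identity, **intrinsics)

# Pseudo-image after moving the camera 1 m to the right: both points appear,
# shifted left by different amounts (parallax).
novel = render_pseudo_image(points, colors,
                            shift_camera(identity, dx=1.0), **intrinsics)
```

In this toy setup the sparse image carries appearance (point colors), geometry (where the points land, and which one occludes the other), and pose (the matrix used for projection) in a single input, which is the role the abstract assigns to pseudo-images. How FreeVS accumulates and colors LiDAR points, and how it simulates viewpoint transformations during training, is specified in the paper rather than here.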
