FreeVS: Generative View Synthesis on Free Driving Trajectory

Qitai Wang, Lue Fan, Yuqi Wang, Yuntao Chen, Zhaoxiang Zhang

TL;DR

This paper tackles the limitation of reconstruction-based novel view synthesis in driving scenes, where views outside the ego trajectory lack training data. It introduces FreeVS, a fully generative diffusion-based framework that uses pseudo-image priors derived from colored LiDAR projections to encode appearance, geometry, and pose, enabling 3D-coherent view synthesis on free trajectories and missing views. A viewpoint transformation simulation augments training for unseen viewpoint shifts, and two driving-specific benchmarks—novel camera synthesis and novel trajectory synthesis—assess performance beyond recorded paths. Experiments on the Waymo Open Dataset show FreeVS outperforms prior NVS methods across both traditional novel-frame synthesis and the new benchmarks, while eschewing costly per-sequence reconstruction and offering strong 3D consistency.

Abstract

Existing reconstruction-based novel view synthesis methods for driving scenes focus on synthesizing camera views along the recorded trajectory of the ego vehicle. Their image rendering performance degrades severely on viewpoints that fall outside the recorded trajectory, where camera rays are untrained. We propose FreeVS, a novel fully generative approach that can synthesize camera views on free new trajectories in real driving scenes. To keep the generated results 3D-consistent with the real scenes and accurate in viewpoint pose, we propose a pseudo-image representation of view priors to control the generation process. Viewpoint transformation simulation is applied to pseudo-images to simulate camera movement in each direction. Once trained, FreeVS can be applied to any validation sequence without a reconstruction process and can synthesize views on novel trajectories. Moreover, we propose two new challenging benchmarks tailored to driving scenes, novel camera synthesis and novel trajectory synthesis, which emphasize the freedom of viewpoints. Given that no ground truth images are available on novel trajectories, we also propose to evaluate the consistency of images synthesized on novel trajectories with 3D perception models. Experiments on the Waymo Open Dataset show that FreeVS achieves strong image synthesis performance on both recorded and novel trajectories. Project Page: https://freevs24.github.io/
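
The pseudo-image idea described above (colored LiDAR points projected into the target view) can be illustrated with a small sketch. This is not the authors' implementation: the function name `render_pseudo_image`, the pinhole intrinsics `fx, fy, cx, cy`, the nearest-point depth test, and the assumption that points are already expressed in the target camera frame (x right, y down, z forward) are all illustrative choices.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]   # (x, y, z) in the target camera frame
Color = Tuple[int, int, int]         # (r, g, b)


def render_pseudo_image(
    points_cam: Sequence[Point],
    colors: Sequence[Color],
    fx: float, fy: float, cx: float, cy: float,
    width: int, height: int,
) -> List[List[Color]]:
    """Project colored 3D points onto a sparse image with a pinhole model.

    Hypothetical helper: pixels hit by no point stay black, and when several
    points land on one pixel the nearest one (smallest z) wins.
    """
    image = [[(0, 0, 0) for _ in range(width)] for _ in range(height)]
    depth = [[math.inf] * width for _ in range(height)]
    for (x, y, z), color in zip(points_cam, colors):
        if z <= 0:
            continue  # point is behind the camera
        u = math.floor(fx * x / z + cx)  # pixel column
        v = math.floor(fy * y / z + cy)  # pixel row
        if 0 <= u < width and 0 <= v < height and z < depth[v][u]:
            depth[v][u] = z
            image[v][u] = color
    return image
```

Because the sparse image already reflects the target pose and scene geometry, a generative model conditioned on it receives appearance, geometry, and pose in a single image-like input, which is the role the abstract assigns to pseudo-images.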

Paper Structure

This paper contains 23 sections, 1 equation, 9 figures, 5 tables.

Figures (9)

  • Figure 1: Synthesis results comparison on the Waymo Open Dataset (Sun et al., 2020). We show the camera views synthesized by NVS methods on the original front view (first row), a viewpoint 1.0 m to the right (second row), and a viewpoint 1.0 m above (third row). Our method significantly outperforms previous NVS methods on viewpoints outside the existing ego trajectory.
  • Figure 2: Method pipeline of FreeVS. We propose to encode the view priors in driving scenes including appearance, 3D geometry, and pose of target viewpoints all in one modality as the pseudo-images. Best viewed in color. The diffusion model is trained to synthesize target views solely based on the unified pseudo-image priors.
  • Figure 3: Benchmarks for evaluating NVS methods in driving scenes. We summarize the previous NVS evaluation benchmarks for driving scenes as (a) and (b). We propose two novel evaluation benchmarks: the novel camera synthesis benchmark as in (c), and the novel trajectory synthesis benchmark as in (d). Best viewed in color.
  • Figure 4: Visualization comparison on novel-camera synthesis benchmark. We show the front-side camera views synthesized from front and side camera views with NVS methods.
  • Figure 5: Visualization comparison on novel trajectories. We show the camera views synthesized by NVS methods on the original training viewpoint, viewpoint 2.0 m left of the original viewpoint, and viewpoint 2.0 m right of the original viewpoint.
  • ...and 4 more figures
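
The shifted viewpoints in Figures 1 and 5 (for example 1.0 m to the right or 2.0 m to the left of the recorded camera) can be illustrated with a minimal sketch of how a pure camera translation changes point coordinates. This is an assumption-laden illustration, not the paper's viewpoint transformation simulation: `shift_viewpoint` is a hypothetical helper, rotation is ignored, and the camera frame is assumed to have x pointing right, y down, and z forward.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in the recorded camera frame


def shift_viewpoint(points_cam: List[Point], offset: Point) -> List[Point]:
    """Re-express points in the frame of a camera translated by `offset`.

    `offset` is the new camera position given in the recorded camera frame
    (hypothetical convention: +x right, +y down, +z forward). With no
    rotation, each point simply moves by the opposite of the camera motion.
    """
    ox, oy, oz = offset
    return [(x - ox, y - oy, z - oz) for x, y, z in points_cam]


# Move the camera 2.0 m to the left: the scene appears 2.0 m further right.
recorded = [(0.0, 0.0, 5.0), (1.0, -0.5, 10.0)]
shifted_left = shift_viewpoint(recorded, (-2.0, 0.0, 0.0))
```

Projecting the shifted points into the image plane would then give the view prior for the new viewpoint, for which no recorded ground truth image exists.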