FreeSim: Toward Free-viewpoint Camera Simulation in Driving Scenes

Lue Fan, Hao Zhang, Qitai Wang, Hongsheng Li, Zhaoxiang Zhang

TL;DR

FreeSim tackles the problem of realistic off-trajectory camera rendering in driving scenes by converting pose-conditioned view generation into a generative image enhancement task and coupling it with a progressive reconstruction pipeline. It constructs a large matched training dataset by degrading on-trajectory renderings to imitate off-trajectory patterns and augments geometry with a sparse LiDAR cue, enabling stable diffusion-based enhancement conditioned on $I_d$ and $I_l$. A progressive reconstruction loop refines images while expanding viewpoints from near to far off-trajectory positions, with a post-enhancement stage to mitigate rolling shutter and generative randomness. Empirical results on the Waymo Open Dataset show FreeSim outperforming baselines in off-trajectory FID while maintaining competitive on-trajectory PSNR/SSIM, and ablations confirm the benefits of degradation strategies, LiDAR conditioning, and progressive expansion. The approach advances practical free-viewpoint camera simulation by enabling high-quality rendering beyond recorded trajectories, with implications for training and evaluating autonomous-driving systems, though it notes limitations in rolling shutter and generalization to non-Gaussian data sources.

Abstract

We propose FreeSim, a camera simulation method for autonomous driving. FreeSim emphasizes high-quality rendering from viewpoints beyond the recorded ego trajectories. At such viewpoints, previous methods degrade unacceptably because no training data is available for them. To address this data scarcity, we first propose a generative enhancement model with a matched data construction strategy. The resulting model can generate high-quality images at a viewpoint that deviates slightly from the recorded trajectories, conditioned on the degraded rendering of that viewpoint. We then propose a progressive reconstruction strategy, which gradually adds generated images of unrecorded views into the reconstruction process, starting from slightly off-trajectory viewpoints and moving farther away. With this progressive generation-reconstruction pipeline, FreeSim supports high-quality off-trajectory view synthesis under large deviations of more than 3 meters.
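
The progressive generation-reconstruction idea described in the abstract can be illustrated with a minimal Python sketch. It only shows the control flow: fit a scene on the current views, render a degraded image slightly farther off-trajectory, enhance it, add it to the training set, and refit. Everything here is an assumption made for illustration: the names `expansion_schedule`, `progressive_reconstruction`, `fit`, `render`, and `enhance` are hypothetical, the 1-meter step is arbitrary, views are reduced to a single lateral offset, and the toy stand-ins at the bottom replace the actual Gaussian reconstruction and diffusion model.

```python
from typing import Callable, List, Tuple

# A "view" is reduced to (lateral offset in meters, image placeholder).
View = Tuple[float, str]


def expansion_schedule(max_offset_m: float, step_m: float) -> List[float]:
    """Offsets visited from near to far, e.g. [1.0, 2.0, 3.0] (hypothetical schedule)."""
    if step_m <= 0:
        raise ValueError("step_m must be positive")
    n_stages = int(round(max_offset_m / step_m))
    return [round(step_m * (i + 1), 6) for i in range(n_stages)]


def progressive_reconstruction(
    recorded: List[View],
    offsets: List[float],
    fit: Callable[[List[View]], object],
    render: Callable[[object, float], str],
    enhance: Callable[[str], str],
) -> Tuple[object, List[View]]:
    """Alternate generation and reconstruction while moving away from the trajectory."""
    training_views = list(recorded)
    scene = fit(training_views)               # reconstruct from recorded views only
    for offset in offsets:                    # near-to-far expansion
        degraded = render(scene, offset)      # degraded off-trajectory rendering
        enhanced = enhance(degraded)          # generative enhancement of that rendering
        training_views.append((offset, enhanced))
        scene = fit(training_views)           # refit with the newly generated view
    return scene, training_views


# Toy stand-ins (assumptions): the "scene" is just the sorted list of covered offsets.
def toy_fit(views: List[View]) -> List[float]:
    return sorted(offset for offset, _ in views)


def toy_render(scene: object, offset: float) -> str:
    return f"degraded@{offset}"


def toy_enhance(image: str) -> str:
    return image.replace("degraded", "enhanced")


offsets = expansion_schedule(max_offset_m=3.0, step_m=1.0)
scene, views = progressive_reconstruction(
    [(0.0, "recorded@0.0")], offsets, toy_fit, toy_render, toy_enhance
)
```

In this sketch each stage depends on the scene fitted in the previous stage, which mirrors the near-to-far ordering the abstract describes; how many stages are used and how far each one moves are not specified here and are left as parameters.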

Paper Structure

This paper contains 40 sections, 2 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: The proposed FreeSim can obtain high-quality camera simulation results at viewpoints that deviate substantially from the recorded trajectories. Here we adopt the pioneering work PVG [chen2023periodic] as the baseline for demonstration. We highlight the delicate power lines with orange bounding boxes.
  • Figure 2: The overall framework of FreeSim. The black line indicates the recorded trajectory. Starting from it, the training viewpoints are progressively expanded to far away unrecorded trajectories (blue and red lines). The generative model produces high-quality images for the new viewpoints after each expansion. The progressive expansion can be conducted repeatedly, and we only illustrate two stages for simplicity. We use real images from our experiments in this illustration, so readers can zoom in to see the evolution of the four images for an intuitive understanding.
  • Figure 3: Piece-wise Gaussian field reconstruction.
  • Figure 4: Object ghosting caused by inaccurate depth. The two images are rendered by the original PVG.
  • Figure 5: Qualitative comparison. The two scenes are captured by the front-left camera and the front camera, respectively. To showcase our performance, we adopt a quite large viewpoint change. However, the views marked with orange bounding boxes are completely ruined under such a large deviation, so we slightly adjust their camera poses.
  • ...and 5 more figures