FreeSim: Toward Free-viewpoint Camera Simulation in Driving Scenes
Lue Fan, Hao Zhang, Qitai Wang, Hongsheng Li, Zhaoxiang Zhang
TL;DR
FreeSim tackles realistic off-trajectory camera rendering in driving scenes by recasting pose-conditioned view generation as a generative image enhancement task and coupling it with a progressive reconstruction pipeline. It builds a large matched training dataset by degrading on-trajectory renderings to imitate off-trajectory artifacts, and it adds a sparse LiDAR cue as a geometric prior, enabling stable diffusion-based enhancement conditioned on the degraded rendering $I_d$ and the LiDAR cue $I_l$. A progressive reconstruction loop refines images while expanding viewpoints from near to far off-trajectory positions, and a post-enhancement stage mitigates rolling-shutter effects and generative randomness. On the Waymo Open Dataset, FreeSim outperforms baselines in off-trajectory FID while maintaining competitive on-trajectory PSNR/SSIM, and ablations confirm the benefits of the degradation strategies, LiDAR conditioning, and progressive expansion. The approach advances practical free-viewpoint camera simulation by enabling high-quality rendering beyond recorded trajectories, with implications for training and evaluating autonomous-driving systems. The authors note limitations in handling rolling shutter and in generalizing to non-Gaussian data sources.
Abstract
We propose FreeSim, a camera simulation method for autonomous driving. FreeSim emphasizes high-quality rendering from viewpoints beyond the recorded ego trajectories. At such viewpoints, previous methods suffer unacceptable degradation because no training data is available for them. To address this data scarcity, we first propose a generative enhancement model with a matched data construction strategy. The resulting model can generate high-quality images at a viewpoint that deviates slightly from the recorded trajectories, conditioned on the degraded rendering of that viewpoint. We then propose a progressive reconstruction strategy, which gradually adds generated images of unrecorded views into the reconstruction process, starting from slightly off-trajectory viewpoints and moving progressively farther away. With this progressive generation-reconstruction pipeline, FreeSim supports high-quality off-trajectory view synthesis under large deviations of more than 3 meters.
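The progressive generation-reconstruction loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `View`, `offset_schedule`, and `progressive_reconstruction`, the fixed step size, and the `toy_*` stand-ins for the reconstruction, rendering, and enhancement models are all hypothetical and only show the control flow (fit, render a slightly farther view, enhance it, add it to the training set, refit).

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class View:
    lateral_offset_m: float   # distance from the recorded trajectory (assumed scalar)
    image: Any                # placeholder for pixel data
    generated: bool = False   # True if produced by the enhancement model


def offset_schedule(max_offset_m: float, step_m: float) -> List[float]:
    """Offsets ordered from near to far, ending exactly at max_offset_m."""
    if max_offset_m <= 0 or step_m <= 0:
        raise ValueError("max_offset_m and step_m must be positive")
    offsets, k = [], 1
    while k * step_m < max_offset_m:
        offsets.append(k * step_m)
        k += 1
    offsets.append(max_offset_m)
    return offsets


def progressive_reconstruction(
    recorded: List[View],
    offsets: List[float],
    fit: Callable[[List[View]], Any],
    render: Callable[[Any, float], Any],
    enhance: Callable[[Any], Any],
) -> Tuple[Any, List[View]]:
    """Alternate enhancement and refitting while moving away from the trajectory."""
    training_set = list(recorded)      # copy, so the caller's list is not mutated
    scene = fit(training_set)          # initial reconstruction from recorded views
    for offset in offsets:             # near-to-far expansion
        degraded = render(scene, offset)   # off-trajectory rendering, likely degraded
        enhanced = enhance(degraded)       # generative enhancement of that rendering
        training_set.append(View(offset, enhanced, generated=True))
        scene = fit(training_set)          # refit with the newly generated view
    return scene, training_set


# Hypothetical stand-ins for the real models, used only to exercise the loop.
def toy_fit(views: List[View]) -> dict:
    return {"num_views": len(views),
            "max_offset_m": max(v.lateral_offset_m for v in views)}


def toy_render(scene: dict, offset_m: float) -> str:
    return f"degraded@{offset_m}"


def toy_enhance(image: str) -> str:
    return image.replace("degraded", "enhanced")
```

With two recorded views and `offset_schedule(3.0, 1.0)`, the loop adds one generated view per offset and refits after each addition, so each new rendering is produced by a scene that has already absorbed the nearer views. The 1-meter step is an arbitrary choice for illustration; the abstract states only that deviations of more than 3 meters are supported.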
