USB-NeRF: Unrolling Shutter Bundle Adjusted Neural Radiance Fields

Moyang Li, Peng Wang, Lingzhe Zhao, Bangyan Liao, Peidong Liu

TL;DR

This paper addresses rolling shutter (RS) distortions in Neural Radiance Fields (NeRF) by proposing USB-NeRF, a framework that jointly learns a 3D scene representation and a continuous-time camera trajectory within an RS-aware differentiable image formation model. The method represents the trajectory with a cubic B-Spline in $\textbf{SE}(3)$, which yields a separate pose for each row of an RS frame and allows end-to-end optimization without pretraining. Key contributions include integrating RS camera modeling into NeRF training; improving RS removal, novel-view synthesis, and camera motion estimation; and enabling high-frame-rate global shutter video reconstruction. Empirical results on synthetic and real datasets show consistent gains over state-of-the-art RS correction baselines and NeRF variants, indicating strong generalization and practical value for RS footage in 3D reconstruction and video synthesis.
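To make the trajectory model concrete, here is a minimal sketch of cumulative cubic B-spline pose interpolation. It is an illustration under stated assumptions, not the authors' implementation: it splits each pose into a rotation in SO(3) and a translation in R^3 instead of working with a single SE(3) element, it assumes uniformly spaced control poses, and every function name (`so3_exp`, `so3_log`, `cumulative_basis`, `interpolate_pose`) is hypothetical.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix such that hat(w) @ v == np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(w):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-8:
        return np.eye(3) + K  # first-order approximation near the identity
    return (np.eye(3) + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * (K @ K))

def so3_log(R):
    """Rotation matrix -> rotation vector (valid for angles below pi)."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    v = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    if theta < 1e-8:
        return 0.5 * v
    return theta / (2.0 * np.sin(theta)) * v

def cumulative_basis(u):
    """Cumulative uniform cubic B-spline weights for u in [0, 1)."""
    return np.array([(5.0 + 3.0 * u - 3.0 * u**2 + u**3) / 6.0,
                     (1.0 + 3.0 * u + 3.0 * u**2 - 2.0 * u**3) / 6.0,
                     u**3 / 6.0])

def interpolate_pose(rotations, translations, u):
    """Pose at normalized time u from four consecutive control poses.

    Assumption: rotation and translation are interpolated separately,
    which simplifies the SE(3) formulation described in the text.
    """
    weights = cumulative_basis(u)
    R = rotations[0].copy()
    p = np.asarray(translations[0], dtype=float).copy()
    for j in range(1, 4):
        # Relative motion between neighbouring control poses.
        delta_rot = so3_log(rotations[j - 1].T @ rotations[j])
        R = R @ so3_exp(weights[j - 1] * delta_rot)
        p = p + weights[j - 1] * (translations[j] - translations[j - 1])
    return R, p

# Control poses that rotate about z and translate along x at constant rates.
omega = np.array([0.0, 0.0, 0.1])
velocity = np.array([0.5, 0.0, 0.0])
rotations = [so3_exp(i * omega) for i in range(4)]
translations = [i * velocity for i in range(4)]
R_mid, p_mid = interpolate_pose(rotations, translations, 0.5)
```

With constant-rate control poses like these, the spline reproduces the same constant-rate motion, which is a convenient sanity check. In a row-wise setting, `u` would be derived from the capture time of each image row, and the control poses would be optimized jointly with the scene.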

Abstract

Neural Radiance Fields (NeRF) has received much attention recently due to its impressive capability to represent 3D scenes and synthesize novel view images. Existing works usually assume that the input images are captured by a global shutter camera. Thus, rolling shutter (RS) images cannot be trivially applied to an off-the-shelf NeRF algorithm for novel view synthesis. The rolling shutter effect also degrades the accuracy of camera pose estimation (e.g. via COLMAP), which further prevents NeRF from succeeding with RS images. In this paper, we propose Unrolling Shutter Bundle Adjusted Neural Radiance Fields (USB-NeRF). USB-NeRF is able to correct rolling shutter distortions and recover an accurate camera motion trajectory simultaneously under the framework of NeRF, by modeling the physical image formation process of an RS camera. Experimental results demonstrate that USB-NeRF achieves better performance than prior works in terms of RS effect removal, novel view image synthesis, and camera motion estimation. Furthermore, our algorithm can also be used to recover high-fidelity, high-frame-rate global shutter video from a sequence of RS images.

Paper Structure

This paper contains 18 sections, 9 equations, 14 figures, 10 tables.

Figures (14)

  • Figure 1: Given a sequence of rolling shutter images, our method is able to simultaneously learn the undistorted 3D scene representation and recover the continuous-time camera motion trajectory. Global shutter images with the rolling shutter effect removed can then be rendered from the learned 3D representation. The third row presents residual images (the darker the better), defined as the absolute difference between the corresponding images (first row) and the ground truth global shutter images.
  • Figure 2: The pipeline of USB-NeRF. Given a sequence of rolling shutter images, we train NeRF to learn the underlying undistorted 3D scene representation. We parameterize the camera motion trajectory of the image sequence by a continuous-time cubic B-Spline in $\textbf{SE}(3)$ space. Given the capture time of each row of a rolling shutter image, we can interpolate its pose from the spline. Each rolling shutter image can then be synthesized by rendering all the image rows (i.e. each with a different pose) from NeRF. By maximizing the photometric consistency between the synthesized and captured RS images, we can learn the underlying 3D scene representation and recover the camera motion trajectory. Global shutter images can then be rendered from the learned 3D representation with known camera poses.
  • Figure 3: Image formation models of a rolling shutter camera and a global shutter camera, respectively. Each row of a rolling shutter image is captured at a different timestamp, which leads to image distortions if the image is captured by a moving camera.
  • Figure 4: Qualitative comparisons on the Carla-RS (liu2020deepunroll) and Unreal-RS datasets. The experimental results demonstrate that our method achieves better performance than prior works. The darker the $3^{rd}$ and $6^{th}$ rows, the better the performance.
  • Figure 5: Qualitative comparisons on the real TUM-RS dataset (schubert2019RS-VIO). Since the dataset does not have pixel-aligned rolling-global shutter image pairs, we choose the nearest neighbor global shutter images for comparison. The experimental results demonstrate that RSSR and CVR fail to correct the RS effect due to their poor generalization performance. BARF also fails since it does not consider the rolling shutter camera model, while our method successfully removes the RS effect.
  • ...and 9 more figures
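
The row-wise image formation described in the captions of Figures 1 to 3 can be illustrated with a toy example. The sketch below is not the paper's renderer: it assumes a constant delay between consecutive rows, replaces NeRF with a hypothetical synthetic scene (a one-pixel-wide vertical bar that moves horizontally, standing in for camera motion), and uses made-up function names. It assembles a rolling shutter frame row by row and computes a residual image as the absolute difference to a global shutter frame, as in the Figure 1 caption.

```python
import numpy as np

def row_timestamps(t_start, readout_time, height):
    """Capture time of each row, assuming a constant delay between rows."""
    line_delay = readout_time / height
    return t_start + line_delay * np.arange(height)

def render_global_shutter(t, height, width, speed):
    """Toy scene: a vertical bar whose column moves at `speed` pixels per second.

    Every row sees the scene at the same time t.
    """
    image = np.zeros((height, width))
    column = int(round(speed * t)) % width
    image[:, column] = 1.0
    return image

def render_rolling_shutter(t_start, readout_time, height, width, speed):
    """Assemble an RS frame by taking row r from the scene as seen at time t_r."""
    times = row_timestamps(t_start, readout_time, height)
    rows = [render_global_shutter(t, height, width, speed)[r]
            for r, t in enumerate(times)]
    return np.stack(rows)

height, width = 8, 16
gs = render_global_shutter(0.0, height, width, speed=8.0)
rs = render_rolling_shutter(0.0, readout_time=1.0, height=height,
                            width=width, speed=8.0)

# Residual image: absolute difference between the RS and GS frames.
residual = np.abs(rs - gs)
```

In this toy setup the straight bar of the global shutter frame becomes a slanted line in the rolling shutter frame, because each row samples the moving scene slightly later than the row above it. Only the first row, captured at the same time as the global shutter frame, has a zero residual.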