Single Image Rolling Shutter Removal with Diffusion Models

Zhanglei Yang, Haipeng Li, Mingbo Hong, Chen-Lin Zhang, Jiajun Li, Shuaicheng Liu

TL;DR

RS-Diffusion presents the first diffusion-model approach to single-image rolling shutter removal by formulating an image-to-motion task conditioned on the RS frame and augmented with a Patch-Attention module. The Intra Gyro Field (IGF) pipeline enables accurate GS ground-truth labeling via synchronized IMU data, culminating in the RS-Real dataset that combines realism with precise motion labels. Empirical results demonstrate state-of-the-art performance for single-frame RS correction and real-time inference on GPUs, along with robust ablations validating the IGF-driven supervision and patch-attention design. The work advances practical RS correction and provides a valuable dataset for future diffusion-based restoration research.

Abstract

We present RS-Diffusion, the first diffusion-model-based method for single-frame Rolling Shutter (RS) correction. RS artifacts compromise the visual quality of frames due to the row-wise exposure of CMOS sensors. Most previous methods are multi-frame approaches that use temporal information from consecutive frames for motion rectification. Few approaches address the more challenging but important problem of single-frame RS correction. In this work, we present an "image-to-motion" framework based on diffusion techniques, with a purpose-built patch-attention module. In addition, we present the RS-Real dataset, comprising captured RS frames paired with their corresponding Global Shutter (GS) ground truth. The GS frames are obtained by correcting the RS frames, guided by the Inertial Measurement Unit (IMU) gyroscope data acquired during capture. Experiments show that RS-Diffusion surpasses previous single-frame RS methods, demonstrating the potential of diffusion-based approaches, and that RS-Real provides a valuable dataset for further research.

Paper Structure (19 sections, 11 equations, 6 figures, 5 tables)

Figures (6)

  • Figure 1: Illustration of the proposed dataset and our results. The first row features a rolling-shutter (RS) image captured in realistic scenes, along with the corresponding ground-truth global-shutter (GS) image. The ground-truth flow used for correcting the RS image to the GS image is displayed on the left side of the second row. In the bottom right, we showcase our corrected RS image.
  • Figure 2: A rolling shutter image, $\mathbf{I}_{RS}$, is produced when high-frequency shake occurs during capture with a row-wise exposure CMOS camera. The gyroscope accurately records this motion, which is then transformed into a motion field, $\mathbf{G} \in \mathbb{R}^{2 \times H \times W}$, referred to as the Intra Gyro Field (IGF). With $\mathbf{G}$, $\mathbf{I}_{RS}$ can be corrected into a global shutter image, $\mathbf{I}_{GS}$.
  • Figure 3: Illustration of the framework. During training, $\mathbf{x}_0$ undergoes forward diffusion to become $\mathbf{x}_t$. The network $\mathbf{\theta}$ processes the concatenated input. The Patch-Attention module, which includes both Intra-Patch and Inter-Patch attention mechanisms, strengthens the relationships between patches. The resulting output, $\mathbf{\hat{x}_0}$, can be used to correct $\mathbf{I}_{RS}$. The loss function comprises an MSE loss, calculated between $\mathbf{\hat{x}_0}$ and $\mathbf{x}_0$, and a photometric loss, computed between the ground-truth GS image and $\mathbf{I^{\prime}}_{GS}$.
  • Figure 4: A glance at the RS-Real dataset reveals that it contains data pairs featuring various RS motion patterns and different intensities, all of which are captured in diverse scenes.
  • Figure 5: Comparison with existing methods on an RS building image: (1) Rengarajan et al. [rengarajan2016bows], (2) Rengarajan et al. [rengarajan2017unrolling], (3) Purkait et al. [purkait2017rolling], (4) Grundmann et al. [grundmann2012calibration], and (5) Kandula et al. [kandula2020deep]. Red vertical lines highlight the correction results.
  • ...and 1 more figures
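
The captions for Figures 2 and 3 describe three operations: correcting $\mathbf{I}_{RS}$ with a motion field of shape $2 \times H \times W$, diffusing $\mathbf{x}_0$ forward to $\mathbf{x}_t$, and combining an MSE term with a photometric term. The following is a minimal NumPy sketch of those operations, not the authors' implementation. Several details are assumptions made only for illustration: the function names, the backward-warping convention with bilinear sampling, the standard DDPM forward step parameterized by `alpha_bar_t`, the use of an L1 photometric term, and the `photo_weight` parameter.

```python
import numpy as np


def warp_with_field(image, field):
    """Backward-warp a grayscale image (H, W) with a motion field (2, H, W).

    Assumed convention: field[0] holds x-displacements and field[1] holds
    y-displacements, and output pixel (y, x) samples the input at
    (y + field[1], x + field[0]) with bilinear interpolation.
    """
    h, w = image.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Source coordinates, clamped to the image border.
    src_x = np.clip(xs + field[0], 0, w - 1)
    src_y = np.clip(ys + field[1], 0, h - 1)
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = src_x - x0
    wy = src_y - y0
    # Interpolate along x on the two neighboring rows, then along y.
    top = (1 - wx) * image[y0, x0] + wx * image[y0, x1]
    bottom = (1 - wx) * image[y1, x0] + wx * image[y1, x1]
    return (1 - wy) * top + wy * bottom


def forward_diffuse(x0, alpha_bar_t, rng):
    """Standard DDPM forward step: x_t = sqrt(a) * x0 + sqrt(1 - a) * noise."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * noise


def combined_loss(x0_hat, x0, rs_image, gs_image, photo_weight=1.0):
    """MSE on the motion field plus an (assumed L1) photometric term."""
    mse = np.mean((x0_hat - x0) ** 2)
    # Correct the RS image with the predicted field, then compare to the GS target.
    corrected = warp_with_field(rs_image, x0_hat)
    photometric = np.mean(np.abs(corrected - gs_image))
    return mse + photo_weight * photometric
```

In this sketch, `x0` plays the role of the ground-truth field $\mathbf{G}$ and `x0_hat` the network output $\mathbf{\hat{x}_0}$. A zero field leaves the image unchanged, which gives a quick sanity check on the warp before plugging in a real predictor.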