Motion Blur Decomposition with Cross-shutter Guidance

Xiang Ji, Haiyang Jiang, Yinqiang Zheng

TL;DR

Motion blur decomposition is highly ill-posed due to temporal ordering and motion ambiguity. The authors propose a cross-shutter framework that jointly leverages rolling shutter guidance and global shutter content to reconstruct a sharp video sequence from a single blurred frame, supported by a triaxial imaging system and a RealBR dataset. The core method combines dual-stream motion interpretation with shutter-alignment and temporal encoding, refined by a GenNet, achieving substantial gains over state-of-the-art methods on real and synthetic data. The work demonstrates robustness to misalignment and low-light noise, and discusses hardware considerations and future directions for mobile and compact systems. Overall, this cross-shutter approach offers a practical path toward motion-aware deblurring and temporal super-resolution in real-world imaging settings.

Abstract

Motion blur is a frequently observed image artifact, especially under insufficient illumination, where the exposure time has to be prolonged to collect enough photons for a sufficiently bright image. Rather than simply removing such blurring effects, recent research has aimed at decomposing a blurry image into multiple sharp images with spatial and temporal coherence. Since motion blur decomposition itself is highly ambiguous, priors from neighbouring frames or human annotation are usually needed for motion disambiguation. In this paper, inspired by the complementary exposure characteristics of a global shutter (GS) camera and a rolling shutter (RS) camera, we propose to utilize the ordered scanline-wise delay in a rolling shutter image to robustify motion decomposition of a single blurry image. To evaluate this novel dual imaging setting, we construct a triaxial system to collect realistic data, and we design a deep network architecture that explicitly addresses temporal and contextual information through reciprocal branches for cross-shutter motion blur decomposition. Experimental results verify the effectiveness of our proposed algorithm, as well as the validity of our dual imaging setting.

Paper Structure (19 sections, 10 equations, 8 figures, 6 tables)

Figures (8)

  • Figure 1: Motion ambiguity of blur observation. In this toy example, we show two objects, a soccer ball and a player, both moving horizontally. (a) shows four possible motion states during the exposure time (both moving right, both moving left, and the two cases in which one moves left while the other moves right). (b) presents the corresponding motion-blurred observations. They are all identical due to the averaging effect, which makes blur decomposition ambiguous with respect to motion. (c) In our dual Blur-RS setting, the rolling shutter (RS) view implicitly encodes the temporal ordering of the latent frames.
  • Figure 2: Our triaxial imaging system. (a) A photo of the actual system used for data gathering; (b) Optical diagram of our system; (c) Illustration of the exposure duration of all cameras along the temporal axis. In (c), the vertical axis can be interpreted as the spatial rows of the images captured by each camera.
  • Figure 3: Our proposed model. (a) shows the overall architecture, which contains two stages: motion interpretation and blur decomposition. Blur decomposition is implemented through a GenNet. Motion interpretation takes as input a blur image $B$ and an RS image $R$ along with its temporal positional encoding $E$. It consists of three blocks, one of which is unfolded in (b). (c) presents the details of shutter alignment and aggregation (SAA). Features extracted by the encoder block (EB) are transformed by a spatial transformer network (STN) and then enhanced by a Conv. block to accurately predict the displacement field between shutters.
  • Figure 4: Qualitative comparison. Our model outperforms approaches that approximate latent motion fields from adjacent blurry inputs. In particular, RIFE$_{BR}$ and IFED$_{BR}$, implemented with the dual Blur-RS view, reconstruct much sharper details than RIFE$_B$ and IFED$_B$.
  • Figure 5: PSNR distribution of our method with one aligned ('Shift-$0$') and three misaligned views ('Shift-$4$', 'Shift-$6$' and 'Shift-$8$') on a selected sequence. The horizontal axis is the initial PSNR computed between the blur view and the first latent frame, while the vertical axis denotes the PSNR computed between the corrected blur view and its ground truth.
  • ...and 3 more figures
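
The motion ambiguity described in the Figure 1 caption can be reproduced with a small numerical toy. The sketch below is not the authors' code or imaging model. It assumes an idealised setting in which the blurred global shutter image is the plain average of the latent frames, and the rolling shutter image takes each row from the latent frame at that row's readout time (instantaneous per-row exposure). All function names and parameters (`make_sequence`, `global_shutter_blur`, `rolling_shutter`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def make_sequence(height=8, width=16, num_frames=8, start_col=2, direction=+1):
    """Latent sharp frames of a one-pixel-wide vertical bar moving horizontally.

    direction=+1 moves the bar right, direction=-1 moves it left over the
    same set of columns (toy assumption, not the paper's data).
    """
    frames = np.zeros((num_frames, height, width), dtype=np.float64)
    first = start_col if direction > 0 else start_col + num_frames - 1
    for t in range(num_frames):
        frames[t, :, first + direction * t] = 1.0
    return frames


def global_shutter_blur(frames):
    """Assumed blur model: average all latent frames over the exposure."""
    return frames.mean(axis=0)


def rolling_shutter(frames):
    """Assumed RS model: row r is copied from the frame at its readout time."""
    num_frames, height, _ = frames.shape
    # Map each row to a time index, top row first and bottom row last.
    times = np.round(np.linspace(0, num_frames - 1, height)).astype(int)
    return frames[times, np.arange(height), :]


right = make_sequence(direction=+1)  # bar moves left to right
left = make_sequence(direction=-1)   # same path, traversed right to left

# The averaged (blurred) observations are identical, so direction is lost.
print(np.allclose(global_shutter_blur(right), global_shutter_blur(left)))  # True

# The RS views skew in opposite directions, so temporal order is recoverable.
print(np.array_equal(rolling_shutter(right), rolling_shutter(left)))  # False
print(rolling_shutter(right).argmax(axis=1))  # bar column per row, rightward
print(rolling_shutter(left).argmax(axis=1))   # bar column per row, leftward
```

In this toy the two blurred images match exactly because averaging discards the order of the frames, while the per-row bar position in the RS view increases from top to bottom for rightward motion and decreases for leftward motion. A real RS sensor also integrates light over a per-row exposure window, which this sketch deliberately ignores.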