MFSR-GAN: Multi-Frame Super-Resolution with Handheld Motion Modeling
Fadeel Sher Khan, Joshua Ebenezer, Hamid Sheikh, Seok-Jun Lee
TL;DR
This work tackles real-world handheld multi-frame super-resolution by closing a realism gap in training data and improving frame fusion. It introduces a synthetic data engine that preserves sensor-specific noise and temporally correlated handheld motion: high-resolution static captures are warped with real handheld homographies and then downsampled with nearest-neighbor sampling. The engine is paired with MFSR-GAN, an 8-frame RAW-to-RGB network. The network emphasizes a base frame through Reference Difference Computation and deformable-alignment-based fusion across multiple scales, aided by RRDB-based reconstruction and a relativistic GAN framework for perceptual quality. Experiments on both synthetic and real handheld bursts show sharper, more realistic reconstructions than prior methods, highlighting the practical potential for improved smartphone imaging under challenging conditions.
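The warp-then-downsample idea behind the data engine can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the nearest-neighbor lookup inside the warp, and the decimation-style downsampling are assumptions made for illustration, and the homographies here are placeholders for ones measured from real handheld bursts. Nearest-neighbor sampling never averages neighboring pixels, which is plausibly why it helps keep per-pixel sensor noise intact.

```python
import numpy as np

def warp_homography_nearest(img, H):
    """Warp img (h x w or h x w x c) by the 3x3 homography H.

    Uses inverse mapping with nearest-neighbor lookup (an assumption of
    this sketch), so no pixel values are blended and noise is not smoothed.
    """
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Homogeneous coordinates of every output pixel, shape (3, h*w)
    dst = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)]).astype(np.float64)
    # For each output pixel, find where it came from in the source image
    src = np.linalg.inv(H) @ dst
    sx = np.rint(src[0] / src[2]).astype(int)
    sy = np.rint(src[1] / src[2]).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    img_flat = img.reshape(h * w, -1)
    out_flat = np.zeros_like(img_flat)  # pixels mapped from outside stay 0
    out_flat[valid] = img_flat[sy[valid] * w + sx[valid]]
    return out_flat.reshape(img.shape)

def simulate_burst(hr, homographies, scale):
    """Create one LR frame per homography: warp, then keep every scale-th pixel."""
    frames = []
    for H in homographies:
        warped = warp_homography_nearest(hr, H)
        frames.append(warped[::scale, ::scale])  # nearest-neighbor decimation
    return frames

# Usage: an 8-frame burst from a toy 8x8 image with small hypothetical shifts.
hr = np.arange(64, dtype=np.float32).reshape(8, 8)
shift_right = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
burst = simulate_burst(hr, [np.eye(3)] + [shift_right] * 7, scale=2)
```

In this toy usage the first frame plays the role of the base frame, and the remaining frames are shifted by one HR pixel, so each LR frame samples a different set of HR pixels.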
Abstract
Smartphone cameras have become ubiquitous imaging tools, yet their small sensors and compact optics often limit spatial resolution and introduce distortions. Combining information from multiple low-resolution (LR) frames to produce a high-resolution (HR) image has been explored to overcome the inherent limitations of smartphone cameras. Despite the promise of multi-frame super-resolution (MFSR), current approaches are hindered by datasets that fail to capture the characteristic noise and motion patterns found in real-world handheld burst images. In this work, we address this gap by introducing a novel synthetic data engine that uses multi-exposure static images to synthesize LR-HR training pairs while preserving sensor-specific noise characteristics and image motion found during handheld burst photography. We also propose MFSR-GAN: a multi-scale RAW-to-RGB network for MFSR. Compared to prior approaches, MFSR-GAN emphasizes a "base frame" throughout its architecture to mitigate artifacts. Experimental results on both synthetic and real data demonstrate that MFSR-GAN trained with our synthetic engine yields sharper, more realistic reconstructions than existing methods for real-world MFSR.
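The TL;DR mentions a relativistic GAN framework without saying which formulation is used. As a rough illustration only, the sketch below assumes the relativistic average form, in which the discriminator scores each real image relative to the mean score of the generated images and vice versa. The function names and the 0.5 weighting are choices of this sketch, not details from the paper.

```python
import numpy as np

def bce_with_logits(logits, target):
    """Numerically stable binary cross-entropy on raw logits, averaged."""
    return np.mean(
        np.maximum(logits, 0) - logits * target + np.log1p(np.exp(-np.abs(logits)))
    )

def relativistic_average_losses(real_logits, fake_logits):
    """Return (discriminator_loss, generator_loss) for one batch.

    real_logits / fake_logits are raw discriminator outputs for real HR
    images and generated images. Assumes the relativistic average form.
    """
    # How much more realistic is each real sample than the average fake?
    real_rel = real_logits - fake_logits.mean()
    # How much more realistic is each fake sample than the average real?
    fake_rel = fake_logits - real_logits.mean()
    # Discriminator: reals should look more realistic than fakes
    d_loss = 0.5 * (bce_with_logits(real_rel, 1.0) + bce_with_logits(fake_rel, 0.0))
    # Generator: the same terms with the targets swapped
    g_loss = 0.5 * (bce_with_logits(real_rel, 0.0) + bce_with_logits(fake_rel, 1.0))
    return d_loss, g_loss

# Usage: a discriminator that cleanly separates real from generated images.
d_loss, g_loss = relativistic_average_losses(np.full(4, 5.0), np.full(4, -5.0))
```

Because the generator loss depends on the real logits as well as the fake ones, the generator receives gradients from both real and generated samples, which is the usual motivation for relativistic losses in perceptual super-resolution.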
