Super-Resolving Blurry Images with Events
Chi Zhang, Mingyuan Lin, Xiang Zhang, Chenxu Jiang, Lei Yu
TL;DR
This work tackles the problem of recovering high-resolution (HR) sharp images from motion-blurred low-resolution (LR) inputs by leveraging high-temporal-resolution event data. It introduces EBSR-Net, a one-stage architecture that fuses blurred frames and events via a Multi-scale Center-surround Event Representation (MCER), Symmetric Cross-Modal Attention (SCMA), and an Intermodal Residual Group (IRG) built from Residual Dense Swin Transformer blocks. The method reports substantial PSNR/SSIM improvements over state-of-the-art baselines on the GoPro and REDS datasets while keeping the model compact and the compute cost reasonable. The results indicate that cross-modal fusion with carefully designed event representations yields robust super-resolution of motion-blurred images under complex, non-uniform motion, with potential impact on vision tasks in dynamic environments.
Abstract
Super-resolution from motion-blurred images poses a significant challenge due to the combined effects of motion blur and low spatial resolution. To address this challenge, this paper introduces an Event-based Blurry Super Resolution Network (EBSR-Net), which leverages the high temporal resolution of events to mitigate motion blur and improve high-resolution image prediction. Specifically, we propose a multi-scale center-surround event representation to fully capture motion and texture information inherent in events. Additionally, we design a symmetric cross-modal attention module to fully exploit the complementarity between blurry images and events. Furthermore, we introduce an intermodal residual group composed of several residual dense Swin Transformer blocks, each incorporating multiple Swin Transformer layers and a residual connection, to extract global context and facilitate inter-block feature aggregation. Extensive experiments show that our method compares favorably against state-of-the-art approaches and achieves remarkable performance.
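To make the idea of symmetric cross-modal attention concrete, the following is a minimal sketch and not the authors' implementation. It assumes plain scaled dot-product attention with identity projections (no learned weights), image and event features flattened into lists of token vectors, and a residual addition after each attention direction. The function names `cross_attend` and `symmetric_cross_modal_attention` are hypothetical and chosen only for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, context):
    """Scaled dot-product attention where each query token gathers
    information from the context tokens.

    Assumption: queries, keys and values use identity projections; a real
    module would apply learned linear projections first.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    attended = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in context]
        weights = softmax(scores)
        # Weighted average of the context tokens, one channel at a time.
        attended.append(
            [sum(w * v[d] for w, v in zip(weights, context)) for d in range(dim)]
        )
    return attended


def symmetric_cross_modal_attention(image_tokens, event_tokens):
    """Run attention in both directions (image -> events, events -> image).

    Assumption: each branch adds the attended features back to its own
    tokens as a residual.
    """
    img_from_evt = cross_attend(image_tokens, event_tokens)
    evt_from_img = cross_attend(event_tokens, image_tokens)
    image_out = [
        [x + y for x, y in zip(tok, att)]
        for tok, att in zip(image_tokens, img_from_evt)
    ]
    event_out = [
        [x + y for x, y in zip(tok, att)]
        for tok, att in zip(event_tokens, evt_from_img)
    ]
    return image_out, event_out


# Toy example: 3 image tokens and 2 event tokens, each with 2 channels.
image_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
event_tokens = [[0.5, 0.5], [1.0, -1.0]]
image_out, event_out = symmetric_cross_modal_attention(image_tokens, event_tokens)
```

Each branch keeps its own token count and channel width, so the two outputs can be passed on to later fusion stages without reshaping. The symmetry lies in applying the same operation with the roles of the two modalities swapped, which lets blurry-image features borrow motion cues from events while event features borrow texture from the image.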
