Ultra-High-Definition Dynamic Multi-Exposure Image Fusion via Infinite Pixel Learning
Xingchi Chen, Zhuoran Zheng, Xuerui Li, Yuying Chen, Shu Wang, Wenqi Ren
TL;DR
This work tackles UHD dynamic multi-exposure image fusion on resource-constrained hardware by introducing Infinite Pixel Learning (IPL), a chunk-cache-quantization pipeline inspired by long-sequence processing in LLMs. IPL leverages a Slice Cyclic Scanner for dimensional attention, an Attention Cache to avoid redundant computation, and Quantization Compression to manage memory, complemented by a Dimensional Rolling Transformation Module to preserve global context. The authors present the 4K-DMEF UHD benchmark and demonstrate that IPL achieves full-resolution UHD fusion on a single GPU at real-time speeds with substantial gains in PSNR, SSIM, and perceptual quality over state-of-the-art methods. This approach provides a practical path to high-quality UHD MEF on commodity hardware and offers a robust benchmark to accelerate future UHD dynamic fusion research.
Abstract
As device imaging resolution continues to improve, Ultra-High-Definition (UHD) images are becoming increasingly common. Unfortunately, existing methods for fusing multi-exposure images of dynamic scenes are designed for low-resolution images, which makes them inefficient at generating high-quality UHD images on resource-constrained devices. To cope with the extremely long input sequences that UHD images produce, and inspired by how Large Language Models (LLMs) process infinitely long texts, we propose Infinite Pixel Learning (IPL), a novel learning paradigm that achieves UHD multi-exposure dynamic scene image fusion on a single consumer-grade GPU. Our approach has three key components. First, we slice the input sequences to relieve the pressure that processing the data stream places on the model. Second, we develop an attention cache technique, similar to the KV cache, for processing an infinite data stream. Third, we design an attention cache compression method to reduce the storage burden of the cache on the device. In addition, we provide a new UHD benchmark to evaluate the effectiveness of our method. Extensive experimental results show that our method maintains high-quality visual performance while fusing UHD dynamic multi-exposure images in real time (>40 fps) on a single consumer-grade GPU.
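The slice, cache, and compress steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`slice_rows`, `quantize`, `dequantize`, `build_cache`) are hypothetical, the per-chunk column mean is only a stand-in for a real attention feature, and the uniform 8-bit quantizer is an assumed compression scheme chosen for simplicity.

```python
from typing import Iterator, List, Tuple

Row = List[float]
CacheEntry = Tuple[List[int], float, float]  # (integer codes, scale, offset)


def slice_rows(image: List[Row], chunk_height: int) -> Iterator[List[Row]]:
    """Yield consecutive horizontal strips of at most chunk_height rows."""
    for start in range(0, len(image), chunk_height):
        yield image[start:start + chunk_height]


def quantize(values: List[float], levels: int = 256) -> CacheEntry:
    """Uniform affine quantization of floats to integers in [0, levels - 1]."""
    lo, hi = min(values), max(values)
    # Fall back to a scale of 1.0 when all values are equal (hi == lo).
    scale = (hi - lo) / (levels - 1) or 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes: List[int], scale: float, lo: float) -> List[float]:
    """Approximately invert quantize(); error is at most about scale / 2."""
    return [c * scale + lo for c in codes]


def build_cache(image: List[Row], chunk_height: int) -> List[CacheEntry]:
    """Process the image chunk by chunk and keep one compressed summary each.

    The summary (a column-wise mean) is a placeholder for whatever feature a
    real model would cache; only its quantized form is stored.
    """
    cache: List[CacheEntry] = []
    for chunk in slice_rows(image, chunk_height):
        width = len(chunk[0])
        summary = [sum(row[x] for row in chunk) / len(chunk) for x in range(width)]
        cache.append(quantize(summary))
    return cache


if __name__ == "__main__":
    # A tiny 4x4 "image" stands in for a UHD frame.
    image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
    for codes, scale, lo in build_cache(image, chunk_height=2):
        print(codes, dequantize(codes, scale, lo))
```

In this sketch only one strip is held in working memory at a time, while earlier strips survive as small integer codes that can be dequantized when global context is needed. That is the general trade-off the abstract describes, shown here without any learned components.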
