EatGAN: An Edge-Attention Guided Generative Adversarial Network for Single Image Super-Resolution
Penghao Rao, Tieyong Zeng
TL;DR
EatGAN addresses single-image super-resolution by introducing edge priors into a GAN framework through Normalized Edge Attention (NEA), which combines channel-wise modulation and spatial gating guided by edge information. A Hybrid Edge Residual Block and an edge-gradient loss complement a composite generator objective to enforce structural fidelity and perceptual realism while stabilizing training. Empirical results demonstrate state-of-the-art performance across distortion- and perception-oriented benchmarks, including strong Manga109 gains (40.87 dB PSNR) and robust real-world degradation handling on RealSR and KonIQ datasets, with favorable computational efficiency. The work highlights that reframing edge priors as controllable modulation primitives enables trustworthy, high-fidelity SR with practical deployment potential. The combination of explicit and implicit edge guidance, coupled with a carefully designed loss, yields a principled path toward high-quality SR in diverse settings.
Abstract
Single-image super-resolution (SISR) is an important task in image processing that aims to enhance the resolution of imaging systems. Recently, SISR has made a significant leap and achieved promising results with deep learning. GAN-based models stand out among deep learning models because of their excellent perceptual quality. However, they struggle to reconstruct realistic high-frequency details and to train stably. To address these issues, we introduce an Edge-Attention guided Generative Adversarial Network (EatGAN), the first GAN-based SISR model that leverages edge priors both explicitly and implicitly inside the generator. EatGAN comprises (i) a Normalized Edge Attention (NEA) mechanism, based on channel-affine modulation and spatial gating, that transforms the edge prior into lightweight, learnable modulation parameters; (ii) an edge-guided hybrid residual block, in which these parameters are injected and fused multiple times to progressively enforce structural consistency across scales; and (iii) a composite generator objective combining pixel, perceptual, edge-gradient, and adversarial terms. Experiments show consistent state-of-the-art performance across distortion-oriented and perception-oriented benchmarks. Notably, our model achieves 40.87 dB PSNR and 0.073 LPIPS on Manga109, which indicates that reframing image priors from passive guidance into a controllable modulation primitive for generators can chart a practical path toward trustworthy, high-fidelity super-resolution.
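To make the idea of edge-guided channel-affine modulation plus spatial gating concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not specify how the edge prior is extracted or how the modulation parameters are produced, so everything here is an assumption for illustration: the Sobel edge extractor, the function names `sobel_edges` and `edge_attention`, the parameters `w_gamma`, `w_beta`, `w_gate`, `b_gate`, the pooled edge descriptor, and the residual fusion.

```python
import numpy as np


def sobel_edges(img):
    """Gradient-magnitude edge map of a 2-D image, scaled to [0, 1].

    Assumption: a Sobel filter stands in for whatever edge prior is used.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T
    h, w = img.shape
    padded = np.pad(img.astype(np.float64), 1, mode="edge")
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    # Accumulate the 3x3 filter responses from shifted views of the image.
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    mag = np.sqrt(gx ** 2 + gy ** 2)
    return mag / (mag.max() + 1e-8)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def edge_attention(feat, edge, w_gamma, w_beta, w_gate, b_gate, eps=1e-5):
    """Hypothetical edge-guided modulation of a feature map.

    feat: (C, H, W) features; edge: (H, W) edge map in [0, 1];
    w_gamma, w_beta: (C,) learnable vectors; w_gate, b_gate: scalars.
    """
    # Normalize each channel over its spatial dimensions.
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = feat.std(axis=(1, 2), keepdims=True)
    normed = (feat - mean) / (std + eps)

    # Channel-affine step: per-channel scale and shift driven by a pooled
    # edge descriptor (assumed form).
    edge_strength = edge.mean()
    gamma = 1.0 + w_gamma * edge_strength
    beta = w_beta * edge_strength
    modulated = gamma[:, None, None] * normed + beta[:, None, None]

    # Spatial gating step: a per-pixel gate that opens near edges.
    gate = sigmoid(w_gate * edge + b_gate)

    # Residual fusion keeps the input features where the gate is closed.
    return feat + gate[None, :, :] * modulated


# Usage: a vertical step edge and random features.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edge = sobel_edges(img)
feat = np.random.default_rng(0).standard_normal((4, 8, 8))
out = edge_attention(feat, edge, np.zeros(4), np.zeros(4), 4.0, -2.0)
print(out.shape)
```

In this sketch the gate is large only near the step edge, so the modulated signal is injected mostly around structure and the features in flat regions pass through nearly unchanged. A trained model would learn the modulation parameters rather than fix them by hand.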
