EatGAN: An Edge-Attention Guided Generative Adversarial Network for Single Image Super-Resolution

Penghao Rao, Tieyong Zeng

TL;DR

EatGAN addresses single-image super-resolution by introducing edge priors into a GAN framework through Normalized Edge Attention (NEA), which combines channel-wise modulation and spatial gating guided by edge information. A Hybrid Edge Residual Block and an edge-gradient loss complement a composite generator objective to enforce structural fidelity and perceptual realism while stabilizing training. Empirical results demonstrate state-of-the-art performance across distortion- and perception-oriented benchmarks, including strong Manga109 gains (40.87 dB PSNR) and robust real-world degradation handling on RealSR and KonIQ datasets, with favorable computational efficiency. The work highlights that reframing edge priors as controllable modulation primitives enables trustworthy, high-fidelity SR with practical deployment potential. The combination of explicit and implicit edge guidance, coupled with a carefully designed loss, yields a principled path toward high-quality SR in diverse settings.
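The edge-gradient loss mentioned above is not specified in this summary. As a rough illustration only, the sketch below assumes a common form: the mean absolute difference between Sobel gradient magnitudes of the super-resolved and ground-truth images. The function names (`filter2d_same`, `gradient_magnitude`, `edge_gradient_loss`) and the choice of the Sobel operator are assumptions, not the authors' definition.

```python
import numpy as np

# Sobel kernels (an assumed choice of edge operator; the paper may use another).
SOBEL_X = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])
SOBEL_Y = SOBEL_X.T


def filter2d_same(img, kernel):
    """3x3 cross-correlation with edge-replicated padding; output keeps img's shape."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def gradient_magnitude(img, eps=1e-12):
    """Per-pixel Sobel gradient magnitude of a single-channel image."""
    gx = filter2d_same(img, SOBEL_X)
    gy = filter2d_same(img, SOBEL_Y)
    return np.sqrt(gx ** 2 + gy ** 2 + eps)


def edge_gradient_loss(sr, hr):
    """Hypothetical edge-gradient loss: L1 distance between gradient magnitudes."""
    return float(np.mean(np.abs(gradient_magnitude(sr) - gradient_magnitude(hr))))


# Usage: a vertical step edge as the "ground truth" and a flat image as the "output".
hr = np.zeros((8, 8))
hr[:, 4:] = 1.0
flat = np.full((8, 8), 0.5)
print(edge_gradient_loss(hr, hr))    # identical images give zero loss
print(edge_gradient_loss(flat, hr))  # a missing edge is penalized
```

In a composite objective, a term like this would be weighted and added to the pixel, perceptual, and adversarial terms; the weights are not given in this summary.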

Abstract

Single-image super-resolution (SISR) is an important task in image processing that aims to enhance the resolution of imaging systems. Recently, SISR has made a significant leap and achieved promising results with deep learning. GAN-based models stand out among deep learning models because of their excellent perceptual quality. However, it is rather difficult for them to reconstruct realistic high-frequency details and to train stably. To address these issues, we introduce an Edge-Attention guided Generative Adversarial Network (EatGAN), the first GAN-based SISR model that leverages edge priors both explicitly and implicitly inside the generator. EatGAN comprises (i) a Normalized Edge Attention (NEA) mechanism based on channel-affine modulation and spatial gating, which transforms the edge prior into lightweight, learnable modulation parameters; (ii) an edge-guided hybrid residual block, in which these parameters are injected and fused multiple times to progressively enforce structural consistency across scales; and (iii) a composite generator objective combining pixel, perceptual, edge-gradient, and adversarial terms. Experiments show consistent state-of-the-art performance across distortion-oriented and perception-oriented benchmarks. Notably, our model achieves 40.87 dB PSNR and 0.073 LPIPS on Manga109, which indicates that reframing image priors from passive guidance into a controllable modulation primitive for generators can chart a practical path toward trustworthy, high-fidelity super-resolution.
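The abstract describes NEA only at a high level (channel-affine modulation plus spatial gating driven by an edge prior). The sketch below is one generic way such a mechanism could look, in the spirit of feature-wise affine modulation; it is not the authors' formulation. The function `edge_modulation`, its parameters (`w_gamma`, `w_beta`, `w_gate`, `b_gate`), and the use of the mean edge strength to drive the channel affine are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def edge_modulation(feat, edge, w_gamma, w_beta, w_gate, b_gate, eps=1e-5):
    """Hypothetical edge-guided modulation of a feature map.

    feat:    (C, H, W) feature map
    edge:    (H, W) edge map with values in [0, 1]
    w_gamma: (C,) weights mapping edge strength to a per-channel scale
    w_beta:  (C,) weights mapping edge strength to a per-channel shift
    w_gate, b_gate: scalars defining the spatial gate
    """
    # Normalize each channel over its spatial dimensions.
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = feat.std(axis=(1, 2), keepdims=True)
    normed = (feat - mean) / (std + eps)

    # Channel-affine modulation: scale and shift derived from global edge strength.
    edge_strength = edge.mean()
    gamma = 1.0 + w_gamma * edge_strength
    beta = w_beta * edge_strength
    modulated = gamma[:, None, None] * normed + beta[:, None, None]

    # Spatial gating: the edge map decides where the modulated signal is injected.
    gate = sigmoid(w_gate * edge + b_gate)
    return feat + gate[None, :, :] * modulated


# Usage with random placeholder inputs.
rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 6, 6))
edge = rng.uniform(size=(6, 6))
out = edge_modulation(feat, edge, np.ones(4), np.zeros(4), w_gate=2.0, b_gate=0.0)
print(out.shape)
```

The residual form `feat + gate * modulated` is a design choice of this sketch: when the gate is closed, the features pass through unchanged, so the edge prior can only add structure where the gate opens.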

Paper Structure

This paper contains 55 sections, 18 equations, 12 figures, 4 tables, 1 algorithm.

Figures (12)

  • Figure 1: Given a single LR image, a deep neural network $f_\theta$, which has been trained on a dataset to learn the mapping from LR images to their corresponding HR versions, generates an HR reconstruction with enhanced visual quality and sharper details.
  • Figure 2: Overview of EatGAN Architecture. Top: The generator takes LR images and extracted edge maps as inputs, processes them through hybrid edge residual blocks with Normalized Edge Attention (NEA) mechanisms, and generates SR images. Bottom: The discriminator consists of 8 convolution blocks, global pooling, and fully connected layers for fake (0.0) and real (1.0) classification.
  • Figure 3: Training stability analysis comparing EatGAN with SRGAN, ESRGAN, and RealESRGAN. (a) Generator loss convergence. (b) PSNR on Urban100 (×4). (c) Loss variance (lower is better). (d) Composite loss components. (e) Convergence speed vs. stability. (f) Two-stage training: pre-training and fine-tuning.
  • Figure 4: Visual comparisons of our model and other SOTA models for 4$\times$ upscale SR on the Set5 dataset.
  • Figure 5: Complexity-performance analysis on Urban100 (×4). (a) PSNR vs. FLOPs. (b) LPIPS vs. FLOPs. The red dashed line indicates the Pareto frontier. EatGAN outperforms.
  • ...and 7 more figures
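
The Pareto frontier referred to in the Figure 5 caption can be computed from (FLOPs, PSNR) pairs as sketched below. This is a generic illustration: the function name `pareto_frontier` is hypothetical, and the model names and numbers are made-up placeholders, not results from the paper.

```python
def pareto_frontier(models):
    """Return names of models on the cost/quality Pareto frontier.

    models: list of (name, flops, psnr) tuples.
    A model is kept only if its PSNR is strictly higher than that of every
    model with lower or equal FLOPs (lower FLOPs and higher PSNR are better).
    """
    # Sort by increasing cost; break ties by decreasing quality.
    ordered = sorted(models, key=lambda m: (m[1], -m[2]))
    frontier = []
    best_psnr = float("-inf")
    for name, flops, psnr in ordered:
        if psnr > best_psnr:
            frontier.append(name)
            best_psnr = psnr
    return frontier


# Placeholder data for illustration only (not taken from the paper).
models = [
    ("model_a", 10.0, 30.0),
    ("model_b", 20.0, 29.0),  # dominated by model_a: more FLOPs, lower PSNR
    ("model_c", 30.0, 31.0),
    ("model_d", 5.0, 28.0),
]
print(pareto_frontier(models))
```

For a lower-is-better metric such as LPIPS, the same function applies after negating the metric.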