Hitchhiker's Guide to Super-Resolution: Introduction and Recent Advances

Brian Moser; Federico Raue; Stanislav Frolov; Jörn Hees; Sebastian Palacio; Andreas Dengel

TL;DR

This work surveys deep-learning-driven super-resolution (SR), highlighting the ill-posed nature of the problem and the need for robust, flexible upsampling and evaluation. It covers learning objectives (pixel, content, and uncertainty-driven losses), adversarial and diffusion-based approaches, and a broad spectrum of SR models, from simple CNNs to transformer- and wavelet-based architectures, including unsupervised and NAS-driven methods. Key contributions include coverage of uncertainty-driven losses, diffusion models, normalization advances, wavelet networks, and neural architecture search (NAS), as well as visualizations that map architectural trends. The review identifies critical gaps and promising directions for future research, emphasizing practical evaluation, real-world degradations, and efficient, scalable SR systems with flexible upsampling capabilities. Overall, the paper serves as a comprehensive guide for researchers aiming to push the boundaries of DL-based SR and its deployment in diverse domains.

Abstract

With the advent of Deep Learning (DL), Super-Resolution (SR) has also become a thriving research area. However, despite promising results, the field still faces challenges that require further research, e.g., allowing flexible upsampling, more effective loss functions, and better evaluation metrics. We review the domain of SR in light of recent advances, and examine state-of-the-art models such as diffusion (DDPM) and transformer-based SR models. We present a critical discussion on contemporary strategies used in SR, and identify promising yet unexplored research directions. We complement previous surveys by incorporating the latest developments in the field such as uncertainty-driven losses, wavelet networks, neural architecture search, novel normalization methods, and the latest evaluation techniques. We also include several visualizations for the models and methods throughout each chapter in order to facilitate a global understanding of the trends in the field. This review is ultimately aimed at helping researchers to push the boundaries of DL applied to SR.

Paper Structure (59 sections, 40 equations, 38 figures, 4 tables)

Introduction
Introduction
Setting and Terminology
Problem Definition: Super-Resolution
Single Image Super-Resolution (SISR)
Multi-Image Super-Resolution (MISR)
Evaluation: Image Quality Assessment (IQA)
Mean Opinion Score (MOS)
Peak Signal-to-Noise Ratio (PSNR)
Structural Similarity Index (SSIM)
Learning-based Perceptual Quality (LPQ)
Task-based Evaluation (TBE)
Evaluation with defined Features
Multi-Scale Evaluation
Datasets and Challenges
...and 44 more sections

Figures (38)

Figure 1: Example images from different SR datasets: Set5 bevilacqua2012low, Set14 zeyde2010single, Manga109 matsui2017sketch, General100 dong2016accelerating, BSDS100 martin2001database, and BSDS200 martin2001database. The ratio of the size differences is preserved.
Figure 2: Principle of DDPMs. The Gaussian diffusion process adds noise iteratively. The iterative refinement process reverses it. The task of the SR model is to predict the noise added between two iterations. The predicted noise is then used to undo one iteration.
Figure 3: Channel-attention mechanism hu2018squeeze. It reduces a feature map in the spatial dimensions and uses several FC layers to extract weighting values, which are multiplied element-wise with the initial feature map.
Figure 4: Spatial-attention mechanism wang2018non. It extracts information by inspecting the relationship between two positions (first matrix multiplication) and returns the importance of each position as a feature map (second matrix multiplication). MM denotes matrix multiplication.
Figure 5: Visualization of different upsampling locations within a neural network: (upper left) post-upsampling, (bottom left) pre-upsampling, (upper right) progressive upsampling, and (bottom right) iterative up-and-down upsampling.
...and 33 more figures
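
The channel-attention mechanism described in the Figure 3 caption can be illustrated with a minimal NumPy sketch. This is not the survey's implementation. It assumes the common squeeze-and-excitation layout: global average pooling as the spatial reduction, two FC layers with a ReLU in between, and a sigmoid that produces one weight per channel. The function name `channel_attention`, the reduction ratio `r`, and the random weights are hypothetical and chosen only for illustration.

```python
import numpy as np


def channel_attention(x, w1, b1, w2, b2):
    """Squeeze-and-excitation style channel attention (illustrative sketch).

    x  : feature map of shape (C, H, W)
    w1 : weights of the first FC layer, shape (C // r, C)
    b1 : bias of the first FC layer, shape (C // r,)
    w2 : weights of the second FC layer, shape (C, C // r)
    b2 : bias of the second FC layer, shape (C,)
    Returns the rescaled feature map and the per-channel weights.
    """
    # Reduce the spatial dimensions: global average pooling -> shape (C,)
    squeezed = x.mean(axis=(1, 2))
    # First FC layer followed by ReLU (assumed activation)
    hidden = np.maximum(w1 @ squeezed + b1, 0.0)
    # Second FC layer followed by a sigmoid -> weights in (0, 1)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))
    # Element-wise multiplication with the initial feature map,
    # broadcasting each channel weight over H and W
    return x * weights[:, None, None], weights


# Hypothetical sizes: 8 channels, 4x4 spatial resolution, reduction ratio 2
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
b1 = np.zeros(C // r)
w2 = rng.standard_normal((C, C // r)) * 0.1
b2 = np.zeros(C)

out, weights = channel_attention(x, w1, b1, w2, b2)
```

The output keeps the shape of the input. Only the relative scale of each channel changes, which is why a block of this kind can be placed after an existing convolutional layer without altering the layers around it.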
