Perceptual-Distortion Balanced Image Super-Resolution is a Multi-Objective Optimization Problem

Qiwen Zhu, Yanjie Wang, Shilv Cai, Liqun Chen, Jiahuan Zhou, Luxin Yan, Sheng Zhong, Xu Zou

TL;DR

A novel approach to single-image super-resolution (SISR) that balances perceptual quality and distortion through multi-objective optimization (MOO) and dynamically adjusts loss weights during training, which reduces the need for manual hyperparameter tuning and lessens computational demands compared to AutoML.

Abstract

Training Single-Image Super-Resolution (SISR) models using pixel-based regression losses can achieve high scores on distortion metrics (e.g., PSNR and SSIM), but often results in blurry images due to insufficient recovery of high-frequency details. Conversely, using GAN or perceptual losses can produce sharp images with high perceptual metric scores (e.g., LPIPS), but may introduce artifacts and incorrect textures. Balancing these two types of losses can help achieve a trade-off between distortion and perception, but the challenge lies in tuning the loss function weights. To address this issue, we propose a novel method that incorporates Multi-Objective Optimization (MOO) into the training process of SISR models to balance perceptual quality and distortion. We conceptualize the relationship between loss weights and image quality assessment (IQA) metrics as black-box objective functions to be optimized within our Multi-Objective Bayesian Optimization Super-Resolution (MOBOSR) framework. This approach automates the hyperparameter tuning process, reduces overall computational cost, and enables the use of numerous loss functions simultaneously. Extensive experiments demonstrate that MOBOSR outperforms state-of-the-art methods in terms of both perceptual quality and distortion, significantly advancing the perception-distortion Pareto frontier. Our work points towards a new direction for future research on balancing perceptual quality and fidelity in nearly all image restoration tasks. The source code and pretrained models are available at: https://github.com/ZhuKeven/MOBOSR.
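
To make the role of the loss weights concrete, the following is a minimal sketch of how several loss terms might be combined into one training objective. It assumes a plain weighted sum, which the abstract does not spell out; the function name `combine_losses` and the loss keys (`l1`, `lpips`, `fft`) are hypothetical and not taken from the MOBOSR code.

```python
from typing import Dict


def combine_losses(losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-term loss values.

    Assumption: the total objective is sum_i w_i * L_i. A term that has no
    entry in `weights` is treated as having weight 0.0 and so is ignored.
    """
    return sum(weights.get(name, 0.0) * value for name, value in losses.items())


# Toy values only: one regression-style term and one perceptual-style term.
toy_losses = {"l1": 0.5, "lpips": 0.25}
toy_weights = {"l1": 1.0, "lpips": 2.0}
total = combine_losses(toy_losses, toy_weights)  # 0.5 * 1.0 + 0.25 * 2.0 = 1.0
```

Under this reading, the weight dictionary is the input of the black-box objective: each choice of weights leads to a trained model whose IQA metrics (for example PSNR and LPIPS) are the outputs being optimized.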

Paper Structure (19 sections, 13 equations, 3 figures, 2 tables, 1 algorithm)

Figures (3)

  • Figure 1: Relative contribution of each loss function to the final model metrics. The loss functions are defined in Section \ref{sec:impl_detail}. Each relative contribution is calculated by training an SR model with that loss function alone and evaluating it on the DIV2K [Agustsson17] validation set. Metrics are normalized, so a higher bar indicates better performance on the corresponding metric. The results indicate an inherent conflict between perceptual and regression losses: [$\mathcal{L}_{1}$, $\mathcal{L}_{2}$, $\mathcal{L}_{FFT}$, $\mathcal{L}_{\nabla}$, $\mathcal{L}_{SSIM}$] contribute more to the distortion metrics (PSNR, SSIM [Zhou04], and LR-PSNR), while [$\mathcal{L}_{LPIPS}$, $\mathcal{L}_{\phi_{2}}$, $\mathcal{L}_{\phi_{3}}$, $\mathcal{L}_{\phi_{4}}$, $\mathcal{L}_{\phi_{5}}$] contribute more to the perceptual metric (LPIPS [Zhang18]).
  • Figure 2: A toy example that demonstrates the 2-dimensional Pareto frontier, along with the hypervolume (HV) and the hypervolume improvement (HVI).
  • Figure 3: Visual comparison of three sampled points on the Pareto frontier obtained with our method (as defined in Figure \ref{fig:teaser}), alongside other methods, on the Urban100 [Huang15] dataset. More visual results are presented in the supplementary material.
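
The quantities named in the Figure 2 caption can be illustrated with a small, self-contained sketch. It assumes two objectives that are both maximized (a lower-is-better metric such as LPIPS would be negated first) and a reference point chosen by the user. The function names and the toy points are hypothetical and are not taken from the paper or its code.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, assuming both objectives are maximized."""
    front = [
        p for p in points
        if not any(q[0] >= p[0] and q[1] >= p[1] and q != p for q in points)
    ]
    return sorted(set(front))  # deduplicate and sort by the first objective


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded from below by the reference point."""
    front = [p for p in pareto_front(points) if p[0] > ref[0] and p[1] > ref[1]]
    area = 0.0
    prev_y = ref[1]
    # Sweep from the largest first objective downwards; along a Pareto front
    # the second objective then increases, so each point adds one new strip.
    for x, y in sorted(front, reverse=True):
        area += (x - ref[0]) * (y - prev_y)
        prev_y = y
    return area


def hypervolume_improvement(candidate: Point, points: List[Point], ref: Point) -> float:
    """Gain in hypervolume obtained by adding `candidate` to `points`."""
    return hypervolume_2d(points + [candidate], ref) - hypervolume_2d(points, ref)


toy_points = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (1.0, 1.0)]
reference = (0.0, 0.0)
print(pareto_front(toy_points))          # [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(toy_points, reference))                        # 6.0
print(hypervolume_improvement((2.5, 2.5), toy_points, reference))   # 1.25
```

A candidate that is dominated by the existing front yields an HVI of zero, which is why HVI is a natural score for deciding which point to evaluate next in multi-objective Bayesian optimization.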