Efficient Cost-and-Quality Controllable Arbitrary-scale Super-resolution with Fourier Constraints

Kazutoshi Akita, Norimichi Ukita

TL;DR

The paper tackles cost-and-quality (CQ) controllable arbitrary-scale super-resolution (SR) by addressing the inefficiency of predicting Fourier components one by one. It introduces a joint-prediction framework that outputs multiple Fourier components per recurrence, supported by a Fourier alignment loss that captures dependencies between components and preserves high-frequency detail. Empirical results on DIV2K show that predicting two components per step ($K=2$) achieves a favorable trade-off: quality stays close to that of the single-component baseline at substantially lower runtime, while larger $K$ can degrade performance and stability. The approach preserves CQ controllability and broad scale flexibility, which is a practical advantage for resource-constrained and high-demand SR applications. Future work focuses on stability and adaptive component selection.

Abstract

Cost-and-Quality (CQ) controllability in arbitrary-scale super-resolution is crucial. Existing methods predict Fourier components one by one using a recurrent neural network. However, this approach leads to performance degradation and inefficiency due to independent prediction. This paper proposes predicting multiple components jointly to improve both quality and efficiency.
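
The efficiency argument for joint prediction can be illustrated with a small scheduling sketch. It is not the authors' implementation: the function name `recurrence_schedule` is hypothetical, and the sketch assumes that a budget of `total_components` Fourier components is filled by emitting `k` components per recurrent step, so that the number of recurrent steps is the ceiling of `total_components / k`.

```python
def recurrence_schedule(total_components: int, k: int) -> list[list[int]]:
    """Group component indices into the chunks emitted by each recurrent step.

    Assumption (not from the paper): components are indexed 0..total_components-1
    and each recurrence emits up to k consecutive indices.
    """
    if total_components < 0 or k < 1:
        raise ValueError("total_components must be >= 0 and k must be >= 1")
    return [
        list(range(start, min(start + k, total_components)))
        for start in range(0, total_components, k)
    ]


if __name__ == "__main__":
    # Compare how many recurrent steps are needed for the same component budget.
    for k in (1, 2, 3):
        steps = recurrence_schedule(60, k)
        print(f"K={k}: {len(steps)} recurrent steps")
```

Under this assumption, a budget of 60 components needs 60 recurrences with one component per step, 30 with two, and 20 with three. Whether runtime scales in the same proportion depends on the per-step cost of the actual network, which this sketch does not model.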

Paper Structure

This paper contains 16 sections, 8 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Comparison of SR frameworks. (a) Non-CQ-controllable methods reconstruct HR images with a fixed cost–quality trade-off in one trained model. (b) RecurrentLTE achieves CQ controllability by predicting Fourier components one by one through an RNN, but suffers from limited accuracy and inefficiency. (c) The proposed method predicts multiple Fourier components jointly at each recurrence, improving both reconstruction accuracy and efficiency while preserving CQ controllability.
  • Figure 2: Overview of our CQ-controllable arbitrary-scale SR method. Following LIIF, a latent code $z$ is extracted from an input LR image using an encoder $E$. The code $z$ is fed into an RNN to predict Fourier components, as in RecurrentLTE. Unlike RecurrentLTE, our method predicts multiple ($K$) Fourier components at each recurrence. These predicted components are then used to reconstruct the HR image.
  • Figure 3: Directional bias in Fourier components. In (a) and (b), the left and right images show an image and its amplitude spectrum, respectively. Edge-biased patterns in the spatial domain correspond to aligned Fourier components along specific directions in the frequency domain.
  • Figure 4: Qualitative comparison on DIV2K validation set. Rectangular areas are enlarged for better visualization.
  • Figure 5: PSNR versus runtime for different numbers of jointly predicted components ($K=1,2,3$). Colors denote $K$. Points along each curve correspond to $T=\{60,48,36,24,12\}$ from right to left (i.e., runtime decreases as $T$ decreases). With $K=2$, the curve remains close to $K=1$ while shifting left (approximately half the runtime), whereas $K=3$ further reduces runtime at the expense of accuracy.
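
The directional bias described in the Figure 3 caption can be checked on a toy example. The sketch below is an illustration only, not code from the paper: it builds a hypothetical 8×8 image with a single vertical edge, computes its amplitude spectrum with a naive 2D discrete Fourier transform, and measures how much of the non-DC energy lies on the horizontal frequency axis (the row with zero vertical frequency).

```python
import cmath


def dft2_amplitude(img: list[list[float]]) -> list[list[float]]:
    """Naive 2D DFT amplitude spectrum; amp[v][u] is indexed by (vertical, horizontal) frequency."""
    n_rows, n_cols = len(img), len(img[0])
    amp = [[0.0] * n_cols for _ in range(n_rows)]
    for v in range(n_rows):
        for u in range(n_cols):
            acc = 0j
            for y in range(n_rows):
                for x in range(n_cols):
                    phase = -2j * cmath.pi * (u * x / n_cols + v * y / n_rows)
                    acc += img[y][x] * cmath.exp(phase)
            amp[v][u] = abs(acc)
    return amp


# Toy image (an assumption for illustration): dark left half, bright right half,
# i.e. one vertical edge and no variation along the vertical direction.
N = 8
vertical_edge = [[1.0 if x >= N // 2 else 0.0 for x in range(N)] for _ in range(N)]
amp = dft2_amplitude(vertical_edge)

# Exclude the DC term amp[0][0], which only encodes mean brightness.
total_energy = sum(a * a for row in amp for a in row) - amp[0][0] ** 2
axis_energy = sum(a * a for a in amp[0][1:])
aligned_fraction = axis_energy / total_energy

if __name__ == "__main__":
    print(f"fraction of non-DC energy on the horizontal frequency axis: {aligned_fraction:.6f}")
```

Because the toy image does not vary vertically, essentially all non-DC energy falls on a single line in the frequency domain, which is the kind of alignment the caption refers to. Natural images show the same tendency only approximately.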