InfScene-SR: Spatially Continuous Inference for Arbitrary-Size Image Super-Resolution

Shoukun Sun, Zhe Wang, Xiang Que, Jiyin Zhang, Xiaogang Ma

TL;DR

This paper adapts the iterative refinement process of diffusion models with a novel guided and variance-corrected fusion mechanism, allowing for the seamless generation of large-scale high-resolution imagery without retraining.

Abstract

Image Super-Resolution (SR) aims to recover high-resolution (HR) details from low-resolution (LR) inputs, a task where Denoising Diffusion Probabilistic Models (DDPMs) have recently shown superior performance compared to Generative Adversarial Network (GAN)-based approaches. However, standard diffusion-based SR models, such as SR3, are typically trained on fixed-size patches and struggle to scale to arbitrary-sized images due to memory constraints. Applying these models via independent patch processing leads to visible seams and inconsistent textures across boundaries. In this paper, we propose InfScene-SR, a framework enabling spatially continuous super-resolution for large, arbitrary scenes. We adapt the iterative refinement process of diffusion models with a novel guided and variance-corrected fusion mechanism, allowing for the seamless generation of large-scale high-resolution imagery without retraining. We validate our approach on remote sensing datasets, demonstrating that InfScene-SR not only reconstructs fine details with high perceptual quality but also eliminates boundary artifacts, benefiting downstream tasks such as semantic segmentation.
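
The abstract does not spell out the fusion formula, so the following is only a minimal sketch of the general statistical idea behind a "variance-corrected" blend, not the authors' implementation. It assumes that overlapping patches contribute independent unit-variance Gaussian noise at a shared pixel and that they are blended with normalized weights; the function name `fuse` and the equal weights are hypothetical. A plain weighted average shrinks the noise variance to the sum of squared weights, which a diffusion sampler would not expect; dividing by the square root of that sum restores unit variance.

```python
import math
import random
import statistics


def fuse(samples, weights, correct_variance=True):
    """Blend noise samples from overlapping patches at one pixel.

    samples: independent N(0, 1) draws, one per overlapping patch (assumption)
    weights: non-negative blending weights, one per patch (hypothetical)
    """
    total = sum(weights)
    w = [x / total for x in weights]  # normalize so the weights sum to 1
    fused = sum(wi * si for wi, si in zip(w, samples))
    if correct_variance:
        # For independent unit-variance samples, Var(sum w_i * s_i) = sum w_i^2,
        # so dividing by sqrt(sum w_i^2) brings the variance back to 1.
        fused /= math.sqrt(sum(wi * wi for wi in w))
    return fused


rng = random.Random(0)
# Two patches overlap at each of 20000 pixels, each supplying its own noise draw.
pixels = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(20000)]
weights = [0.5, 0.5]

naive_var = statistics.pvariance(
    [fuse(p, weights, correct_variance=False) for p in pixels]
)
corrected_var = statistics.pvariance(
    [fuse(p, weights, correct_variance=True) for p in pixels]
)
print(f"naive variance: {naive_var:.3f}, corrected variance: {corrected_var:.3f}")
```

With two equally weighted patches, the naive average should land near a variance of 0.5 and the corrected blend near 1.0. How InfScene-SR combines this kind of correction with its guidance term is described in the paper itself, not here.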

Paper Structure

This paper contains 28 sections, 15 equations, 6 figures, and 3 tables.

Figures (6)

  • Figure 1: Visual comparison of super-resolution methods. The proposed InfScene-SR reconstruction eliminates the grid artifacts visible in the standard SR3 output while recovering fine details superior to bicubic interpolation.
  • Figure 2:
  • Figure 3:
  • Figure 4: Geographic extent of the study area. The dataset covers 15 coastal counties in California using 2024 NAIP imagery, with Santa Barbara County (highlighted) reserved for testing.
  • Figure 5: Downstream task overview: Semantic segmentation of the invasive Carpobrotus edulis (Iceplant). The goal is to delineate Iceplant coverage from aerial imagery, comparing performance across original and super-resolved inputs.
  • ...and 1 more figure