Rethinking Image Super-Resolution from Training Data Perspectives

Go Ohtani, Ryu Tadokoro, Ryosuke Yamada, Yuki M. Asano, Iro Laina, Christian Rupprecht, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka, Yoshimitsu Aoki

TL;DR

This work investigates how training data, specifically its resolution, quality, and diversity, affects image super-resolution (SR) performance. It introduces DiverSeg, an automatically curated SR dataset of low-resolution yet high-quality images, constructed via source selection and object-based filtering, and demonstrates that SR models trained on DiverSeg can outperform those trained on large high-resolution datasets. Key findings show that minimizing compression artifacts, maximizing within-image object diversity, and drawing on a large number of images from ImageNet or PASS all improve SR performance, sometimes beyond what training on traditional high-resolution datasets achieves. The study offers a practical, scalable dataset curation pipeline that can guide future SR dataset construction and model development, with potential for more efficient and robust SR systems.

Abstract

In this work, we investigate the understudied effect of the training data used for image super-resolution (SR). Most commonly, novel SR methods are developed and benchmarked on common training datasets such as DIV2K and DF2K. However, we investigate and rethink the training data from the perspectives of diversity and quality, thereby addressing the question of "How important is SR training for SR models?". To this end, we propose an automated image evaluation pipeline. With this, we stratify existing high-resolution image datasets and larger-scale image datasets such as ImageNet and PASS to compare their performances. We find that datasets with (i) low compression artifacts, (ii) high within-image diversity as judged by the number of different objects, and (iii) a large number of images from ImageNet or PASS all positively affect SR performance. We hope that the proposed simple-yet-effective dataset curation pipeline will inform the construction of SR datasets in the future and yield overall better models.
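
The two per-image criteria named in the abstract (few compression artifacts, many distinct objects) can be illustrated with a minimal filtering sketch. This is not the authors' pipeline: the `ImageRecord` fields, the `curate` function, and both thresholds are hypothetical, and the scores are assumed to have been computed beforehand by some artifact metric and some segmentation model.

```python
from dataclasses import dataclass


@dataclass
class ImageRecord:
    path: str
    blockiness: float  # hypothetical compression-artifact score (lower = cleaner)
    num_segments: int  # hypothetical count of object regions in the image


def curate(records, max_blockiness, min_segments):
    """Keep images with few artifacts and many object regions."""
    return [
        r for r in records
        if r.blockiness <= max_blockiness and r.num_segments >= min_segments
    ]


# Toy records with made-up scores, for illustration only.
records = [
    ImageRecord("a.png", blockiness=0.10, num_segments=7),
    ImageRecord("b.png", blockiness=0.80, num_segments=9),  # heavily compressed
    ImageRecord("c.png", blockiness=0.05, num_segments=1),  # a single object
]

# Placeholder thresholds; the source does not state any values.
selected = curate(records, max_blockiness=0.3, min_segments=5)
print([r.path for r in selected])
```

Filtering on both scores at once keeps only images that are clean and object-rich; how the two criteria are actually combined and tuned is left to the paper.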

Paper Structure (16 sections, 3 equations, 8 figures, 8 tables)

This paper contains 16 sections, 3 equations, 8 figures, 8 tables.

Figures (8)

  • Figure 1: We propose an automated image evaluation pipeline to curate a dataset for training SR models. The obtained dataset, namely DiverSeg, consists of low-resolution but high-quality images with many object regions. SR models trained on DiverSeg outperform those trained on high-resolution image datasets such as DF2K and LSDIR.
  • Figure 2: Images of DiverSeg with their segmentation masks. DiverSeg is obtained from a large set of low-resolution images through the automated image evaluation pipeline.
  • Figure 3: Comparison of image degradation due to JPEG quality (blue). Blockiness values calculated from the images are marked. As the JPEG quality decreases and artifacts increase, we observe a corresponding rise in blockiness values.
  • Figure 4: (a) Blockiness distributions $p_{X, 1.0}$ for $X = \text{ImageNet-1k}, \text{Places365}$ and $\text{PASS}$. (b) Basis distributions $p_{Z, q}$ for $Z=\text{DF2K}$ and $q = 0.5, 0.75, 0.85, 0.95, 1.0$. We estimate the quality by comparing $p_{X,1.0}$ and $p_{Z,q}$ using the KL divergence.
  • Figure 5: Comparison of learning processes obtained with various JPEG quality values.
  • ...and 3 more figures
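
The quality estimate described in the Figure 4 caption, which compares a dataset's blockiness distribution $p_{X,1.0}$ against basis distributions $p_{Z,q}$ using the KL divergence, can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's implementation: the histogram binning, the smoothing constant, the direction of the divergence ($p_X$ relative to $p_{Z,q}$), and the rule of picking the $q$ with the smallest divergence are all assumed, and the blockiness values are synthetic.

```python
import math
import random


def histogram(values, bins, lo, hi, eps=1e-8):
    """Normalized histogram with a small smoothing term so no bin is zero."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(bins - 1, max(0, int((v - lo) / width)))
        counts[idx] += 1
    total = sum(counts) + eps * bins
    return [(c + eps) / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def estimate_quality(target, basis_by_q, bins=40, lo=0.0, hi=1.0):
    """Return the q whose basis distribution is closest to the target."""
    p_x = histogram(target, bins, lo, hi)
    divs = {
        q: kl_divergence(p_x, histogram(vals, bins, lo, hi))
        for q, vals in basis_by_q.items()
    }
    return min(divs, key=divs.get), divs


# Synthetic blockiness samples: lower JPEG quality -> higher blockiness.
rng = random.Random(0)
basis_by_q = {
    0.5: [rng.gauss(0.60, 0.05) for _ in range(5000)],
    0.75: [rng.gauss(0.40, 0.05) for _ in range(5000)],
    1.0: [rng.gauss(0.20, 0.05) for _ in range(5000)],
}
target = [rng.gauss(0.38, 0.05) for _ in range(5000)]

best_q, divs = estimate_quality(target, basis_by_q)
print(best_q, divs)
```

In this toy setup the target's blockiness is centered near the $q = 0.75$ basis, so that basis yields the smallest divergence. With real data, the basis samples would come from re-encoding a clean reference set (DF2K in the caption) at each quality level.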