SDAT: Sub-Dataset Alternation Training for Improved Image Demosaicing

Yuval Becker, Raz Z. Nossek, Tomer Peleg

TL;DR

SDAT introduces Sub-Dataset Alternation Training to reduce dataset-induced bias in image demosaicing by alternating learning between biased sub-datasets and the full dataset. The method identifies diverse sub-datasets using artifact metrics and selects training phases based on the minimum average sub-dataset validation loss, formalized as $\bar{V}_t = \frac{1}{N}\sum_{i=1}^{N} V_{w_t,c_i}$. Across both low- and high-capacity architectures, including CNNs and transformers, SDAT yields consistent performance gains and achieves state-of-the-art results on three popular demosaicing benchmarks. The approach emphasizes data-centric training dynamics, demonstrating that curated bias diversity in sub-datasets can enhance generalization, with practical implications for edge devices and broader image restoration tasks.
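
The averaged validation loss above can be illustrated with a small sketch. It assumes that $V_{w_t,c_i}$ is the validation loss of the weights at step $t$ on sub-dataset $c_i$; the function names and the loss values are hypothetical and not taken from the paper.

```python
from typing import Sequence


def mean_validation_loss(losses_per_sub_dataset: Sequence[float]) -> float:
    """Average the per-sub-dataset validation losses recorded at one step t."""
    if not losses_per_sub_dataset:
        raise ValueError("need at least one sub-dataset loss")
    return sum(losses_per_sub_dataset) / len(losses_per_sub_dataset)


def best_step(history: Sequence[Sequence[float]]) -> int:
    """Return the step index whose averaged sub-dataset loss is smallest."""
    averages = [mean_validation_loss(row) for row in history]
    return min(range(len(averages)), key=averages.__getitem__)


# Hypothetical losses: one row per training step, one column per sub-dataset.
history = [
    [0.90, 0.80, 0.70],
    [0.50, 0.60, 0.40],
    [0.45, 0.75, 0.60],
]
chosen = best_step(history)
```

In this made-up history the last step has the lowest loss on the first sub-dataset, yet the middle step is chosen because the criterion averages over all sub-datasets.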

Abstract

Image demosaicing is an important step in the image processing pipeline for digital cameras. In data-centric approaches, such as deep learning, the distribution of the dataset used for training can impose a bias on the networks' outcome. For example, in natural images most patches are smooth, and high-content patches are much rarer. This can bias the performance of demosaicing algorithms. Most deep learning approaches address this challenge by using specific losses or designing special network architectures. We propose a novel approach, SDAT (Sub-Dataset Alternation Training), that tackles the problem from a training protocol perspective. SDAT comprises two essential phases. In the first phase, we create sub-datasets from the entire dataset, each inducing a distinct bias. The second phase is an alternating training process that trains on the derived sub-datasets as well as on the entire dataset. SDAT can be applied regardless of the chosen architecture, as demonstrated by the experiments we conducted for the demosaicing task. The experiments cover a range of architecture sizes and types, namely CNNs and transformers. We show improved performance in all cases. We also achieve state-of-the-art results on three highly popular image demosaicing benchmarks.
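
The alternation between sub-datasets and the entire dataset can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation. It assumes a scalar model `y = w * x` trained by plain gradient descent, one cycle per sub-dataset, and a phase that keeps the weights with the lowest sub-dataset-averaged validation loss (the starting weights count as a candidate). All names, data, and hyperparameters are hypothetical.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (input x, target y) for the toy model y = w * x


def mse(w: float, data: Sequence[Sample]) -> float:
    """Mean squared error of the scalar model on one dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def mse_grad(w: float, data: Sequence[Sample]) -> float:
    """Derivative of mse with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)


def mean_val_loss(w: float, val_sets: Sequence[Sequence[Sample]]) -> float:
    """Validation loss averaged over all sub-dataset validation sets."""
    return sum(mse(w, v) for v in val_sets) / len(val_sets)


def run_phase(w_init, train_data, val_sets, steps, lr) -> Tuple[float, float, int]:
    """Train on one dataset and return the best weights, their averaged
    validation loss, and the step at which they occurred (0 = no step kept)."""
    best_w, best_v, kept = w_init, mean_val_loss(w_init, val_sets), 0
    w = w_init
    for step in range(1, steps + 1):
        w -= lr * mse_grad(w, train_data)
        v = mean_val_loss(w, val_sets)
        if v < best_v:
            best_w, best_v, kept = w, v, step
    return best_w, best_v, kept


def alternate(w0, sub_train, full_train, val_sets, steps=50, lr=0.05):
    """Each cycle: a phase on one sub-dataset, then a phase on the full set.
    Every phase starts from the best weights of the previous phase."""
    w, trace = w0, []  # trace holds (dataset label, steps kept, best loss)
    for name, sub in sub_train:
        for label, data in ((name, sub), ("full", full_train)):
            w, v, kept = run_phase(w, data, val_sets, steps, lr)
            trace.append((label, kept, v))
    return w, trace


# Two hypothetical sub-datasets with different biases (slopes 2 and 3).
sub_a = [(1.0, 2.0), (2.0, 4.0)]
sub_b = [(1.0, 3.0), (2.0, 6.0)]
val_sets = [[(3.0, 6.0)], [(3.0, 9.0)]]
w_final, trace = alternate(
    0.0, [("sub_a", sub_a), ("sub_b", sub_b)], sub_a + sub_b, val_sets
)
```

In this toy run the phase on `sub_b` keeps zero steps: starting from weights that already balance both sub-datasets, moving toward the `sub_b` bias only raises the averaged validation loss, so the incoming weights are passed on unchanged.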

Paper Structure

This paper contains 10 sections, 2 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: A comparison of PSNR results between our training method (SDAT, blue circles) and standard training on the entire dataset (EDT, red circles), applied to in-house and various popular architectures [xing2022residual, li2023efficient] on the Kodak dataset [li2008image]. Our training method achieves better results than standard training across all architectures. In addition, we achieve state-of-the-art (SOTA) results using the GRL architecture.
  • Figure 2: Qualitative comparison of our method with other top methods: RNAN, RSTCANet, and GRL. The RNAN model has 9M parameters; RSTCANet comes in three model sizes, B, S, and L, with 0.9M, 3.1M, and 7.1M parameters respectively; and GRL has 3.1M parameters. We show the results of using our suggested training scheme to train the RSTCANet-B and GRL models. The RSTCANet-B-SDAT model produces superior qualitative results compared to all original RSTCANet variants while having the fewest parameters.
  • Figure 3: An illustration of the SDAT method. Each cycle consists of two training phases. The first trains on a specific sub-dataset drawn from the pool of collected sub-datasets (see the paper's subsection on identifying sub-datasets), while the second trains on the entire dataset. Each phase is initialized with the model weights that achieved the lowest validation loss across all sub-datasets in the previous phase. A different sub-dataset is selected in every cycle. The number of cycles depends on the number of categories and on the architecture.
  • Figure 4: Depiction of the weight propagation process of SDAT, illustrating the alternation between a sub-dataset and the entire dataset. Each graph is a training phase over a single dataset, where the horizontal axis is the training steps and the vertical axis is the average validation loss across all sub-datasets, denoted $\bar{V}_t$. The star on each graph marks the iteration index at which that training phase stopped accumulating gradients for its dataset; the model weights reached at that index are propagated to the following training phase. Some training phases may not accumulate any gradients: as the rightmost graph shows, the model could not reach a point that lowers $\bar{V}_t$, so that phase accumulated zero training steps.
  • Figure 6: Our two-step elimination process for selecting the sub-datasets. (a) shows the first step, where each graph represents a training session over a sub-dataset (the horizontal axis shows the number of epochs, and the vertical axis shows the validation error). The green line represents the validation error of the entire dataset and the red line represents the validation error of the trained sub-dataset. Sub-datasets whose validation error converges with a strong negative correlation to the entire-dataset validation error are selected and highlighted in light blue. (b) depicts an example of the second step. The validation error of each sub-dataset chosen in (a) is compared with the validation error of the other sub-datasets. The objective is to merge sub-datasets whose validation errors converge with a positive correlation, since training on them induces a similar bias in the model. In the example, we compare the validation error of sub-dataset 1 (red line) with that of the other chosen sub-datasets (green line). The sub-datasets chosen to be merged with other sub-datasets are highlighted in light blue.
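
The two-step elimination described in the Figure 6 caption can be sketched as follows. The caption does not state how correlation is measured, so this sketch assumes Pearson correlation between validation-error curves, hypothetical thresholds of -0.5 and 0.5, and a simple greedy grouping rule for the merge step. The function names and the curves are made up for illustration.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length curves (0.0 if one is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def select_sub_datasets(
    own: Dict[str, Sequence[float]],
    full: Dict[str, Sequence[float]],
    neg_threshold: float = -0.5,  # assumed cut-off, not from the paper
) -> List[str]:
    """Step (a): keep sub-datasets whose own validation error is strongly
    negatively correlated with the entire-dataset validation error."""
    return [k for k in own if pearson(own[k], full[k]) <= neg_threshold]


def merge_correlated(
    curves: Dict[str, Sequence[float]],
    pos_threshold: float = 0.5,  # assumed cut-off, not from the paper
) -> List[List[str]]:
    """Step (b): greedily group sub-datasets whose validation errors are
    positively correlated with the first member of an existing group."""
    groups: List[List[str]] = []
    for name, curve in curves.items():
        for group in groups:
            if pearson(curves[group[0]], curve) >= pos_threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical per-epoch validation errors while training on each sub-dataset.
own = {"s1": [1.0, 0.8, 0.6, 0.5], "s2": [1.0, 0.7, 0.5, 0.4],
       "s3": [1.0, 0.9, 0.8, 0.7], "s4": [1.0, 0.6, 0.5, 0.3]}
full = {"s1": [0.5, 0.6, 0.7, 0.8], "s2": [0.5, 0.65, 0.7, 0.9],
        "s3": [0.5, 0.45, 0.4, 0.35], "s4": [0.5, 0.7, 0.75, 0.9]}
selected = select_sub_datasets(own, full)

# Hypothetical validation errors of the selected sub-datasets in one session.
session = {"s1": [1.0, 0.8, 0.6, 0.5], "s2": [0.9, 0.75, 0.6, 0.5],
           "s4": [0.4, 0.5, 0.65, 0.7]}
groups = merge_correlated({k: session[k] for k in selected})
```

Here `s3` is dropped in step (a) because its error falls together with the entire-dataset error, and `s1` and `s2` are merged in step (b) because their errors fall together.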