SemiISP/SemiIE: Semi-Supervised Image Signal Processor and Image Enhancement Leveraging One-to-Many Mapping sRGB-to-RAW

Masakazu Yoshimura, Junji Otsuka, Radu Berdan, Takeshi Ohashi

TL;DR

This work tackles the data bottleneck in training DNN-based ISP and IE systems by introducing a semi-supervised framework that exploits unlabeled data through a one-to-many sRGB-to-RAW mapping. By training an ISP in the RAW-to-sRGB direction and using its inverse with multiple parameter sets, the method generates diverse pseudo-RAW images, while a quality-updating step aligns pseudo-ground-truth with a small labeled set. An online loss-based pseudo-data filtering mechanism selects informative pseudo-samples, and the approach extends to IE with a simple inverse tone-mapping modification. Across diverse datasets, SemiISP and SemiIE yield substantial improvements, especially with limited labeled data, and enable personalized image quality for different use cases.
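The one-to-many idea can be illustrated with a toy example. The sketch below is not the authors' ISP: `toy_isp`, `toy_inverse_isp`, `sample_params`, and `one_to_many` are hypothetical names, the ISP is reduced to white-balance gains plus gamma on a single pixel, and the parameter sets are drawn at random (in the spirit of UPI) as a stand-in for the parameter sets the trained ISP would supply. It only shows that inverting an ISP under several parameter sets turns one sRGB value into several pseudo-RAW values, each of which the forward ISP maps back to the same sRGB value.

```python
import random


def toy_isp(raw, params):
    """Forward toy ISP: per-channel gain, clip to [0, 1], then gamma encoding."""
    gains, gamma = params["gains"], params["gamma"]
    return [min(1.0, max(0.0, v * g)) ** (1.0 / gamma) for v, g in zip(raw, gains)]


def toy_inverse_isp(srgb, params):
    """Inverse of toy_isp for unclipped values: undo gamma, then undo the gains."""
    gains, gamma = params["gains"], params["gamma"]
    return [(v ** gamma) / g for v, g in zip(srgb, gains)]


def sample_params(rng):
    """ASSUMPTION: random parameters stand in for ISP-predicted parameter sets."""
    return {
        "gains": [rng.uniform(1.5, 2.5), 1.0, rng.uniform(1.2, 2.0)],
        "gamma": rng.uniform(2.0, 2.4),
    }


def one_to_many(srgb, num_variants, rng):
    """Map one sRGB pixel to several (pseudo_raw, params) pairs."""
    variants = []
    for _ in range(num_variants):
        params = sample_params(rng)
        variants.append((toy_inverse_isp(srgb, params), params))
    return variants


if __name__ == "__main__":
    rng = random.Random(0)
    for pseudo_raw, params in one_to_many([0.8, 0.5, 0.3], 3, rng):
        # Each pseudo-RAW differs, yet the forward ISP recovers the same sRGB.
        print(pseudo_raw, toy_isp(pseudo_raw, params))
```

Each returned pair keeps the parameter set next to the pseudo-RAW value, since the forward ISP needs the same parameters to reproduce the original sRGB pixel.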

Abstract

DNN-based methods have been successful in image signal processor (ISP) and image enhancement (IE) tasks. However, the cost of creating training data for these tasks is considerably higher than for other tasks, making it difficult to prepare large-scale datasets. In addition, creating personalized ISP and IE models with minimal training data can lead to new value streams, since preferred image quality varies with the person and use case. While semi-supervised learning is a potential solution in such cases, it has rarely been used for these tasks. In this paper, we realize semi-supervised learning for ISP and IE by leveraging a RAW image reconstruction (sRGB-to-RAW) method. Although existing sRGB-to-RAW methods can generate pseudo-RAW image datasets that improve the accuracy of RAW-based high-level computer vision tasks such as object detection, their quality is not sufficient for ISP and IE tasks, which require a precise definition of image quality. Therefore, we also propose an sRGB-to-RAW method that can improve image quality on these tasks. The proposed semi-supervised learning, combined with the proposed sRGB-to-RAW method, improves the image quality of various models on various datasets.

Paper Structure

This paper contains 26 sections, 22 equations, 8 figures, 9 tables.

Figures (8)

  • Figure 1: Overview of our proposed semi-supervised learning for ISP. It requires only a small number of RAW and retouched sRGB image pairs with the desired image quality. Our method generates pseudo-data from normal-quality sRGB images using the proposed sRGB-to-RAW method and the proposed sRGB quality-updating method, and improves ISP quality with the proposed semi-supervised learning.
  • Figure 2: (a) The ISP cancels out ambient light, resulting in a many-to-one mapping. (b) Many DNN-based sRGB-to-RAW methods restore RAW images deterministically, generating realistic data that is biased towards, for example, normal environments. (c) UPI (Brooks et al., 2019) achieves a one-to-many mapping by randomly assigning the hyperparameters of a simple inverse ISP, but some of its outputs are unrealistic. (d) We achieve a one-to-many mapping with realistic quality and distribution by using the parameter set $P$ of a high-performance ISP in its inverse function.
  • Figure 3: (a) In the training phase of our sRGB-to-RAW method, we train in the ISP direction, in particular the DNN part that generates the ISP parameters $P$. (b) In the pseudo-data generation phase, we apply our sRGB-to-RAW method and ISP to update the quality of general sRGB images so that it resembles that of the real sRGB images. We then apply our sRGB-to-RAW method again to generate realistic pseudo-RAW images.
  • Figure 4: Left: visual comparison of sRGB-to-RAW methods on the FiveK tone mapping task with 100 training samples. Three variations are shown for each non-deterministic sRGB-to-RAW method. Ours* is used when there is sufficient training data to cover a slightly wider domain than real RAW images. Right: an example in which sRGB images of normal and varied quality from the COCO dataset are converted to sRGB images in the style of FiveK's ground truth images using the proposed method.
  • Figure 5: Left: results of varying the ratio of the real-data batch size $N_r$ to the pseudo-data batch size $N_p$ during semi-supervised learning on the FiveK tone mapping task. Right: results of varying the threshold $\beta$.
  • ...and 3 more figures
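
To make the roles of $N_r$, $N_p$, and $\beta$ in Figure 5 concrete, here is a minimal sketch of mixing real and pseudo samples in one batch with online loss-based filtering. The text above only says that the filtering is online, loss-based, and controlled by a threshold $\beta$. The specific rule used here (keep a pseudo-sample when its loss is at most `beta`) and the names `filter_pseudo` and `mixed_batch_loss` are assumptions for illustration, not the authors' formulation.

```python
def filter_pseudo(pseudo_losses, beta):
    """Return indices of pseudo-samples to keep.

    ASSUMPTION: a pseudo-sample is kept when its loss is <= beta. The source
    only says that filtering is loss-based with a threshold beta.
    """
    return [i for i, loss in enumerate(pseudo_losses) if loss <= beta]


def mixed_batch_loss(real_losses, pseudo_losses, beta):
    """Average loss over N_r real samples and the pseudo-samples that survive.

    real_losses:   per-sample losses of the N_r real (labeled) samples
    pseudo_losses: per-sample losses of the N_p pseudo-samples
    """
    kept = filter_pseudo(pseudo_losses, beta)
    total = sum(real_losses) + sum(pseudo_losses[i] for i in kept)
    count = len(real_losses) + len(kept)
    return total / count, kept


if __name__ == "__main__":
    # N_r = 2 real samples, N_p = 3 pseudo-samples, threshold beta = 0.5.
    loss, kept = mixed_batch_loss([0.2, 0.4], [0.1, 0.9, 0.3], beta=0.5)
    print(kept, loss)
```

Under this assumed rule, raising `beta` admits more pseudo-samples into each update, while the ratio of `len(real_losses)` to `len(pseudo_losses)` plays the role of $N_r$ to $N_p$.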