Semi-Supervised High Dynamic Range Image Reconstructing via Bi-Level Uncertain Area Masking

Wei Jiang, Jiahao Cui, Yizheng Wu, Zhan Peng, Zhiyu Pan, Zhiguo Cao

TL;DR

The paper tackles HDR reconstruction from LDR bursts when HDR ground-truth data are scarce. It introduces a semi-supervised teacher–student framework in which an EMA-updated teacher generates pseudo HDR GTs and a judge network estimates pixelwise uncertainty to guide bi-level masking of the pseudo labels. A dedicated uncertainty loss $\mathcal{L}^k$ and strong data augmentations support learning from unlabeled data, and the masking keeps only pseudo-GT regions that are reliable at both the patch and pixel levels. Experiments on the Kalantari and Hu datasets show state-of-the-art performance when only 6.7% of the HDR GTs are used, approaching fully supervised methods with a significantly reduced annotation burden.
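The EMA update of the teacher mentioned above can be illustrated with a minimal sketch. It assumes the weights are a plain dictionary of floats and uses a hypothetical `momentum` default of 0.999; neither detail comes from the paper, and a real implementation would operate on network tensors.

```python
def ema_update(teacher, student, momentum=0.999):
    """Return new teacher weights: momentum * teacher + (1 - momentum) * student.

    `teacher` and `student` map parameter names to floats. The default
    momentum of 0.999 is an assumed value, not one reported in the paper.
    """
    if teacher.keys() != student.keys():
        raise ValueError("teacher and student must share parameter names")
    return {
        name: momentum * teacher[name] + (1.0 - momentum) * student[name]
        for name in teacher
    }


# Toy weights; after one step the teacher moves slightly toward the student.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 1.0}
teacher = ema_update(teacher, student, momentum=0.9)
print(teacher)
```

Because the teacher only drifts toward the student, its pseudo GTs change more smoothly than the student's own predictions from one step to the next.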

Abstract

Reconstructing high dynamic range (HDR) images from low dynamic range (LDR) bursts plays an essential role in computational photography. Learning-based algorithms have achieved impressive progress, but they require LDR-HDR image pairs. These pairs are hard to obtain, which motivates the problem of annotation-efficient HDR image reconstructing: how to achieve comparable performance with limited HDR ground truths (GTs). This work addresses the problem from the perspective of semi-supervised learning, where a teacher model generates pseudo HDR GTs for the LDR samples without GTs and a student model learns from these pseudo GTs. However, confirmation bias, i.e., the student learning from the artifacts in pseudo HDR GTs, presents an impediment. To remove this impediment, an uncertainty-based masking process is proposed to discard unreliable parts of pseudo GTs at both the pixel and patch levels, so that the student learns only from the trusted areas. With this novel masking process, our semi-supervised HDR reconstructing method not only outperforms previous annotation-efficient algorithms, but also achieves performance comparable to up-to-date fully supervised methods while using only 6.7% of the HDR GTs.
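A minimal sketch of masking at both the pixel and patch levels, under stated assumptions: the uncertainty map is a nested list of values in [0, 1], a patch is rejected when its mean uncertainty exceeds a threshold, and a pixel is rejected when its own uncertainty exceeds a second threshold. The function names, the thresholds `tau_pixel` and `tau_patch`, and the mean-based patch criterion are all hypothetical; this excerpt does not state the paper's actual masking rules.

```python
def pixel_mask(uncertainty, tau_pixel):
    """1 where a pixel's uncertainty is at most tau_pixel, else 0."""
    return [[1 if u <= tau_pixel else 0 for u in row] for row in uncertainty]


def patch_mask(uncertainty, patch, tau_patch):
    """1 for every pixel of a patch whose mean uncertainty is at most tau_patch."""
    h, w = len(uncertainty), len(uncertainty[0])
    mask = [[0] * w for _ in range(h)]
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            rows = range(top, min(top + patch, h))
            cols = range(left, min(left + patch, w))
            values = [uncertainty[r][c] for r in rows for c in cols]
            keep = 1 if sum(values) / len(values) <= tau_patch else 0
            for r in rows:
                for c in cols:
                    mask[r][c] = keep
    return mask


def bi_level_mask(uncertainty, patch=2, tau_pixel=0.5, tau_patch=0.4):
    """Keep a pixel only if it passes both the patch-level and pixel-level tests."""
    pm = pixel_mask(uncertainty, tau_pixel)
    qm = patch_mask(uncertainty, patch, tau_patch)
    return [[a * b for a, b in zip(r1, r2)] for r1, r2 in zip(pm, qm)]


# Toy 4x4 uncertainty map: the top-right 2x2 patch is unreliable as a whole,
# and two isolated pixels elsewhere are unreliable on their own.
uncertainty_map = [
    [0.1, 0.2, 0.9, 0.8],
    [0.1, 0.7, 0.9, 0.7],
    [0.0, 0.1, 0.2, 0.1],
    [0.1, 0.0, 0.6, 0.1],
]
mask = bi_level_mask(uncertainty_map)
for row in mask:
    print(row)
```

Combining the two levels means a largely corrupted patch is dropped entirely, while an otherwise trustworthy patch loses only its individual uncertain pixels.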

Paper Structure

This paper contains 18 sections, 12 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Qualitative comparison with prior art. The illustrated example shows that, trained with only $6.7\%$ of the GTs, our method outperforms previous annotation-efficient methods and achieves comparable or even better qualitative performance than methods trained with all GTs.
  • Figure 2: The pipeline of the proposed annotation-efficient HDR image reconstructing framework. The framework follows the teacher-student structure. The teacher branch predicts both the pseudo HDR GTs and the corresponding uncertainty maps. Each uncertainty map estimates the per-pixel reliability of a pseudo GT, so the uncertain regions can be discarded and the student is largely shielded from the teacher's mistakes. The uncertain regions of the pseudo GTs are masked out at both the patch and pixel levels. The student model learns from the masked pseudo GTs with the unsupervised loss ($\mathcal{L}_u$) and from the real GTs with the supervised loss ($\mathcal{L}_s$). The teacher model is then updated from the student model via exponential moving average (EMA) [tarvainen2017mean].
  • Figure 3: The architecture of the judge network. The input LDR images are first fed to the attention-based feature extraction network [liu2022ghost] to obtain the fused feature $F_{att}$. Three $3\times3$ convolution layers with an additive skip connection then generate the uncertainty map.
  • Figure 4: Examples of qualitative results on datasets with GT. Subfigures (a) and (b) show examples from Kalantari’s [kalantari2017deep] and Hu’s [hu2013hdr] datasets, respectively.
  • Figure 5: Qualitative comparison of pseudo HDR GT masking strategies. Boxed and zoomed regions highlight notable artifacts in pseudo GTs. Columns 3–6 show APSS uncertainty maps in SMAE and our method at pixel and patch levels. To aid visual comparison, the heat maps are normalized; higher colorbar values indicate more artifacts in the pseudo GTs.
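The Figure 2 caption describes a student trained with a supervised loss $\mathcal{L}_s$ on real GTs and an unsupervised loss $\mathcal{L}_u$ on masked pseudo GTs. The sketch below shows one way such a combination might look, assuming an L1 distance for both terms, a binary trust mask, and a hypothetical weight `lam`; the paper's actual loss definitions and weighting are not given in this excerpt.

```python
def l1_loss(pred, target):
    """Mean absolute error over all pixels (stand-in for L_s)."""
    diffs = [abs(p - t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    return sum(diffs) / len(diffs)


def masked_l1_loss(pred, pseudo_gt, mask):
    """Mean absolute error over trusted pixels only (stand-in for L_u).

    Returns 0.0 when every pixel is masked out, so a fully unreliable
    pseudo GT contributes nothing to the total.
    """
    total, kept = 0.0, 0
    for pr, gr, mr in zip(pred, pseudo_gt, mask):
        for p, g, m in zip(pr, gr, mr):
            if m:
                total += abs(p - g)
                kept += 1
    return total / kept if kept else 0.0


def total_loss(sup_pred, real_gt, unsup_pred, pseudo_gt, mask, lam=1.0):
    """L_s + lam * L_u; the weight `lam` is an assumed hyperparameter."""
    return l1_loss(sup_pred, real_gt) + lam * masked_l1_loss(unsup_pred, pseudo_gt, mask)


# Toy 1x2 images: the second pseudo-GT pixel is masked out as unreliable,
# so the student's large error there is ignored.
loss = total_loss(
    sup_pred=[[0.5, 0.5]], real_gt=[[0.0, 1.0]],
    unsup_pred=[[0.2, 0.9]], pseudo_gt=[[0.0, 0.0]], mask=[[1, 0]],
)
print(loss)
```

Normalizing by the number of trusted pixels, rather than by the image size, keeps the unsupervised term on a comparable scale regardless of how much of the pseudo GT is masked out.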