Polarization Uncertainty-Guided Diffusion Model for Color Polarization Image Demosaicking

Chenggong Li, Yidong Luo, Junchao Zhang, Degui Yang

TL;DR

The image diffusion prior from text-to-image (T2I) models is introduced to overcome the performance bottleneck of network-based methods, with the additional diffusion prior compensating for limited representational capacity caused by restricted data distribution.

Abstract

Color polarization demosaicking (CPDM) aims to reconstruct full-resolution polarization images in four directions from the color-polarization filter array (CPFA) raw image. Because many missing pixels must be predicted and high-quality training data are scarce, existing network-based methods recover scene intensity effectively but still exhibit significant errors in reconstructing polarization characteristics (degree of polarization, DOP, and angle of polarization, AOP). To address this problem, we introduce the image diffusion prior from text-to-image (T2I) models to overcome the performance bottleneck of network-based methods, with the additional diffusion prior compensating for the limited representational capacity caused by the restricted data distribution. To leverage the diffusion prior effectively, we explicitly model the polarization uncertainty during reconstruction and use this uncertainty to guide the diffusion model in recovering high-error regions. Extensive experiments demonstrate that the proposed method accurately recovers scene polarization characteristics with both high fidelity and strong visual perception.
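
For readers unfamiliar with the two polarization characteristics named above, the following is a minimal sketch of how DOP and AOP are commonly computed from four directional intensity images using the standard linear Stokes relations. The function names, the assumption that the four directions are 0°, 45°, 90°, and 135°, and the `eps` guard are illustrative choices, not taken from the paper.

```python
import numpy as np

def linear_stokes(i0, i45, i90, i135):
    """Linear Stokes parameters from four polarizer-angle images.

    Assumes the four directions are 0, 45, 90 and 135 degrees
    (an assumption; the excerpt only says "four directions").
    """
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical
    s2 = i45 - i135                     # diagonal vs. anti-diagonal
    return s0, s1, s2

def dop_aop(i0, i45, i90, i135, eps=1e-8):
    """Degree (in [0, 1]) and angle (radians) of linear polarization."""
    s0, s1, s2 = linear_stokes(i0, i45, i90, i135)
    # eps avoids division by zero in dark pixels (hypothetical guard value).
    dop = np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, eps)
    aop = 0.5 * np.arctan2(s2, s1)
    return np.clip(dop, 0.0, 1.0), aop

# Fully polarized light at 0 degrees, then unpolarized light.
polarized = [np.array([v]) for v in (1.0, 0.5, 0.0, 0.5)]
unpolarized = [np.array([0.5]) for _ in range(4)]
dop_p, aop_p = dop_aop(*polarized)
dop_u, aop_u = dop_aop(*unpolarized)
```

Because DOP divides by the total intensity and AOP is an arctangent of differences, small per-pixel errors in the four reconstructed images can be amplified in both quantities, which is consistent with the abstract's observation that intensity is easier to recover than polarization.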

Paper Structure (13 sections, 18 equations, 7 figures, 3 tables)

Figures (7)

  • Figure 1: The workflow of color polarization imaging. The second row presents the AOP results of different CPDM methods, where AOP reflects the accuracy of polarization property reconstruction. Our method exhibits the least noise interference and achieves the most visually faithful results.
  • Figure 2: The framework of the proposed PUGDiff. (a) The architecture of the base branch: the mosaic array is initialized as a low-quality color polarization image and fed into the base branch, and the network is updated via $\mathcal{L}_B$. (b) The architecture of the SD branch: the results of the base branch are further refined by the SD branch. LoRA layers are injected into SD to adapt it to the CPDM task, while the cross-attention modules are pruned to improve efficiency; the LoRA parameters are updated via $\mathcal{L}_{SD}$. (c) The fusion process of the two branches, which is also used during inference. The contribution of each branch is adaptively weighted by polarization uncertainty. In low-uncertainty regions, the base branch is favored for high fidelity; in high-uncertainty regions, the SD branch is used to enhance the polarization property.
  • Figure 3: The architecture of the uncertainty estimation network. The network's backbone shares the same architecture as the base branch but is augmented with an additional estimation head that outputs polarization uncertainty. The entire network is updated via $\mathcal{L}_{PU}$, and during the training phase in Fig. 2(c), only the polarization uncertainty is used. The right part of the figure shows the reconstructed DOP and the polarization uncertainty. High values in the heatmap indicate regions where DOP reconstruction is inaccurate.
  • Figure 4: Visual comparisons of different CPDM methods. On both synthetic and real-captured images, our method achieves the best polarization reconstruction performance in terms of AOP and DOP.
  • Figure 5: Visualization of results from different branches. Our method fuses the two branches to achieve complementary strengths: using the SD branch in high-uncertainty regions to improve reconstruction, and the base branch in low-uncertainty regions to preserve fidelity.
  • ...and 2 more figures
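
The uncertainty-weighted fusion described in the captions for Figures 2(c) and 5 can be illustrated with a short sketch. The excerpt does not give the actual fusion rule, so the per-pixel linear blend below, the assumption that uncertainty is already scaled to [0, 1], and the names `fuse_branches`, `base_out`, `sd_out`, and `uncertainty` are all hypothetical.

```python
import numpy as np

def fuse_branches(base_out, sd_out, uncertainty):
    """Blend two branch outputs with a per-pixel uncertainty weight.

    Assumed rule (not from the paper): out = (1 - u) * base + u * sd,
    where u in [0, 1] is the polarization uncertainty map.
    base_out, sd_out: arrays of shape (H, W, C).
    uncertainty: array of shape (H, W).
    """
    u = np.clip(np.asarray(uncertainty, dtype=np.float64), 0.0, 1.0)
    w = u[..., None]  # add a channel axis so the weight broadcasts over C
    # Low uncertainty keeps the base branch; high uncertainty favors SD.
    return (1.0 - w) * base_out + w * sd_out

# Toy example: base branch outputs zeros, SD branch outputs ones.
base = np.zeros((2, 2, 3))
sd = np.ones((2, 2, 3))
u_map = np.array([[0.0, 1.0], [0.5, 0.25]])
fused = fuse_branches(base, sd, u_map)
```

With this assumed rule, a pixel with zero uncertainty reproduces the base branch exactly and a pixel with uncertainty one reproduces the SD branch exactly, which mirrors the qualitative behavior the captions describe.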