PID: Physics-Informed Diffusion Model for Infrared Image Generation

Fangyuan Mao, Jilin Mei, Shun Lu, Fuyang Liu, Liang Chen, Fangzhou Zhao, Yu Hu

TL;DR

This paper proposes a Physics-Informed Diffusion (PID) model that translates RGB images into infrared images that adhere to physical laws. The model leverages the iterative optimization of the diffusion model and, during training, incorporates strong physical constraints based on prior knowledge of infrared laws.

Abstract

Infrared imaging technology has gained significant attention for its reliable sensing ability in low-visibility conditions, prompting many studies to convert abundant RGB images into infrared images. However, most existing image translation methods treat infrared images as a stylistic variation and neglect the underlying physical laws, which limits their practical application. To address this limitation, we propose a Physics-Informed Diffusion (PID) model for translating RGB images to infrared images that adhere to physical laws. Our method leverages the iterative optimization of the diffusion model and incorporates strong physical constraints based on prior knowledge of infrared laws during training. This approach enhances the similarity between translated infrared images and the real infrared domain without adding extra trainable parameters. Experimental results demonstrate that PID significantly outperforms existing state-of-the-art methods. Our code is available at https://github.com/fangyuanmao/PID.

Paper Structure

This paper contains 45 sections, 37 equations, 9 figures, 10 tables, 1 algorithm.

Figures (9)

  • Figure 1: Infrared images translated from an RGB image by different methods. The image generated by EGGAN (Lee2023EdgeguidedMR) shows physical inaccuracies, such as trees appearing hotter than cars (a lighter grey represents a higher temperature). Our result is closer to the ground truth and adds details in accordance with physical laws, such as the heat generated by moving tires.
  • Figure 1: Common materials and their emissivities at different wavelengths. $e$ represents emissivity and $\lambda$ represents wavelength. HADAR (Bao2023HeatassistedDA) collects the emissivities of some of the common materials in the figure. Left: the emissivities of different materials at different wavelengths. Right: the statistical result of the emissivities of different materials. Both panels indicate that the emissivity of a given material changes little across wavelengths.
  • Figure 2: The infrared signal transmission chain when an infrared camera captures an image of pedestrians. The captured infrared signal primarily includes three components (Vollmer2010InfraredTI): thermal radiation emitted by the object $\Phi_{\text{object}}$, thermal radiation reflected from other objects $\Phi_{\text{env}}$, and atmospheric thermal radiation $\Phi_{\text{atm}}$. $\tau$ represents the transmissivity of the atmosphere, while $e$ represents the emissivity of the detected object.
  • Figure 3: Overview of our proposed PID. (a) Pretraining of $\mathcal{N}_{\text{TeV}}$: PID trains $\mathcal{N}_{\text{TeV}}$ with a self-supervised loss on an infrared dataset. (b) Physics-informed latent diffusion model training: the infrared GT image $\boldsymbol{x}_0$ is encoded by the pretrained encoder $\mathcal{E}$ to obtain the latent space vector $\boldsymbol{z}_0$. Gaussian noise is then added to $\boldsymbol{z}_0$ to produce the intermediate vector $\boldsymbol{z}_t$. The denoising UNet is trained to predict the added noise, conditioned on the RGB image. In addition to $\mathcal{L}_{\text{Noise}}$, $\mathcal{L}_{\text{TeV}}$ and $\mathcal{L}_{\text{Rec}}$ are also incorporated into the training process. During this process, the weights of $\mathcal{E}$, $\mathcal{D}$ and $\mathcal{N}_{\text{TeV}}$ are frozen, allowing PID to learn the infrared features and physical laws without adding extra trainable parameters. (c) The inference process of PID.
  • Figure 4: Qualitative results on the KAIST dataset. Magnified regions are highlighted with boxes for easier comparison. PID demonstrates strong robustness in both daytime and nighttime scenes.
  • ...and 4 more figures
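
The three-component signal chain described in the Figure 2 caption can be written compactly in the standard textbook form of the thermography measurement equation. This is only a sketch under stated assumptions: the object is taken to be opaque (so its reflectivity is $1-e$), the atmosphere is taken to emit with emissivity $1-\tau$, and the symbols $\Phi_{\text{cam}}$, $\Phi_{\text{bb}}$, $T_{\text{obj}}$, $T_{\text{env}}$ and $T_{\text{atm}}$ are introduced here for illustration and do not appear in the captions. The paper's own formulation may differ.

```latex
% Phi_bb(T): blackbody radiation at temperature T (assumed symbol)
% Phi_cam: total signal reaching the camera (assumed symbol)
\Phi_{\text{cam}}
  = \underbrace{\tau\, e\, \Phi_{\text{bb}}(T_{\text{obj}})}_{\text{emitted by the object}}
  + \underbrace{\tau\,(1-e)\,\Phi_{\text{bb}}(T_{\text{env}})}_{\text{reflected from other objects}}
  + \underbrace{(1-\tau)\,\Phi_{\text{bb}}(T_{\text{atm}})}_{\text{atmospheric radiation}}
```

Under these assumptions, the three terms correspond in order to the contributions labelled $\Phi_{\text{object}}$, $\Phi_{\text{env}}$ and $\Phi_{\text{atm}}$ in the caption. With $\tau \to 1$ and $e \to 1$, only the object's own emission remains.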
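
The training step described in Figure 3(b) can be sketched roughly as follows, using NumPy rather than a deep-learning framework. This is a minimal illustration, not the authors' implementation: the linear noise schedule, the function names, the loss weights `w_tev` and `w_rec`, and the callables `encode`, `decode`, `denoiser`, `tev_loss` and `rec_loss` are all hypothetical stand-ins. In particular, the captions do not define $\mathcal{L}_{\text{TeV}}$ and $\mathcal{L}_{\text{Rec}}$, so both are passed in as placeholders that compare the estimated clean image with the ground truth.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(z0, eps, alpha_bar_t):
    """Forward diffusion: z_t = sqrt(a) * z_0 + sqrt(1 - a) * eps."""
    return np.sqrt(alpha_bar_t) * z0 + np.sqrt(1.0 - alpha_bar_t) * eps

def estimate_z0(zt, eps_pred, alpha_bar_t):
    """Invert the forward step using the predicted noise."""
    return (zt - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)

def training_loss(x0, rgb, t, alpha_bar, encode, decode, denoiser,
                  tev_loss, rec_loss, rng, w_tev=1.0, w_rec=1.0):
    """One loss evaluation; every callable is a hypothetical stand-in."""
    z0 = encode(x0)                          # frozen encoder E
    eps = rng.standard_normal(z0.shape)      # Gaussian noise
    zt = add_noise(z0, eps, alpha_bar[t])    # noisy latent z_t
    eps_pred = denoiser(zt, rgb, t)          # UNet conditioned on the RGB image
    x0_hat = decode(estimate_z0(zt, eps_pred, alpha_bar[t]))  # frozen decoder D

    l_noise = float(np.mean((eps_pred - eps) ** 2))
    l_tev = float(tev_loss(x0_hat, x0))      # placeholder for L_TeV
    l_rec = float(rec_loss(x0_hat, x0))      # placeholder for L_Rec
    total = l_noise + w_tev * l_tev + w_rec * l_rec
    return {"total": total, "noise": l_noise, "tev": l_tev, "rec": l_rec}
```

Estimating the clean latent from the predicted noise is what lets image-space penalties be evaluated at every diffusion step. Because only the denoiser would receive gradients in such a setup, the extra loss terms add no trainable parameters, which is consistent with the caption's statement that $\mathcal{E}$, $\mathcal{D}$ and $\mathcal{N}_{\text{TeV}}$ are frozen.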