VIIS: Visible and Infrared Information Synthesis for Severe Low-light Image Enhancement

Chen Zhao, Mengyuan Yu, Fan Yang, Peiguang Jing

TL;DR

VIIS addresses severe low-light information loss by jointly enhancing visible content and colorizing infrared information through a diffusion-model framework. It introduces a sparse attention-based dual-modalities residual (SADMR) conditioning mechanism and an information synthesis pretext task (ISPT) to enable both inter-modal complementation and intra-modal enhancement without ground-truth data. Across MSRS and KAIST-MS, VIIS outperforms state-of-the-art low-light enhancement, infrared colorization, and fusion methods, as well as newly constructed baselines that combine these components. The approach has practical implications for robust night-time imaging in surveillance and wildlife monitoring, where infrared information can reveal critical situational details.

Abstract

Images captured in severe low-light circumstances often suffer from significant information absence. Existing singular modality image enhancement methods struggle to restore image regions lacking valid information. By leveraging light-impervious infrared images, visible and infrared image fusion methods have the potential to reveal information hidden in darkness. However, they primarily emphasize inter-modal complementation but neglect intra-modal enhancement, limiting the perceptual quality of output images. To address these limitations, we propose a novel task, dubbed visible and infrared information synthesis (VIIS), which aims to achieve both information enhancement and fusion of the two modalities. Given the difficulty in obtaining ground truth in the VIIS task, we design an information synthesis pretext task (ISPT) based on image augmentation. We employ a diffusion model as the framework and design a sparse attention-based dual-modalities residual (SADMR) conditioning mechanism to enhance information interaction between the two modalities. This mechanism enables features with prior knowledge from both modalities to adaptively and iteratively attend to each modality's information during the denoising process. Our extensive experiments demonstrate that our model qualitatively and quantitatively outperforms not only the state-of-the-art methods in relevant fields but also the newly designed baselines capable of both information enhancement and fusion. The code is available at https://github.com/Chenz418/VIIS.
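
The abstract says only that ISPT is "based on image augmentation"; it does not give the augmentation itself. As a minimal sketch of how a pseudo-low-light image could be produced from a brighter one, the Python below applies gamma darkening, a brightness scale, and Gaussian noise. The function name `make_pseudo_low_light` and the parameters `gamma`, `scale`, and `noise_std` are hypothetical and are not taken from the paper or its repository.

```python
import random


def make_pseudo_low_light(image, gamma=2.5, scale=0.3, noise_std=0.02, seed=0):
    """Darken a grayscale image given as nested lists of floats in [0, 1].

    All parameter values are illustrative assumptions, not the paper's settings.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    degraded = []
    for row in image:
        new_row = []
        for value in row:
            dark = scale * (value ** gamma)    # compress brightness toward zero
            dark += rng.gauss(0.0, noise_std)  # add sensor-like noise
            new_row.append(min(1.0, max(0.0, dark)))  # clamp to the valid range
        degraded.append(new_row)
    return degraded


# Tiny 2x2 example image standing in for a visible-light frame.
visible = [[0.2, 0.8], [0.5, 1.0]]
pseudo_low = make_pseudo_low_light(visible)
```

A degradation of this kind makes it possible to build training inputs without collected ground truth, which is the motivation the abstract gives for ISPT. How the authors pair these inputs with a training target is not stated in this summary.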

Paper Structure

This paper contains 22 sections, 7 equations, 13 figures, 8 tables.

Figures (13)

  • Figure 1: The existing low-light enhancement method EnlightenGAN [jiang2021enlightengan] fails to effectively enhance the image background, where most information is obliterated by darkness. The visible and infrared image fusion method CDDFuse [Zhao_2023_CVPR] reveals the outlines of people and buildings; however, the overall image remains dark, and the region complemented by the infrared image remains sketchy.
  • Figure 2: Overview of our model. The information synthesis pretext task (ISPT) first generates pseudo-low-light visible images with a data augmentation strategy. Then, in one branch, the latent-space infrared and visible images are concatenated with noise. In the other branch, these latent-space images are encoded into multi-scale features, which are injected into the U-Net-based denoising network through sparse cross-attention.
  • Figure 3: Qualitative comparison of '01012N' from MSRS dataset.
  • Figure 4: Qualitative comparison of '01198N' from MSRS dataset.
  • Figure 5: Qualitative comparison of 'set09_v000_00330' from KAIST-MS dataset.
  • ...and 8 more figures
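
The Figure 2 caption mentions sparse cross-attention between the denoising network and the encoded condition features, but this summary does not say how the sparsity is obtained. The sketch below assumes one common choice, top-k selection, in which each query attends only to its `top_k` highest-scoring keys. The function name, the `top_k` parameter, and the toy vectors are hypothetical. The residual connection and multi-scale injection named in SADMR are omitted.

```python
import math


def sparse_cross_attention(queries, keys, values, top_k=2):
    """Top-k cross-attention over plain Python lists of vectors.

    queries stand in for denoiser features; keys and values stand in for
    condition features. Top-k sparsity is an assumption, not the paper's rule.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Scaled dot-product score between this query and every key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        # Keep only the indices of the top_k largest scores.
        keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
        # Softmax restricted to the kept keys (shifted for numerical stability).
        peak = max(scores[i] for i in keep)
        weights = {i: math.exp(scores[i] - peak) for i in keep}
        total = sum(weights.values())
        out = [0.0] * len(values[0])
        for i, w in weights.items():
            for d, v in enumerate(values[i]):
                out[d] += (w / total) * v
        outputs.append(out)
    return outputs


# One query and three condition tokens; the third key points away from the query.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [100.0, 100.0]]
attended = sparse_cross_attention(queries, keys, values, top_k=2)
```

With `top_k=2` the third token is dropped before the softmax, so its large value vector contributes nothing to the output. A dense softmax would still assign it a small nonzero weight.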