Beyond Night Visibility: Adaptive Multi-Scale Fusion of Infrared and Visible Images

Shufan Pei, Junhong Lin, Wenxi Liu, Tiesong Zhao, Chia-Wen Lin

TL;DR

AMFusion is an Adaptive Multi-scale Fusion network for infrared and visible images. It designs fusion rules according to different illumination regions and introduces a Detection-guided Semantic Fusion Module (DSFM) to bridge the domain gap between detection and semantic features.

Abstract

In addition to low light, night images suffer degradation from light effects (e.g., glare and floodlight). However, existing nighttime visibility enhancement methods generally focus on low-light regions, neglecting or even amplifying the light effects. To address this issue, we propose an Adaptive Multi-scale Fusion network (AMFusion) for infrared and visible images, which designs fusion rules according to different illumination regions. First, we separately fuse spatial and semantic features from infrared and visible images, where the former are used to adjust the light distribution and the latter are used to improve detection accuracy. We thereby obtain an image free of low light and light effects, which improves the performance of nighttime object detection. Second, we use detection features extracted by a pre-trained backbone to guide the fusion of semantic features. To this end, we design a Detection-guided Semantic Fusion Module (DSFM) to bridge the domain gap between detection and semantic features. Third, we propose a new illumination loss that constrains the fused image to have normal light intensity. Experimental results demonstrate the superiority of AMFusion in both visual quality and detection accuracy. The source code will be released after the peer review process.
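
As a rough illustration of what "fusion rules according to different illumination regions" can mean, the sketch below fuses a visible and an infrared image pixel by pixel with a hand-crafted weight. It is not the AMFusion network, which learns its fusion on multi-scale features; the function names `visible_weight` and `fuse` and the thresholds `low` and `high` are assumptions made only for this example. The rule trusts the visible image in normally lit regions and falls back to the infrared image in dark regions and in over-exposed regions such as glare.

```python
def visible_weight(v, low=0.2, high=0.8):
    """Weight given to a visible-image pixel with intensity v in [0, 1].

    Hypothetical hand-crafted rule (not the learned rule from the paper):
    - v < low  (under-exposed): weight ramps linearly from 0 up to 1
    - v > high (glare, floodlight): weight ramps linearly from 1 down to 0
    - otherwise (normal light): weight is 1
    """
    if v < low:
        return v / low
    if v > high:
        return (1.0 - v) / (1.0 - high)
    return 1.0


def fuse(visible, infrared, low=0.2, high=0.8):
    """Blend two equally sized grayscale images given as nested lists."""
    fused = []
    for vis_row, ir_row in zip(visible, infrared):
        row = []
        for v, ir in zip(vis_row, ir_row):
            w = visible_weight(v, low, high)
            # Convex combination: the infrared pixel fills in wherever
            # the visible pixel is too dark or too bright.
            row.append(w * v + (1.0 - w) * ir)
        fused.append(row)
    return fused


if __name__ == "__main__":
    visible = [[0.0, 0.5, 1.0]]   # dark, normal, saturated
    infrared = [[0.6, 0.1, 0.3]]
    print(fuse(visible, infrared))
```

In this toy example the dark and saturated pixels take their values from the infrared image, while the normally lit pixel keeps its visible value. A learned model would replace the fixed thresholds with weights predicted from the features themselves.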

Paper Structure (16 sections, 19 equations, 9 figures, 3 tables)

Figures (9)

  • Figure 1: Our method better removes the masking effect caused by high beams. (a) Visible image. (b) Infrared image. (c) Low-light enhancement of (a) by jiang2021enlightengan. (d) Light-effects suppression of (a) by jin2022unsupervised. (e) Multi-modality image fusion result of (a) and (b) by tang2023divfusion. (f) Fusion result of (a) and (b) by our method, AMFusion.
  • Figure 2: We introduce an infrared image to provide extra information, then use a fusion model to achieve our goal. Moreover, we design a new illumination loss to obtain a normal light distribution.
  • Figure 3: The architecture of our proposed AMFusion. It consists of a Multi-scale Feature Extraction Module (MFEM), an Illumination-guided Detail Fusion Module (IDFM), a Detection-guided Semantic Fusion Module (DSFM), and a Multi-scale Reconstruction Module (MRM). MFEM extracts spatial features $F^{sp}_{D/G}$ and semantic features $F^{se}_{D/G}$ at different scales. IDFM integrates spatial features based on the light distribution. DSFM uses detection features to guide the fusion of semantic features. The details of IDFM and DSFM are illustrated in Figure 5. MRM combines features from all scales to generate the fused image, where $R_{i}$ denotes the reconstruction module at each scale, which contains convolution operations, and the Semantic-Guided Rectify Module (SRM) implements the final adjustment.
  • Figure 4: Intermediate results. Brighter pixels represent higher values. (a) Visible features; (b) Infrared features; (c) Weights of (a) and (b); (d) Fused features.
  • Figure 5: The detailed architecture of IDFM and DSFM.
  • ...and 4 more figures