Using KL-Divergence to Focus Frequency Information in Low-Light Image Enhancement

Yan Xingyang, Huang Xiaohong, Zhang Zhao, You Tian, Xu Ziheng

TL;DR

This work tackles low-light image enhancement by shifting from pixel-wise losses to distribution-based fitting in the Fourier domain. The proposed LLFDisc model uses the EnhancedLCA module in a U-shaped architecture and introduces two key losses: a Fourier-domain KL-Divergence loss $L_{FKL}$ that treats amplitude and phase as Gaussian distributions, and a KL-enhanced VGG perceptual loss $L_{VggKL}$. Combined with a streamlined single-branch network, these losses achieve state-of-the-art performance on LOLv1/v2 and LSRW-Huawei while keeping the model efficient. Together, they improve global frequency consistency and structural fidelity in enhanced images, which benefits real-world low-light imaging tasks.

Abstract

In the Fourier domain, luminance information is primarily encoded in the amplitude spectrum, while spatial structures are captured in the phase components. Conventional approaches fit Fourier-domain information with pixel-wise loss functions, which tend to focus excessively on local information and may lose global information. In this paper, we present LLFDisc, a U-shaped deep enhancement network that integrates cross-attention and gating mechanisms tailored for frequency-aware enhancement. We propose a novel distribution-aware loss that fits the Fourier-domain information of the prediction and the ground truth as distributions and minimizes the divergence between them using a closed-form KL-Divergence objective. This enables the model to align Fourier-domain information more robustly than with conventional MSE-based losses. Furthermore, we enhance the VGG-based perceptual loss by embedding KL-Divergence on the extracted deep features, enabling better structural fidelity. Extensive experiments across multiple benchmarks demonstrate that LLFDisc achieves state-of-the-art performance in both qualitative and quantitative evaluations. Our code will be released at: https://github.com/YanXY000/LLFDisc
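The closed-form objective mentioned above can be illustrated with the standard KL-Divergence between two univariate Gaussians, $\mathrm{KL}(P \| Q) = \log\frac{\sigma_q}{\sigma_p} + \frac{\sigma_p^2 + (\mu_p - \mu_q)^2}{2\sigma_q^2} - \frac{1}{2}$. The following is a minimal NumPy sketch, not the authors' implementation. It assumes that one Gaussian is fitted to the amplitude and one to the phase of each image from their global mean and variance, that the divergence is taken from prediction to ground truth, and that the two terms are summed with equal weight. The function names `gaussian_kl` and `fourier_kl_loss` and the `eps` stabilizer are hypothetical.

```python
import numpy as np


def gaussian_kl(mu_p, var_p, mu_q, var_q, eps=1e-8):
    """Closed-form KL(N(mu_p, var_p) || N(mu_q, var_q)) for scalar Gaussians."""
    # eps is an assumed stabilizer that keeps the variances strictly positive.
    var_p = var_p + eps
    var_q = var_q + eps
    return 0.5 * (np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)


def fourier_kl_loss(pred, target):
    """Illustrative Fourier-domain KL loss on amplitude and phase statistics.

    pred and target are arrays of shape (H, W) or (H, W, C). The direction
    KL(pred || target) and the equal weighting are assumptions of this sketch.
    """
    f_pred = np.fft.fft2(pred, axes=(0, 1))
    f_target = np.fft.fft2(target, axes=(0, 1))
    loss = 0.0
    for component in (np.abs, np.angle):  # amplitude, then phase
        p = component(f_pred)
        t = component(f_target)
        loss += gaussian_kl(p.mean(), p.var(), t.mean(), t.var())
    return float(loss)
```

Under these assumptions the loss is zero when the prediction matches the target and grows as the amplitude or phase statistics drift apart, for example when the prediction is a uniformly darkened copy of the target.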

Paper Structure

This paper contains 16 sections, 19 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: The Fast Fourier Transform (FFT) is applied to a low-light image and a normal-light image to obtain their amplitude and phase in the frequency domain. Each image keeps its own phase while the amplitudes are exchanged between the two, and an inverse Fast Fourier Transform (iFFT) then reconstructs the spatial-domain images, which are visualized to show the effect.
  • Figure 2: The Fast Fourier Transform (FFT) is applied to the images to obtain their amplitude and phase, and the means ($\mu$) and variances ($\sigma^2$) of each are calculated. These statistics are used to model the amplitude and phase as Gaussian distributions. Finally, the KL-Divergence between the Gaussian distributions of the predicted and ground-truth images' amplitude and phase is computed and returned as the loss value.
  • Figure 3: The LLFDisc network, with the upper part depicting the overall structure and the lower part detailing the individual modules. LLFDisc enhances low-light images through a series of convolutional, activation, downsampling, and upsampling operations, complemented by the Enhanced Lighten Cross Attention (EnhancedLCA) module. Together, these components improve the brightness and detail of low-light images.
  • Figure 4: Visual comparison of enhanced images produced by different state-of-the-art (SOTA) methods on LOLv2-Real (top row) and LOLv2-Synthetic (bottom row).
  • Figure 5: Qualitative comparison results on the LIME, VV, DICM, NPE, and MEF datasets. We selected one image from each dataset to compare our method with other models.
  • ...and 2 more figures
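
The amplitude-exchange experiment described for Figure 1 can be reproduced with a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' code: both images are assumed to be floating-point arrays of identical shape, either (H, W) or (H, W, C), and the helper name `swap_amplitude` is hypothetical.

```python
import numpy as np


def swap_amplitude(img_a, img_b):
    """Exchange the Fourier amplitudes of two same-shaped images.

    Returns (a_swapped, b_swapped): each output keeps the phase of its own
    input but takes the amplitude of the other input.
    """
    fa = np.fft.fft2(img_a, axes=(0, 1))
    fb = np.fft.fft2(img_b, axes=(0, 1))
    amp_a, phase_a = np.abs(fa), np.angle(fa)
    amp_b, phase_b = np.abs(fb), np.angle(fb)
    # Recombine amplitude and phase, then return to the spatial domain.
    # The imaginary part is only numerical noise for real inputs, so drop it.
    a_swapped = np.fft.ifft2(amp_b * np.exp(1j * phase_a), axes=(0, 1)).real
    b_swapped = np.fft.ifft2(amp_a * np.exp(1j * phase_b), axes=(0, 1)).real
    return a_swapped, b_swapped
```

Because the zero-frequency amplitude equals the sum of all pixel values, the low-light image reconstructed with the normal-light amplitude takes on the mean brightness of the normal-light image. This is consistent with the observation that luminance is carried mainly by the amplitude spectrum.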