DEFormer: DCT-driven Enhancement Transformer for Low-light Image and Dark Vision
Xiangchen Yin, Zhenda Yu, Xin Gao, Xiao Sun
TL;DR
DEFormer addresses low-light image enhancement with a frequency-guided transformer framework. Its Learnable Frequency Branch (LFB) combines DCT-based frequency cues with curvature-based frequency enhancement (CFE), and its Cross Domain Fusion (CDF) aligns RGB features with frequency information. The approach yields state-of-the-art results on the LOL and MIT-Adobe FiveK datasets and improves downstream dark-object detection on ExDark when used in end-to-end detector training. Combining frequency-domain cues with a transformer backbone improves texture recovery in dark regions at a realistic computational cost.
Abstract
Low-light image enhancement restores the colors and details of a single image and improves high-level visual tasks. However, restoring details lost in dark areas remains challenging when relying only on the RGB domain. In this paper, we introduce frequency as a new cue for the model and propose a DCT-driven enhancement transformer (DEFormer) framework. First, we propose a learnable frequency branch (LFB) for frequency enhancement, which contains DCT processing and curvature-based frequency enhancement (CFE) to represent frequency features. Additionally, we propose a cross domain fusion (CDF) to reduce the differences between the RGB domain and the frequency domain. DEFormer achieves superior results on the LOL and MIT-Adobe FiveK datasets and improves dark detection performance.
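To make the notion of DCT-based frequency features concrete, the following is a minimal sketch of an orthonormal 2D DCT-II applied to a single image block. It is a generic illustration only: the 8x8 block size, the pure-Python implementation, and the function names `dct_1d` and `dct_2d` are assumptions made here, not the DCT processing used inside the LFB.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a 1D sequence (illustrative helper, not from the paper)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        # Orthonormal scaling: the k = 0 (DC) term uses a different factor.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """Separable 2D DCT-II: transform each row, then each column.

    Returns coeffs[u][v], where u is the vertical and v the horizontal frequency.
    """
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


# A uniformly dark block (assumed 8x8, intensities in [0, 1]).
flat = [[0.5] * 8 for _ in range(8)]
flat_coeffs = dct_2d(flat)

# A block with a vertical edge: left half dark, right half bright.
edge = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
edge_coeffs = dct_2d(edge)
```

In this sketch the flat block puts all of its energy in the DC coefficient `flat_coeffs[0][0]`, while the edge block spreads energy across the horizontal frequencies in `edge_coeffs[0]`. This separation of smooth content from texture is the kind of cue a frequency branch can exploit in dark regions.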
