FD-Vision Mamba for Endoscopic Exposure Correction
Zhuoran Zheng, Jun Zhang
TL;DR
This work tackles exposure correction in endoscopic imaging by introducing FDVM-Net, a frequency-domain network that reconstructs images from phase $P$ and amplitude $A$ through a dual-path architecture built from Convolution-augmented State Space Model blocks (C-SSM) and frequency-domain cross-attention. The method downscales internal representations for the SSM to maintain efficiency, processes phase and amplitude in separate branches, and fuses them before an inverse Fourier transform yields the corrected image, trained with an $L_1$ loss. Extensive experiments on a synthetic E-kvasri dataset and real images show that FDVM-Net achieves state-of-the-art PSNR/SSIM and practical speed-accuracy trade-offs, validating its effectiveness and generalization to arbitrary resolutions. The findings suggest FDVM-Net as a viable backbone for advanced medical image enhancement and potential extensions to other restoration tasks, with code available online.
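The frequency-domain formulation summarized above can be written compactly as follows. This is a sketch of the standard amplitude/phase decomposition, not notation taken from the paper: only $A$, $P$, and the $L_1$ loss come from the text, while $x$ (input image), $y$ (reference image), $\hat{y}$ (network output), $\hat{A}$ and $\hat{P}$ (branch outputs after fusion), and $\mathcal{F}$ (2D Fourier transform) are symbols assumed here for illustration.

$$
\mathcal{F}(x) = A \, e^{iP}, \qquad A = |\mathcal{F}(x)|, \qquad P = \arg \mathcal{F}(x),
$$

$$
\hat{y} = \mathcal{F}^{-1}\!\left( \hat{A} \, e^{i\hat{P}} \right), \qquad \mathcal{L}_1 = \lVert \hat{y} - y \rVert_1 .
$$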
Abstract
In endoscopic imaging, recorded images are prone to exposure abnormalities, so maintaining image quality is important for supporting healthcare professionals in their decision-making. To address this issue, we design a frequency-domain network, called FD-Vision Mamba (FDVM-Net), which achieves high-quality exposure correction by reconstructing the frequency-domain representation of endoscopic images. Specifically, inspired by State Space Sequence Models (SSMs), we develop a C-SSM block that combines the local feature extraction ability of convolutional layers with the ability of the SSM to capture long-range dependencies. A two-path network is built using the C-SSM as its basic functional unit, and the two paths process the phase and amplitude information of the image, respectively. Finally, FDVM-Net reconstructs a degraded endoscopic image into a high-quality, clear image. Extensive experimental results demonstrate that our method achieves state-of-the-art results in terms of speed and accuracy; notably, it can enhance endoscopic images of arbitrary resolution. The code is available at https://github.com/zzr-idam/FDVM-Net.
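To make the phase and amplitude split concrete, the following minimal NumPy sketch decomposes an image into amplitude and phase, rebuilds it with an inverse Fourier transform, and checks that a uniform brightness gain changes only the amplitude. It is an illustration under stated assumptions, not the FDVM-Net implementation: the helper names `to_amplitude_phase` and `from_amplitude_phase` are hypothetical, the random array stands in for an endoscopic frame, and the uniform gain is a simplified stand-in for a real exposure error.

```python
import numpy as np


def to_amplitude_phase(image):
    """Split an H x W x C image into per-channel amplitude and phase spectra."""
    spectrum = np.fft.fft2(image, axes=(0, 1))  # 2D FFT over the spatial axes
    return np.abs(spectrum), np.angle(spectrum)


def from_amplitude_phase(amplitude, phase):
    """Rebuild a real-valued image from amplitude and phase spectra."""
    spectrum = amplitude * np.exp(1j * phase)
    return np.fft.ifft2(spectrum, axes=(0, 1)).real


# Hypothetical stand-in for an endoscopic frame; any spatial size works.
rng = np.random.default_rng(0)
image = rng.random((64, 48, 3))

amplitude, phase = to_amplitude_phase(image)
restored = from_amplitude_phase(amplitude, phase)

# Mean absolute (L1) error of the round trip; close to zero up to float error.
l1_error = np.mean(np.abs(restored - image))

# A uniform gain (a crude model of under-exposure) rescales the amplitude
# and leaves the phase unchanged, because the Fourier transform is linear.
dark_amplitude, dark_phase = to_amplitude_phase(0.25 * image)
```

In this simplified setting the exposure change lives entirely in the amplitude spectrum, which gives some intuition for treating amplitude and phase in separate paths. Real exposure errors are generally non-uniform and nonlinear, so both components can be affected.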
