DeLightMono: Enhancing Self-Supervised Monocular Depth Estimation in Endoscopy by Decoupling Uneven Illumination
Mingyang Ou, Haojin Li, Yifeng Zhang, Ke Niu, Zhongxi Qiu, Heng Li, Jiang Liu
TL;DR
Endoscopic depth estimation suffers from severe errors under uneven illumination. DeLightMono introduces Illumination-Reflectance-Depth (IRD) modeling and a joint self-supervised framework in which auxiliary networks decouple illumination and reflectance from depth. Training is guided by four losses: a reconstruction loss, a ratio-based regularization, a degradation-consistency term, and a reflectance-guided photometric loss. The method achieves state-of-the-art results on SCARED and generalizes well to Hamlyn, notably improving depth accuracy in low-light regions and robustness to lighting changes. This illumination-aware approach benefits endoscopic navigation by delivering more reliable depth under challenging lighting conditions, without requiring labeled data or additional well-lit images.
Abstract
Self-supervised monocular depth estimation is a key task in the development of endoscopic navigation systems. However, its performance degrades under the uneven illumination inherent in endoscopic images, particularly in low-intensity regions. Existing low-light enhancement techniques fail to guide the depth network effectively. Furthermore, solutions from other fields, such as autonomous driving, require well-lit images, which makes them unsuitable for endoscopy and increases the data collection burden. To this end, we present DeLightMono, a novel self-supervised monocular depth estimation framework with illumination decoupling. Specifically, endoscopic images are represented by a designed illumination-reflectance-depth model and are decomposed with auxiliary networks. Moreover, a self-supervised joint-optimization framework with novel losses that leverage the decoupled components is proposed to mitigate the effects of uneven illumination on depth estimation. The effectiveness of the proposed method is verified through extensive comparisons and an ablation study on two public datasets.
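To make the idea of illumination decoupling concrete, the following is a minimal sketch, not the authors' implementation. It assumes the classic Retinex-style composition (image = illumination × reflectance, element-wise), which may differ from the illumination-reflectance-depth model actually used in the paper, and it replaces the auxiliary networks with known toy arrays. The function names `reconstruction_loss` and `l1_photometric` are hypothetical. It illustrates why a photometric error computed on reflectance is less sensitive to lighting changes than one computed on raw intensities.

```python
import numpy as np


def reconstruction_loss(image, illumination, reflectance):
    """Mean absolute error between an image and its recomposition.

    Assumes a Retinex-style model: image ~= illumination * reflectance.
    This is an illustrative assumption, not the paper's exact formulation.
    """
    return float(np.mean(np.abs(image - illumination * reflectance)))


def l1_photometric(a, b):
    """Plain L1 photometric error between two images of equal shape."""
    return float(np.mean(np.abs(a - b)))


# Toy scene: one surface reflectance observed under two lighting levels.
rng = np.random.default_rng(0)
reflectance = rng.uniform(0.2, 0.9, size=(4, 4, 3))
light_bright = np.full((4, 4, 1), 0.9)  # well-lit frame
light_dim = np.full((4, 4, 1), 0.3)     # low-light frame

img_bright = reflectance * light_bright
img_dim = reflectance * light_dim

# Photometric error on raw intensities is dominated by the lighting change.
raw_err = l1_photometric(img_bright, img_dim)

# After dividing out the (here, known) illumination, the error nearly vanishes.
refl_err = l1_photometric(img_bright / light_bright, img_dim / light_dim)

# A perfect decomposition reproduces the input image.
recon_err = reconstruction_loss(img_dim, light_dim, reflectance)

print(f"raw: {raw_err:.4f}  reflectance: {refl_err:.2e}  recon: {recon_err:.2e}")
```

In a real pipeline the illumination and reflectance maps would be predicted rather than known, and the frames would first be warped into a common view using the estimated depth and camera pose before any photometric comparison.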
