DGGAN: Degradation Guided Generative Adversarial Network for Real-time Endoscopic Video Enhancement
Handing Xu, Zhenguo Nie, Tairan Peng, Huimin Pan, Xin-Jun Liu
TL;DR
This work tackles the challenge of real-time endoscopic video enhancement by introducing a degradation-guided framework that explicitly models and propagates degradation representations across frames. It combines a degradation-aware module (DAM), a degradation-guided enhancement module (DGEM), and a degradation representation propagation module (DRPM) with cycle-consistency training to achieve high-quality restoration while maintaining real-time performance. Key contributions include a two-stage training strategy that leverages artificial degradations for pretraining and real unpaired data for adaptation, and a temporal degradation propagation mechanism that reduces computation without sacrificing coherence. The approach demonstrates strong performance on both synthetic degradations (SCARED) and real surgical data (SES), highlighting the practicality of degradation-aware modeling for clinical endoscopic video enhancement.
Abstract
Endoscopic surgery relies on intraoperative video, making image quality a decisive factor for surgical safety and efficacy. Yet, endoscopic videos are often degraded by uneven illumination, tissue scattering, occlusions, and motion blur, which obscure critical anatomical details and complicate surgical manipulation. Although deep learning-based methods have shown promise in image enhancement, most existing approaches remain too computationally demanding for real-time surgical use. To address this challenge, we propose a degradation-aware framework for endoscopic video enhancement, which enables real-time, high-quality enhancement by propagating degradation representations across frames. In our framework, degradation representations are first extracted from images using contrastive learning. We then introduce a fusion mechanism that modulates image features with these representations to guide a single-frame enhancement model, which is trained with a cycle-consistency constraint between degraded and restored images to improve robustness and generalization. Experiments demonstrate that our framework achieves a superior balance between performance and efficiency compared with several state-of-the-art methods. These results highlight the effectiveness of degradation-aware modeling for real-time endoscopic video enhancement. More broadly, our results suggest that implicitly learning and propagating degradation representations offers a practical pathway for clinical application.
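To make the two central ideas concrete (modulating image features with a degradation representation, and propagating that representation across frames instead of re-estimating it every frame), the following is a minimal illustrative sketch. It is not the DAM, DGEM, or DRPM described in this work: the function names, the channel-wise scaling rule, the fixed refresh interval `refresh_every`, and the moving-average update with `momentum` are all assumptions chosen for illustration, and frames are stood in for by plain per-channel feature vectors.

```python
from typing import Callable, Iterable, Iterator, List

Vector = List[float]


def modulate(features: Vector, representation: Vector) -> Vector:
    """Channel-wise modulation: scale each feature by (1 + representation).

    Hypothetical stand-in for a learned fusion mechanism.
    """
    if len(features) != len(representation):
        raise ValueError("features and representation must have equal length")
    return [f * (1.0 + r) for f, r in zip(features, representation)]


def propagate(previous: Vector, current: Vector, momentum: float = 0.9) -> Vector:
    """Blend the carried-over representation with a fresh estimate.

    An exponential moving average is an assumption, not the paper's rule.
    """
    return [momentum * p + (1.0 - momentum) * c for p, c in zip(previous, current)]


def enhance_stream(
    frames: Iterable[Vector],
    estimate_degradation: Callable[[Vector], Vector],
    refresh_every: int = 4,
    momentum: float = 0.9,
) -> Iterator[Vector]:
    """Enhance a frame stream, re-estimating degradation only periodically.

    Between refreshes the last representation is reused, so the (assumed
    expensive) estimator runs once per `refresh_every` frames.
    """
    representation = None
    for index, features in enumerate(frames):
        if representation is None:
            # First frame: no history yet, so estimate from scratch.
            representation = estimate_degradation(features)
        elif index % refresh_every == 0:
            # Periodic refresh: blend the new estimate into the carried state.
            fresh = estimate_degradation(features)
            representation = propagate(representation, fresh, momentum)
        yield modulate(features, representation)


if __name__ == "__main__":
    calls = 0

    def toy_estimator(features: Vector) -> Vector:
        # Hypothetical estimator: a constant per-channel representation.
        global calls
        calls += 1
        return [0.5 for _ in features]

    outputs = list(enhance_stream([[1.0, 2.0]] * 8, toy_estimator))
    print(len(outputs), "frames enhanced with", calls, "estimator calls")
```

In this toy setting the estimator runs only on frames 0 and 4 of an eight-frame clip, which illustrates why carrying a representation forward can reduce per-frame cost. How the actual framework decides when and how to update the representation is not specified in this abstract.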
