Language-guided Image Reflection Separation
Haofeng Zhong, Yuchen Hong, Shuchen Weng, Jinxiu Liang, Boxin Shi
TL;DR
This work tackles the ill-posed problem of reflection separation by introducing language guidance to provide semantic priors. It presents a cross-modal framework with adaptive global aggregation (AGAM) and interaction (AGIM) modules, gated language guidance, and a randomized training strategy to handle recognizable layer ambiguity, supported by contrastive and layer-correspondence losses. A dataset with synthetic and real image-language pairs is built to train and evaluate the approach, including a new RefOL real-world set. Empirical results on real data show state-of-the-art PSNR/SSIM and improved qualitative separation, demonstrating the practical potential of language-guided reflection separation for single-image scenarios.
Abstract
This paper studies the problem of language-guided reflection separation, which aims to address the ill-posed reflection separation problem by using language descriptions to convey the content of each layer. We propose a unified framework that uses cross-attention together with contrastive learning strategies to establish the correspondence between language descriptions and image layers. A gated network design and a randomized training strategy are employed to tackle the recognizable layer ambiguity. The effectiveness of the proposed method is validated by its significant performance advantage over existing reflection separation methods in both quantitative and qualitative comparisons.
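To make the idea of gated, cross-attention-based language guidance more concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes image features and language token features share one embedding dimension, and it uses a sigmoid gate to control how much attended language information is added back to each image feature. The function name `gated_cross_attention` and the weight matrices `w_q`, `w_k`, `w_v`, `w_g` are hypothetical and introduced only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def gated_cross_attention(img_feats, txt_feats, w_q, w_k, w_v, w_g):
    """Illustrative gated cross-attention (assumed design, not from the paper).

    img_feats: (N, d) image patch features, used as queries.
    txt_feats: (T, d) language token features, used as keys and values.
    w_q, w_k, w_v: (d, d) projection matrices.
    w_g: (2d, d) gate projection matrix.
    """
    q = img_feats @ w_q
    k = txt_feats @ w_k
    v = txt_feats @ w_v

    # Each image location attends over all language tokens.
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # (N, T)
    attended = attn @ v                                      # (N, d)

    # Sigmoid gate computed from the image features and the attended
    # language features; it scales the language contribution per feature.
    gate_in = np.concatenate([img_feats, attended], axis=-1)  # (N, 2d)
    gate = 1.0 / (1.0 + np.exp(-(gate_in @ w_g)))             # (N, d)

    # Residual update: the image features plus gated language guidance.
    return img_feats + gate * attended, attn, gate
```

In this sketch, a gate value near zero leaves the image feature almost unchanged, which is one simple way a network could suppress guidance when a description does not clearly match a layer. How the paper's gated design and randomized training strategy actually resolve the recognizable layer ambiguity is not specified in the abstract.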
