Flash-Split: 2D Reflection Removal with Flash Cues and Latent Diffusion Separation
Tianfu Wang, Mingyang Xie, Haoming Cai, Sachin Shah, Christopher A. Metzler
TL;DR
Glass and other transparent surfaces create reflections that degrade images. Flash-Split introduces a two-stage latent-diffusion framework that uses misaligned flash/no-flash cues to separate transmission and reflection in latent space, reducing sensitivity to flash/no-flash misalignment. Stage 1 performs recursive latent separation with a dual-branch diffusion network conditioned on a flash/no-flash latent pair, while Stage 2 uses cross-latent decoding guided by the original input to recover faithful, high-frequency details. Evaluations on real-world scenes show state-of-the-art performance and robustness to misalignment, including in scenarios without RAW input, which supports its practical applicability.
Abstract
Transparent surfaces, such as glass, create complex reflections that obscure images and challenge downstream computer vision applications. We introduce Flash-Split, a robust framework for separating transmitted and reflected light using a single (potentially misaligned) pair of flash/no-flash images. Our core idea is to perform reflection separation in latent space while leveraging the flash cues. Specifically, Flash-Split consists of two stages. Stage 1 separates the reflection latent from the transmission latent via a dual-branch diffusion model conditioned on an encoded flash/no-flash latent pair, effectively mitigating the flash/no-flash misalignment issue. Stage 2 restores high-resolution, faithful details to the separated latents via a cross-latent decoding process conditioned on the original images before separation. By validating Flash-Split on challenging real-world scenes, we demonstrate state-of-the-art reflection separation performance and significantly outperform the baseline methods.
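To make the misalignment issue concrete, the toy NumPy sketch below shows why a pixel-space flash cue is fragile. It is not part of Flash-Split. It assumes an additive linear image model (no-flash image = transmission + reflection) and assumes the flash brightens only the transmitted scene; neither assumption is stated in the abstract. The variable names and the `flash_gain` value are hypothetical.

```python
import numpy as np

# Toy scene: random transmission and reflection layers (linear intensities).
rng = np.random.default_rng(0)
H, W = 32, 32
T = rng.uniform(0.0, 0.5, size=(H, W))  # transmission layer
R = rng.uniform(0.0, 0.5, size=(H, W))  # reflection layer

# Assumption: the flash adds light only to the transmitted scene.
flash_gain = 0.8  # hypothetical extra illumination on T

no_flash = T + R
flash = (1.0 + flash_gain) * T + R

# Perfectly aligned pair: the difference cancels R exactly.
flash_only_aligned = flash - no_flash
err_aligned = np.abs(flash_only_aligned - flash_gain * T).max()

# Misaligned pair: simulate a 3-pixel horizontal shift between shots.
flash_shifted = np.roll(flash, shift=3, axis=1)
flash_only_misaligned = flash_shifted - no_flash
err_misaligned = np.abs(flash_only_misaligned - flash_gain * T).max()

print(f"aligned error:    {err_aligned:.2e}")
print(f"misaligned error: {err_misaligned:.2e}")
```

Under these assumptions, the aligned difference recovers a scaled copy of the transmission layer, while a shift of a few pixels leaves large residuals. This is the kind of alignment sensitivity that separating the layers in latent space is meant to mitigate.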
