Adaptive Language-Aware Image Reflection Removal Network

Siyan Fang; Yuntao Wang; Jinpu Zhang; Ziwen Li; Yuehuan Wang

TL;DR

The Adaptive Language-Aware Network (ALANet) removes reflections even when its language inputs are inaccurate, and experiments show that it surpasses state-of-the-art methods for image reflection removal.

Abstract

Existing image reflection removal methods struggle to handle complex reflections. Accurate language descriptions can help the model understand the image content to remove complex reflections. However, due to blurred and distorted interferences in reflected images, machine-generated language descriptions of the image content are often inaccurate, which harms the performance of language-guided reflection removal. To address this, we propose the Adaptive Language-Aware Network (ALANet) to remove reflections even with inaccurate language inputs. Specifically, ALANet integrates both filtering and optimization strategies. The filtering strategy reduces the negative effects of language while preserving its benefits, whereas the optimization strategy enhances the alignment between language and visual features. ALANet also utilizes language cues to decouple specific layer content from feature maps, improving its ability to handle complex reflections. To evaluate the model's performance under complex reflections and varying levels of language accuracy, we introduce the Complex Reflection and Language Accuracy Variance (CRLAV) dataset. Experimental results demonstrate that ALANet surpasses state-of-the-art methods for image reflection removal. The code and dataset are available at https://github.com/fashyon/ALANet.
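
The filtering idea described above (keep the benefit of accurate language, suppress the harm of inaccurate language) can be illustrated with a toy gate. This is only a minimal sketch under stated assumptions, not ALANet's actual mechanism: the abstract does not specify how filtering is computed, so the cosine-similarity gate, the names `language_gate` and `fuse`, and the `sharpness` parameter are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def language_gate(visual, language, sharpness=5.0):
    """Hypothetical gate in (0, 1): close to 1 when the language embedding
    agrees with the visual feature, close to 0 when it contradicts it."""
    return 1.0 / (1.0 + math.exp(-sharpness * cosine(visual, language)))


def fuse(visual, language, sharpness=5.0):
    """Add the language feature to the visual feature, scaled by the gate,
    so that poorly aligned language contributes little."""
    g = language_gate(visual, language, sharpness)
    return [v + g * l for v, l in zip(visual, language)]


# Illustrative vectors only; real features would come from trained encoders.
visual = [1.0, 0.0, 0.5]
accurate = [0.9, 0.1, 0.4]       # roughly aligned with the visual feature
inaccurate = [-0.8, 0.3, -0.6]   # points away from the visual feature

print(language_gate(visual, accurate))
print(language_gate(visual, inaccurate))
```

In this toy setting the gate stays high for the aligned description and drops toward zero for the contradictory one, which is the qualitative behavior the abstract attributes to the filtering strategy.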

Paper Structure

This paper contains 28 sections, 1 equation, 23 figures, and 10 tables.

Introduction
Related Work
Image Reflection Removal
Applications of Language in Image Processing
Proposed Method
Adaptive Language-Aware Network
Language-Aware Competition Attention Module
Adaptive Language Calibration Module
Language-Guided Spatial-Channel Cross Transformer
Complex Reflection and Language Accuracy Variance Dataset
Experiments
Implementation Details
Dataset and Evaluation Metrics
Comparison Results
Ablation Studies
...and 13 more sections

Figures (23)

Figure 1: The impact of language-guided reflection removal with different types of language inputs. Inaccurate language inputs result in worse outcomes than having no language. The specific language inputs for each subfigure are provided in the supplementary material.
Figure 2: Overview of the proposed ALANet, which comprises various modules that use language adaptively to remove reflections. T and R represent the transmission and reflection layers, respectively.
Figure 3: Structure of the LASB. As the core of LASB, LCAM utilizes language features from different layers to facilitate the separation of those layers.
Figure 4: Structure of the LCAM. The dashed lines indicate the scenario without language input.
Figure 5: Structure of the ALCM. The ALCM enhances the consistency between language features and visual content.
...and 18 more figures
