Fourier Boundary Features Network with Wider Catchers for Glass Segmentation

Xiaolin Qin, Jiacen Liu, Qianlei Wang, Shaolin Zhang, Fei Zhu, Zhang Yi

TL;DR

The paper tackles glass segmentation, a task in which boundary delineation is confounded by transmission and reflection. It introduces FBWC, a shallow-wide network that uses Wider Coarse-Catchers (WCC) to anchor large glass areas, Cross Transpose Attention (CTA) to refine fine-grained boundaries, and a learnable Fourier Convolution Controller (FCC) to fuse heterogeneous features with FFT-based frequency cues. Across three public datasets, FBWC achieves state-of-the-art performance. Ablations show that four Capturing Units (CUs) are optimal, that boundary-aware losses play a critical role, and that CTA and FCC are effective in mitigating reflection noise. The work demonstrates robustness to night and non-ideal conditions while maintaining reasonable model complexity, and it points to future work on more diverse boundary data and semi-supervised learning.

Abstract

Glass largely blurs the boundary between the real world and its reflection. Its distinctive transmittance and reflectance confuse semantic tasks in machine vision. Clarifying the boundary formed by glass, and avoiding the over-capture of features as false positives in deep structures, is therefore important for constraining the segmentation of reflective surfaces and transparent glass. We propose the Fourier Boundary Features Network with Wider Catchers (FBWC), which might be the first attempt to use sufficiently wide, horizontal shallow branches without vertical deepening to guide fine-grained boundary segmentation through primary glass semantic information. Specifically, we design Wider Coarse-Catchers (WCC) to anchor large-area segmentation and to reduce excessive extraction from a structural perspective. We embed fine-grained features with Cross Transpose Attention (CTA), which is introduced to avoid incomplete regions within the boundary caused by reflection noise. To excavate glass features and balance the context of high-level and low-level layers, we propose a learnable Fourier Convolution Controller (FCC) that regulates information integration robustly. The method is validated on three public glass segmentation datasets. Experimental results show that it yields better segmentation performance than state-of-the-art (SOTA) methods in glass image segmentation.
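
As a rough illustration of what frequency-domain feature fusion can look like, the NumPy sketch below blends two feature maps frequency by frequency. It is not the paper's FCC. The function name `fourier_fuse`, the `gate` array (a fixed stand-in for weights that would be learned), and the convex-blend rule are all assumptions made for this example.

```python
import numpy as np


def fourier_fuse(shallow, deep, gate):
    """Blend two (C, H, W) feature maps in the frequency domain.

    `gate` has shape (C, H, W // 2 + 1) with values in [0, 1]. It is a
    hypothetical stand-in for learnable per-frequency weights.
    """
    # Real 2-D FFT over the two spatial axes of each channel.
    f_shallow = np.fft.rfft2(shallow, axes=(-2, -1))
    f_deep = np.fft.rfft2(deep, axes=(-2, -1))
    # Per-frequency convex combination of the two spectra (assumed rule).
    f_mix = gate * f_shallow + (1.0 - gate) * f_deep
    # Inverse FFT back to the original spatial size.
    return np.fft.irfft2(f_mix, s=shallow.shape[-2:], axes=(-2, -1))


# Usage with random stand-in features: a constant 0.5 gate averages the inputs.
rng = np.random.default_rng(0)
shallow = rng.standard_normal((4, 8, 8))
deep = rng.standard_normal((4, 8, 8))
gate = np.full((4, 8, 8 // 2 + 1), 0.5)
fused = fourier_fuse(shallow, deep, gate)
```

A gate that varies with frequency would let such a fusion favor one input at low frequencies (large regions) and the other at high frequencies (edges). Whether FCC works this way is not stated in the abstract.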

Paper Structure

This paper contains 22 sections, 43 equations, 14 figures, and 8 tables.

Figures (14)

  • Figure 1: Visual examples of glass segmentation. Compared with SOTA methods, the proposed network locates the segmentation boundary accurately through a shallow feature-capture framework and boundary constraints, while restricting the segmented region to within the boundary to avoid over-capturing and maintain regional consistency.
  • Figure 2: Architecture of the proposed FBWC. Given an input glass image, two branches receive the same image and transform it into different features. One set of features is fed into the WCC to alleviate over-capturing and the scarcity of boundary constraints. The other is processed by the CTA to supplement and focus on fine-grained information. The FCC acts as a learnable feature-fusion regulator that balances the heterogeneous inputs. Finally, a fine-grained weighting factor regulates the shallow, large-area segmentation result, supplemented by boundary-information constraints, to dynamically optimize the network's predictions. The number next to each module denotes its parameter. Heat maps are exported to show more intuitively where the model focuses.
  • Figure 3: Details of the Capturing Unit in the WCC, which is designed to avoid over-capturing and provide boundary constraints.
  • Figure 4: Illustration of Cross Transpose Attention (CTA) blocks.
  • Figure 5: Illustration of the FCC, the core component of FBWC, which provides a flexible solution to the feature-fusion problem. Heat maps are exported to show more intuitively where the model focuses.
  • ...and 9 more figures
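
The captions above name Cross Transpose Attention without giving its formula. One common reading of "transposed" attention is attention computed across channels rather than across spatial positions, so the attention map is C x C instead of HW x HW. The sketch below shows that idea in a cross-branch form, with queries from one feature map and keys and values from another. The function name `cross_transposed_attention`, the L2 normalization, and the absence of learned projections are assumptions for illustration, not the paper's CTA block.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_transposed_attention(query_feat, context_feat):
    """Channel-wise cross-attention between two (C, H, W) feature maps.

    Queries come from `query_feat`; keys and values come from
    `context_feat`. Learned projections are omitted in this sketch.
    """
    c, h, w = query_feat.shape
    q = query_feat.reshape(c, h * w)
    k = context_feat.reshape(c, h * w)
    v = context_feat.reshape(c, h * w)
    # L2-normalize each channel so the dot products are cosine similarities.
    q = q / (np.linalg.norm(q, axis=1, keepdims=True) + 1e-8)
    k = k / (np.linalg.norm(k, axis=1, keepdims=True) + 1e-8)
    # (C, C) attention map: each query channel weighs the context channels.
    attn = softmax(q @ k.T, axis=-1)
    # Each output channel is a convex combination of the value channels.
    return (attn @ v).reshape(c, h, w)


# Usage with random stand-in features.
rng = np.random.default_rng(1)
coarse = rng.standard_normal((8, 16, 16))
fine = rng.standard_normal((8, 16, 16))
refined = cross_transposed_attention(coarse, fine)
```

Because the attention map is C x C, its size is independent of image resolution, which is the usual motivation for the transposed form.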