Exploring Reliable Matching with Phase Enhancement for Night-time Semantic Segmentation
Yuwen Pan, Rui Sun, Naisong Luo, Tianzhu Zhang, Yongdong Zhang
TL;DR
This paper tackles night-time semantic segmentation without forcing night images into day-time distributions. It introduces NightFormer, a two-branch architecture. The first branch is a pixel-level texture enhancement module that leverages Fourier phase information and a hierarchical amplified decoder. The second is an object-level reliable matching module that uses learnable prototypes and reliable attention to bridge prototypes and pixels. The method reports state-of-the-art performance on NightCity, NightCity-fine, Cityscapes, and BDD100K-night, with improvements in regions of degraded texture and low contrast. By reducing texture loss and the mis-segmentation caused by deceptive low-light cues, the work advances end-to-end night-specific perception, with practical implications for autonomous driving and night-vision systems.
Abstract
Semantic segmentation of night-time images is important in computer vision, particularly for applications such as night environment perception in autonomous driving systems. However, existing methods tend to parse night-time images from a day-time perspective, leaving the inherent challenges of low-light conditions (such as compromised texture and deceptive matching errors) unexplored. To address these issues, we propose a novel end-to-end optimized approach, named NightFormer, tailored for night-time semantic segmentation, which avoids the conventional practice of forcibly fitting night-time images into day-time distributions. Specifically, we design a pixel-level texture enhancement module that acquires texture-aware features hierarchically with phase enhancement and amplified attention, and an object-level reliable matching module that achieves accurate association matching via reliable attention in low-light environments. Extensive experimental results on various challenging benchmarks, including NightCity, BDD and Cityscapes, demonstrate that our proposed method performs favorably against state-of-the-art night-time semantic segmentation methods.
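As background for the phase enhancement mentioned above, the following is a minimal sketch of the standard Fourier amplitude/phase decomposition of an image. It is not NightFormer's texture enhancement module, whose details are not given here. The function names (`split_amplitude_phase`, `recombine`) and the toy random image are assumptions made for illustration. The sketch shows one commonly cited motivation for working with phase: uniformly darkening an image rescales the amplitude spectrum but leaves the phase spectrum unchanged.

```python
import numpy as np


def split_amplitude_phase(image):
    """Return the amplitude and phase of the 2D Fourier transform of `image`."""
    spectrum = np.fft.fft2(image)
    return np.abs(spectrum), np.angle(spectrum)


def recombine(amplitude, phase):
    """Rebuild a real-valued image from an amplitude and a phase spectrum."""
    spectrum = amplitude * np.exp(1j * phase)
    return np.real(np.fft.ifft2(spectrum))


# Toy stand-in for a grayscale image (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
image = rng.random((32, 32))

# Crude model of low light: a uniform brightness drop.
dark = 0.1 * image

amp, phase = split_amplitude_phase(image)
amp_dark, phase_dark = split_amplitude_phase(dark)

# Amplitude follows the brightness change; phase does not.
amplitude_scaled = np.allclose(amp_dark, 0.1 * amp)
phase_preserved = np.allclose(np.exp(1j * phase_dark), np.exp(1j * phase))

# Pairing the dark image's phase with the bright image's amplitude
# recovers the bright image.
restored = recombine(amp, phase_dark)
```

Phases are compared as unit phasors (`np.exp(1j * phase)`) rather than as raw angles, so that values near the ±π wrap-around do not produce spurious mismatches. Real night-time degradation is not a uniform scaling, so this only illustrates why phase is a plausible carrier of structure and texture.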
