Re-coding for Uncertainties: Edge-awareness Semantic Concordance for Resilient Event-RGB Segmentation
Nan Bao, Yifan Zhao, Lin Zhu, Jia Li
TL;DR
This paper tackles semantic segmentation when RGB data are corrupted by extreme conditions, fusing event data with RGB through a shared edge-based latent space. It introduces Edge-awareness Semantic Concordance (ESC), which comprises Edge-awareness Latent Re-coding (ELR), Re-coded Consolidation (RC), and Uncertainty Optimization (UO) and is guided by an edge dictionary pre-trained with a VQ-VAE. The approach realigns the heterogeneous modalities into a unified edge-informed semantic space, consolidates edge cues, and optimizes fusion under per-pixel uncertainties. It achieves state-of-the-art results on synthetic and real extreme-condition datasets and remains resilient under occlusion. The authors release code and three new datasets (DERS-XS, DERS-XR, DSEC-Xtrm) for benchmarking event-RGB segmentation in challenging scenarios, with potential impact on robust perception for autonomous systems.
Abstract
Semantic segmentation has achieved great success in ideal conditions. However, under extreme conditions (e.g., insufficient light, violent camera motion), most existing methods suffer significant loss of RGB information, which severely degrades segmentation results. Several studies exploit the high-speed, high-dynamic-range event modality as a complement, but event and RGB data are naturally heterogeneous, which leads to feature-level mismatch and suboptimal optimization in existing multi-modality methods. In contrast to these studies, we exploit the edge information shared by both modalities for resilient fusion and propose a novel Edge-awareness Semantic Concordance framework that unifies the heterogeneous multi-modality features through latent edge cues. In this framework, we first propose Edge-awareness Latent Re-coding, which uses a pre-established edge dictionary as clues to transfer event-RGB distributions into re-coded features; guided by the re-coded distribution, it realigns event-RGB features into a unified semantic space and yields uncertainty indicators. We then propose Re-coded Consolidation and Uncertainty Optimization, which use the re-coded edge features and uncertainty indicators to resolve the heterogeneous event-RGB fusion issues under extreme conditions. We establish two synthetic datasets and one real-world event-RGB semantic segmentation dataset for comparisons in extreme scenarios. Experimental results show that our method outperforms the state of the art by 2.55% mIoU on our proposed DERS-XS and is more resilient under spatial occlusion. Our code and datasets are publicly available at https://github.com/iCVTEAM/ESC.
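To make the re-coding idea concrete, the following is a minimal NumPy sketch of a dictionary lookup followed by uncertainty-weighted fusion. It is not the authors' implementation: the abstract does not specify how uncertainty indicators are computed or how fusion is weighted, so the entropy-based uncertainty and the confidence-weighted average below are assumptions chosen for illustration, and the names `recode_with_dictionary` and `fuse_by_uncertainty` are hypothetical.

```python
import numpy as np


def recode_with_dictionary(features, dictionary):
    """Replace each feature with its nearest dictionary entry.

    features:   (N, D) array of per-pixel feature vectors.
    dictionary: (K, D) array of pre-established entries (K > 1).
    Returns the re-coded features (N, D), the chosen indices (N,),
    and an uncertainty score in [0, 1] per feature (N,).
    """
    # Squared Euclidean distance from every feature to every entry: (N, K).
    diff = features[:, None, :] - dictionary[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)

    indices = dist2.argmin(axis=1)
    recoded = dictionary[indices]

    # ASSUMPTION: uncertainty is the normalized entropy of a softmax over
    # negative distances. An ambiguous match gives a high score.
    logits = -dist2
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    uncertainty = np.clip(entropy / np.log(dictionary.shape[0]), 0.0, 1.0)
    return recoded, indices, uncertainty


def fuse_by_uncertainty(rgb_feat, evt_feat, rgb_unc, evt_unc, eps=1e-6):
    """ASSUMPTION: fuse two modalities with weights equal to their
    confidence (1 - uncertainty), normalized per pixel."""
    rgb_conf = 1.0 - rgb_unc + eps
    evt_conf = 1.0 - evt_unc + eps
    w_rgb = (rgb_conf / (rgb_conf + evt_conf))[:, None]
    return w_rgb * rgb_feat + (1.0 - w_rgb) * evt_feat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    edge_dictionary = rng.normal(size=(8, 4))   # toy stand-in for a learned dictionary
    rgb = rng.normal(size=(5, 4))               # toy RGB features
    evt = rng.normal(size=(5, 4))               # toy event features

    rgb_rc, _, rgb_u = recode_with_dictionary(rgb, edge_dictionary)
    evt_rc, _, evt_u = recode_with_dictionary(evt, edge_dictionary)
    fused = fuse_by_uncertainty(rgb_rc, evt_rc, rgb_u, evt_u)
    print(fused.shape)
```

In this sketch a feature that sits between two dictionary entries receives a high uncertainty score, so the other modality dominates the fused result at that pixel. The actual framework learns these components end to end; the released code at the URL above is the reference.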
