See through the Dark: Learning Illumination-affined Representations for Nighttime Occupancy Prediction
Yuan Wu, Zhiqiang Yan, Yigong Zhang, Xiang Li, Jian Yang
TL;DR
LIAR addresses nighttime occupancy prediction by learning illumination-affined representations. It introduces Selective Low-light Image Enhancement (SLLIE), which enhances an image globally only when it is judged to be genuinely dark, and two illumination-aware modules, 2D Illumination-guided Sampling (2D-IGS) and 3D Illumination-driven Projection (3D-IDP), which mitigate local underexposure in 2D and overexposure in the 3D BEV context, respectively. Through Retinex-based enhancement, adaptive sampling, and illumination-weighted BEV refinement, LIAR achieves state-of-the-art results on real and synthetic nighttime datasets, with gains of up to $7.44$ mIoU in easy settings and strong gains across severities; ablations confirm the effectiveness of each component. The approach advances robust 3D scene understanding in adverse lighting, with practical impact for safer autonomous driving at night. Code and pretrained models are released for reproducibility.
Abstract
Occupancy prediction aims to estimate the 3D spatial distribution of occupied regions along with their corresponding semantic labels. Existing vision-based methods perform well on daytime benchmarks but struggle in nighttime scenarios due to limited visibility and challenging lighting conditions. To address these challenges, we propose LIAR, a novel framework that learns illumination-affined representations. LIAR first introduces Selective Low-light Image Enhancement (SLLIE), which leverages illumination priors from daytime scenes to adaptively determine whether a nighttime image is genuinely dark or sufficiently well-lit, enabling more targeted global enhancement. Building on the illumination maps generated by SLLIE, LIAR further incorporates two illumination-aware components, 2D Illumination-guided Sampling (2D-IGS) and 3D Illumination-driven Projection (3D-IDP), which tackle local underexposure and overexposure, respectively. Specifically, 2D-IGS modulates feature sampling positions according to the illumination maps, assigning larger offsets to darker regions and smaller ones to brighter regions, thereby alleviating feature degradation in underexposed areas. Subsequently, 3D-IDP enhances semantic understanding in overexposed regions by constructing illumination intensity fields and supplying refined residual queries to the BEV context refinement process. Extensive experiments on both real and synthetic datasets demonstrate the superior performance of LIAR under challenging nighttime scenarios. The source code and pretrained models are available [here](https://github.com/yanzq95/LIAR).
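To make the idea behind illumination-guided sampling concrete, the following is a minimal sketch of how sampling offsets could be scaled by an illumination map so that darker pixels receive larger offsets. It is an illustration under stated assumptions, not the implementation in the released code: the function names, the max-over-RGB illumination proxy, the linear scaling rule, and the `min_scale`/`max_scale` parameters are all hypothetical, and in LIAR the illumination maps come from SLLIE rather than from a fixed heuristic.

```python
import numpy as np


def estimate_illumination(image):
    """Hypothetical illumination proxy: per-pixel max over RGB, clipped to [0, 1].

    This stands in for the illumination maps that SLLIE produces in the paper.
    """
    image = np.asarray(image, dtype=np.float64)
    return np.clip(image.max(axis=-1), 0.0, 1.0)


def illumination_scaled_offsets(base_offsets, illumination,
                                min_scale=0.5, max_scale=2.0):
    """Scale (H, W, 2) sampling offsets so darker pixels get larger offsets.

    The linear rule below is an assumption made for illustration only:
    scale = min_scale at illumination 1 and max_scale at illumination 0.
    """
    scale = min_scale + (max_scale - min_scale) * (1.0 - illumination)
    return base_offsets * scale[..., None]


# Toy image: dark left half (0.1), bright right half (0.9).
image = np.zeros((2, 4, 3))
image[:, :2] = 0.1
image[:, 2:] = 0.9

illum = estimate_illumination(image)
base = np.ones((2, 4, 2))  # unit offsets in (dx, dy) for every pixel
offsets = illumination_scaled_offsets(base, illum)
```

In this toy case the offsets in the dark half are larger than those in the bright half, which mirrors the qualitative behaviour described for 2D-IGS: underexposed regions sample features from a wider neighbourhood, while well-lit regions stay close to their original sampling positions.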
