SplatBright: Generalizable Low-Light Scene Reconstruction from Sparse Views via Physically-Guided Gaussian Enhancement
Yue Wen, Liang Song, Hesheng Wang
TL;DR
This work tackles the reconstruction of coherent 3D scenes from sparse, low-light views by combining a generalizable 3D Gaussian representation with physically grounded illumination modeling. It introduces SplatBright, which comprises a dual-head geometry–appearance predictor, an Illumination Consistency Module with frequency-guided cross attention, and an Appearance Refinement Module for local texture refinement, all trained on synthetic dark–normal image pairs. The approach improves novel-view synthesis, cross-view consistency, and cross-domain generalization to unseen low-light scenes, while enabling controllable relighting. The results show gains over 2D enhancement and 3D reconstruction baselines, with a positive impact on downstream perception tasks and perceptual quality metrics.
Abstract
Low-light 3D reconstruction from sparse views remains challenging due to exposure imbalance and degraded color fidelity. Existing methods struggle with view inconsistency and require per-scene training. We propose SplatBright, which is, to our knowledge, the first generalizable 3D Gaussian framework for joint low-light enhancement and reconstruction from sparse sRGB inputs. Our key idea is to integrate physically guided illumination modeling with geometry-appearance decoupling for consistent low-light reconstruction. Specifically, we adopt a dual-branch predictor that provides stable geometric initialization of 3D Gaussian parameters. On the appearance side, an illumination consistency module leverages frequency priors to enable controllable and cross-view coherent lighting, while an appearance refinement module further separates illumination, material, and view-dependent cues to recover fine texture. To address the lack of large-scale, geometrically consistent paired data, we synthesize dark views for training via a physics-based camera model. Extensive experiments on public and self-collected datasets demonstrate that SplatBright achieves better novel view synthesis, cross-view consistency, and generalization to unseen low-light scenes than both 2D and 3D methods.
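The abstract does not spell out the physics-based camera model, so the following is only a minimal sketch of how a dark view might be synthesized from a normally exposed sRGB image. It assumes a common simplified pipeline: linearize, reduce exposure, add Poisson shot noise and Gaussian read noise, then re-encode and quantize. The function name `synthesize_dark_view` and the parameters `exposure_scale`, `full_well`, and `read_noise_std` are hypothetical and not taken from the paper.

```python
import numpy as np


def srgb_to_linear(img):
    """Invert the standard sRGB transfer curve (input in [0, 1])."""
    img = np.clip(img, 0.0, 1.0)
    return np.where(img <= 0.04045, img / 12.92, ((img + 0.055) / 1.055) ** 2.4)


def linear_to_srgb(img):
    """Apply the standard sRGB transfer curve (input in [0, 1])."""
    img = np.clip(img, 0.0, 1.0)
    return np.where(img <= 0.0031308, img * 12.92, 1.055 * img ** (1.0 / 2.4) - 0.055)


def synthesize_dark_view(img_srgb, exposure_scale=0.1, full_well=1000.0,
                         read_noise_std=2.0, bits=8, rng=None):
    """Simulate a low-light capture of a normally exposed sRGB image.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Work in linear intensity, where light scales proportionally.
    linear = srgb_to_linear(img_srgb) * exposure_scale
    # Shot noise: photo-electron counts follow a Poisson distribution.
    electrons = rng.poisson(linear * full_well).astype(np.float64)
    # Read noise: additive Gaussian noise from the sensor electronics.
    electrons += rng.normal(0.0, read_noise_std, size=electrons.shape)
    noisy = np.clip(electrons / full_well, 0.0, 1.0)
    # Re-encode to sRGB and quantize to the target bit depth.
    levels = 2 ** bits - 1
    return np.round(linear_to_srgb(noisy) * levels) / levels


if __name__ == "__main__":
    normal = np.full((4, 4, 3), 0.8)
    dark = synthesize_dark_view(normal, rng=np.random.default_rng(0))
    print(normal.mean(), dark.mean())
```

Applying the same degradation with shared settings to every view of a scene would yield dark inputs that remain geometrically aligned with their normal-light targets, which is the property the paired training data needs.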
