GaussianLens: Localized High-Resolution Reconstruction via On-Demand Gaussian Densification

Yijia Weng; Zhicheng Wang; Songyou Peng; Saining Xie; Howard Zhou; Leonidas J. Guibas

TL;DR

This work tackles the challenge of reconstructing high-frequency details only where needed by formulating localized high-resolution reconstruction via on-demand Gaussian densification. The authors introduce GaussianLens, a cross-modal framework that densifies an initial low-resolution 3D Gaussian Splatting (3DGS) reconstruction within a user-specified RoI by fusing multi-view images with Gaussian features through a PointTransformer-based encoder and projection-based cross-attention, producing densified Gaussians as residuals. To handle substantial resolution increases, they add a pixel-guided densification pathway that spawns a Gaussian per RoI pixel, enabling faithful preservation of fine details. They validate on RealEstate10K and DL3DV, showing improved RoI detail, strong generalization to unseen Gaussian sources, and favorable efficiency compared with full high-resolution baselines, supported by ablations and a dedicated RoI view-synthesis benchmark.

Abstract

We perceive our surroundings with an active focus, paying more attention to regions of interest, such as the shelf labels in a grocery store. For scene reconstruction, this trait of human perception calls for spatially varying levels of detail, with critical regions ready for closer inspection and preferably reconstructed on demand. While recent works in 3D Gaussian Splatting (3DGS) achieve fast, generalizable reconstruction from sparse views, their uniform-resolution output incurs computational costs that do not scale to high-resolution training. As a result, they cannot leverage available images at their original high resolution to reconstruct details. Per-scene optimization methods reconstruct finer details with adaptive density control, yet require dense observations and lengthy offline optimization. To bridge the gap between the prohibitive cost of high-resolution holistic reconstruction and users' need for localized fine details, we propose the problem of localized high-resolution reconstruction via on-demand Gaussian densification. Given a low-resolution 3DGS reconstruction, the goal is to learn a generalizable network that densifies the initial 3DGS to capture fine details in a user-specified local region of interest (RoI), based on sparse high-resolution observations of the RoI. This formulation avoids the high cost and redundancy of uniformly high-resolution reconstruction and fully leverages high-resolution captures in critical regions. We propose GaussianLens, a feed-forward densification framework that fuses multi-modal information from the initial 3DGS and multi-view images. We further design a pixel-guided densification mechanism that effectively captures details under large resolution increases. Experiments demonstrate our method's superior performance in local fine-detail reconstruction and strong scalability to images of up to $1024\times1024$ resolution.
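
To make the idea of spawning one Gaussian per RoI pixel concrete, the following is a minimal sketch of how candidate Gaussian centers could be placed, one per pixel inside a RoI mask. It is not the authors' implementation: the use of a per-pixel depth map, a pinhole camera model, and the names `spawn_gaussian_centers`, `depth`, `K`, `cam_to_world`, and `roi_mask` are all assumptions made for illustration. Only positions are covered; the remaining Gaussian attributes (scale, rotation, opacity, color) are left out.

```python
import numpy as np


def spawn_gaussian_centers(depth, K, cam_to_world, roi_mask):
    """Unproject every RoI pixel to one 3D point (a candidate Gaussian center).

    Hypothetical helper, assuming a pinhole camera and a known depth map.

    depth:        (H, W) depth along the camera z-axis
    K:            (3, 3) pinhole intrinsics
    cam_to_world: (4, 4) camera-to-world transform
    roi_mask:     (H, W) boolean mask of the region of interest
    returns:      (N, 3) world-space points, one per RoI pixel
    """
    # Row (v) and column (u) indices of RoI pixels, in row-major order.
    v, u = np.nonzero(roi_mask)
    z = depth[v, u]

    # Homogeneous pixel coordinates, sampled at pixel centers (+0.5).
    pix = np.stack([u + 0.5, v + 0.5, np.ones_like(z)], axis=0)  # (3, N)

    # Back-project to camera space: rays have z = 1, so scaling by depth
    # yields points whose z-coordinate equals the depth value.
    pts_cam = (np.linalg.inv(K) @ pix) * z  # (3, N)

    # Transform to world space with homogeneous coordinates.
    pts_h = np.vstack([pts_cam, np.ones((1, pts_cam.shape[1]))])  # (4, N)
    return (cam_to_world @ pts_h)[:3].T  # (N, 3)
```

Under these assumptions, the number of spawned centers equals the number of RoI pixels, so the Gaussian budget grows with the RoI area at the target resolution rather than with the full image.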
