From None to All: Self-Supervised 3D Reconstruction via Novel View Synthesis

Ranran Huang, Weixun Luo, Ye Mao, Krystian Mikolajczyk

Abstract

In this paper, we introduce NAS3R, a self-supervised feed-forward framework that jointly learns explicit 3D geometry and camera parameters with no ground-truth annotations and no pretrained priors. During training, NAS3R reconstructs 3D Gaussians from uncalibrated and unposed context views and renders target views using its self-predicted camera parameters, enabling self-supervised training from 2D photometric supervision. To ensure stable convergence, NAS3R integrates reconstruction and camera prediction within a shared transformer backbone regulated by masked attention, and adopts a depth-based Gaussian formulation that facilitates well-conditioned optimization. The framework is compatible with state-of-the-art supervised 3D reconstruction architectures and can incorporate pretrained priors or intrinsic information when available. Extensive experiments show that NAS3R achieves superior results to other self-supervised methods, establishing a scalable and geometry-aware paradigm for 3D reconstruction from unconstrained data. Code and models are publicly available at https://ranrhuang.github.io/nas3r/.
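The training signal described above can be summarized in a short sketch. The snippet below is illustrative only: the module names (`reconstruct_gaussians`, `predict_cameras`, `render`) and the plain MSE photometric term are assumptions standing in for NAS3R's actual components, which the abstract does not spell out.

```python
# Minimal sketch of the self-supervised training step described in the abstract.
# All model method names below are hypothetical placeholders, not NAS3R's API.
import torch.nn.functional as F

def self_supervised_step(model, context_views, target_views):
    """One training step driven purely by 2D photometric supervision (no GT labels)."""
    # Predict explicit 3D Gaussians and camera parameters from unposed,
    # uncalibrated context views.
    gaussians = model.reconstruct_gaussians(context_views)   # hypothetical
    target_cams = model.predict_cameras(target_views)        # hypothetical

    # Render the target viewpoints from the predicted Gaussians using the
    # self-predicted camera parameters.
    rendered = model.render(gaussians, target_cams)           # hypothetical

    # Compare renderings against the observed target images; a plain MSE is
    # shown here, though the paper may combine several image-space terms.
    return F.mse_loss(rendered, target_views)
```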

Paper Structure

This paper contains 16 sections, 5 equations, 10 figures, and 13 tables.

Figures (10)

  • Figure 1: NAS3R is a self-supervised framework that requires no ground-truth annotations and no pretrained priors during training. It jointly infers 3D Gaussian parameters, camera intrinsics and extrinsics, and depth maps, while also enabling high-quality novel view synthesis.
  • Figure 2: Training pipeline of NAS3R. Subscripts "$C$" and "$T$" denote context and target views, respectively. Unconstrained images are patchified into visual tokens and concatenated with a learnable camera token for camera prediction. A masked decoder regulates cross-view interactions and prevents target-to-context leakage. Refined context tokens are then processed by the Gaussian head to predict Gaussian parameters, while a depth head estimates depth maps that are lifted into 3D Gaussian centers using the predicted context poses (a sketch of this lifting step follows the figure list). The predicted target poses are finally used to render novel views, providing photometric supervision for end-to-end training.
  • Figure 3: Comparison of NVS results across different methods. The leftmost column shows the two-view context images. From top to bottom, the settings are RE10K, RE10K$\rightarrow$ACID, RE10K$\rightarrow$DTU, and RE10K$\rightarrow$DL3DV.
  • Figure 4: Visual comparison of pose trajectories on RE10K. Camera frustums for ground-truth and predicted poses are shown in black and orange, respectively. The top two examples correspond to 5-view reconstruction, while the bottom example corresponds to 10-view reconstruction.
  • Figure 5: Comparison of two-view depth estimation results on the BlendedMVS dataset.
  • ...and 5 more figures
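As referenced in the Figure 2 caption, the depth head's output is lifted into 3D Gaussian centers using the predicted context poses. The following is a minimal sketch of that unprojection step; the function name, the camera-to-world pose convention, and the half-pixel offset are assumptions, not details fixed by the captions above.

```python
# Minimal sketch of lifting a predicted depth map into 3D Gaussian centers,
# as described in the Figure 2 caption. Conventions (camera-to-world pose,
# pixel-center offset) are assumptions rather than details from the paper.
import torch

def lift_depth_to_centers(depth, K, cam_to_world):
    """Unproject an (H, W) depth map into world-space Gaussian centers.

    depth:        (H, W) predicted depth for one context view
    K:            (3, 3) predicted camera intrinsics
    cam_to_world: (4, 4) predicted context-view pose (camera-to-world)
    """
    H, W = depth.shape
    v, u = torch.meshgrid(
        torch.arange(H, dtype=depth.dtype),
        torch.arange(W, dtype=depth.dtype),
        indexing="ij",
    )
    # Back-project pixel coordinates through the inverse intrinsics and
    # scale by the predicted depth to obtain camera-frame points.
    pix = torch.stack([u + 0.5, v + 0.5, torch.ones_like(u)], dim=-1)  # (H, W, 3)
    rays = pix @ torch.linalg.inv(K).T
    pts_cam = rays * depth.unsqueeze(-1)

    # Move camera-frame points into world coordinates with the predicted pose;
    # each point becomes the center of one 3D Gaussian.
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    return pts_cam @ R.T + t  # (H, W, 3)
```

In the full pipeline, the remaining Gaussian attributes (e.g. opacity, scale, rotation, color) would come from the Gaussian head; this sketch only covers the placement of the centers.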