Image-Plane Geometric Decoding for View-Invariant Indoor Scene Reconstruction

Mingyang Li; Yimeng Fan; Changsong Liu; Lixue Xu; Xin Wang; Yanyan Liu; Wei Zhang

TL;DR

IPDRecon is an image-plane decoding framework that exploits intra-view geometric priors to reduce reliance on multi-view back-projection for indoor scene reconstruction. It combines a Pixel-level Confidence Encoder, an Affine Compensation Module, and an Image-Plane Spatial Decoder to decode distance, position, and affine-invariant geometric features from single views, and fuses them with multi-view constraints via a state-space, geometry-aware cost volume. In experiments on ScanNet V2 and in cross-domain tests, IPDRecon remains stable under sparse views (coefficient of variation, CV = 0.24%; performance retention rate, PRR = 99.7%; maximum performance drop = 0.42%) and reports strong 3D reconstruction metrics (Precision 0.797, F-score 0.722). The results indicate that intra-view geometric information can substantially improve view-invariant indoor reconstruction in view-limited practical scenarios.

Abstract

Volume-based indoor scene reconstruction methods offer superior generalization capability and real-time deployment potential. However, existing methods rely on the ray intersections of multi-view pixel back-projections as weak geometric constraints to determine spatial positions. As a result, reconstruction quality depends heavily on input view density, and performance degrades in overlapping regions and unobserved areas. To address these limitations, we reduce the dependency on inter-view geometric constraints by exploiting spatial information within individual views. We propose an image-plane decoding framework with three core components: a Pixel-level Confidence Encoder, an Affine Compensation Module, and an Image-Plane Spatial Decoder. These modules decode the three-dimensional structural information that the physical imaging process encodes in images. The framework effectively preserves spatial geometric features, including edges, hollow structures, and complex textures, and significantly enhances view-invariant reconstruction. Experiments on indoor scene reconstruction datasets confirm superior reconstruction stability: our method maintains nearly identical quality when the view count is reduced by 40%, achieving a coefficient of variation of 0.24%, a performance retention rate of 99.7%, and a maximum performance drop of 0.42%. These results demonstrate that exploiting intra-view spatial information provides a robust solution for view-limited scenarios in practical applications.
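
The three stability figures quoted above can be illustrated with a minimal sketch. The abstract does not give their formulas, so the definitions below are assumptions rather than the authors' exact protocol: the coefficient of variation is taken as population standard deviation over mean, the performance retention rate as the sparsest-view score over the densest-view score, and the maximum drop as the gap between best and worst score relative to the best. The function name `stability_metrics` and the sample scores are hypothetical.

```python
import statistics


def stability_metrics(scores):
    """Summarize how stable a quality metric is across view settings.

    `scores` holds one metric value (e.g. F-score) per view setting,
    ordered from the densest to the sparsest input.
    All three definitions are assumptions, not taken from the paper.
    """
    if len(scores) < 2:
        raise ValueError("need at least two view settings")

    mean = statistics.fmean(scores)
    best, worst = max(scores), min(scores)

    # Coefficient of variation: population std / mean, in percent.
    cv = statistics.pstdev(scores) / mean * 100.0
    # Retention (assumed): sparsest-view score relative to densest-view score.
    prr = scores[-1] / scores[0] * 100.0
    # Max drop (assumed): best-to-worst gap relative to the best score.
    max_drop = (best - worst) / best * 100.0

    return {"cv_pct": cv, "prr_pct": prr, "max_drop_pct": max_drop}


# Hypothetical F-scores for progressively sparser inputs (not from the paper).
example = stability_metrics([0.722, 0.723, 0.721, 0.720])
print(example)
```

Under these assumed definitions, a method whose score barely moves as views are removed yields a CV and maximum drop near zero and a retention rate near 100%, which is the pattern the abstract reports.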
