AlphaTablets: A Generic Plane Representation for 3D Planar Reconstruction from Monocular Videos

Yuze He; Wang Zhao; Shaohui Liu; Yubin Hu; Yushi Bai; Yu-Hui Wen; Yong-Jin Liu

TL;DR

The paper tackles the challenge of reconstructing complete, accurate 3D planar surfaces from monocular video. It introduces AlphaTablets, a generic 3D plane representation that encodes planes as rectangles with learnable alpha channels to capture both solid surfaces and irregular boundaries, coupled with differentiable rasterization for image formation. A bottom-up pipeline initializes many small AlphaTablets from 2D superpixels and monocular cues, then jointly optimizes geometry, texture, and alpha via rendering-based losses and a hierarchical merging scheme to form larger planes. Extensive experiments on ScanNet show state-of-the-art 3D planar reconstruction and meaningful plane-based scene editing, highlighting AlphaTablets’ potential as a versatile 3D plane representation for downstream tasks in vision and graphics.

Abstract

We introduce AlphaTablets, a novel and generic representation of 3D planes that features a continuous 3D surface and precise boundary delineation. By representing 3D planes as rectangles with alpha channels, AlphaTablets combine the advantages of current 2D and 3D plane representations, enabling accurate, consistent, and flexible modeling of 3D planes. We derive differentiable rasterization on top of AlphaTablets to efficiently render 3D planes into images, and propose a novel bottom-up pipeline for 3D planar reconstruction from monocular videos. Starting with 2D superpixels and geometric cues from pre-trained models, we initialize 3D planes as AlphaTablets and optimize them via differentiable rendering. An effective merging scheme is introduced to facilitate the growth and refinement of AlphaTablets. Through iterative optimization and merging, we reconstruct complete and accurate 3D planes with solid surfaces and clear boundaries. Extensive experiments on the ScanNet dataset demonstrate state-of-the-art performance in 3D planar reconstruction, underscoring the great potential of AlphaTablets as a generic 3D plane representation for various applications. The project page is available at https://hyzcluster.github.io/alphatablets
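
To make the idea of "rectangles with alpha channels" concrete, here is a minimal, non-differentiable sketch in plain Python. It is not the authors' implementation: the `Tablet` class, its field names, the single gray value per tablet, the nearest-neighbor alpha lookup, and the front-to-back `composite` function are all assumptions chosen for illustration. It only shows how a ray could pick up an alpha value from a 3D rectangle and how several such rectangles might be blended along that ray.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


@dataclass
class Tablet:
    """Hypothetical rectangle-with-alpha record (all names are illustrative)."""
    center: Vec3
    normal: Vec3              # unit vector
    up: Vec3                  # unit vector, perpendicular to normal
    half_w: float             # half extent along the "right" axis
    half_h: float             # half extent along the "up" axis
    alpha: List[List[float]]  # rows x cols grid, values in [0, 1]
    color: float              # one gray value per tablet, for brevity

    def hit(self, origin: Vec3, direction: Vec3) -> Optional[Tuple[float, float]]:
        """Return (ray parameter t, alpha) at the hit point, or None on a miss."""
        denom = dot(direction, self.normal)
        if abs(denom) < 1e-9:          # ray parallel to the plane
            return None
        t = dot(sub(self.center, origin), self.normal) / denom
        if t <= 0:                     # plane is behind the ray origin
            return None
        p = (origin[0] + t * direction[0],
             origin[1] + t * direction[1],
             origin[2] + t * direction[2])
        d = sub(p, self.center)
        right = cross(self.normal, self.up)
        u = dot(d, right) / self.half_w    # in [-1, 1] inside the rectangle
        v = dot(d, self.up) / self.half_h
        if abs(u) > 1 or abs(v) > 1:       # outside the rectangle
            return None
        rows, cols = len(self.alpha), len(self.alpha[0])
        col = min(int((u + 1) / 2 * cols), cols - 1)   # nearest-neighbor lookup
        row = min(int((1 - v) / 2 * rows), rows - 1)   # row 0 is the top edge
        return t, self.alpha[row][col]


def composite(tablets: List[Tablet], origin: Vec3, direction: Vec3) -> Tuple[float, float]:
    """Front-to-back alpha compositing along one ray; returns (gray, coverage)."""
    hits = []
    for tb in tablets:
        h = tb.hit(origin, direction)
        if h is not None:
            hits.append((h[0], h[1], tb.color))
    hits.sort()                            # nearest hit first
    out, transmittance = 0.0, 1.0
    for _, a, c in hits:
        out += transmittance * a * c
        transmittance *= 1.0 - a
    return out, 1.0 - transmittance


# Two camera-facing tablets: a half-transparent one in front of an opaque one.
front = Tablet((0, 0, 1), (0, 0, -1), (0, 1, 0), 1.0, 1.0, [[0.5]], 1.0)
back = Tablet((0, 0, 2), (0, 0, -1), (0, 1, 0), 1.0, 1.0, [[1.0]], 0.5)
print(composite([front, back], (0, 0, 0), (0, 0, 1)))
```

In this toy setup the alpha grid is what lets a rectangular tablet stand in for an irregularly shaped planar region: cells with alpha near zero contribute nothing to the ray. A rendering-based optimization such as the one the abstract describes would need a differentiable version of this lookup and blending, which this sketch does not attempt.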
