AIM2PC: Aerial Image to 3D Building Point Cloud Reconstruction
Soulaimene Turki, Daniel Panangian, Houda Chaabouni-Chouayakh, Ksenia Bittner
TL;DR
AIM2PC tackles the challenge of reconstructing complete 3D building point clouds from a single aerial image, addressing the limitations of rooftop-only reconstructions and the scarcity of pose-equipped datasets. It introduces an edge-enhanced, diffusion-based framework conditioned on concatenated image features, a binary building mask, and Sobel edge maps, implemented via a Centered Denoising Diffusion Probabilistic Model to fuse 2D cues into a fully 3D building representation. A new dataset providing complete 3D point clouds and corresponding camera poses enables training and fair benchmarking. Quantitative results show notable improvements in F-Score and Chamfer Distance over baselines PC² and CCD-3DR, with qualitative evidence of sharper edges and more complete geometry. This approach offers a scalable, cost-effective path for urban 3D reconstruction from single-view aerial imagery and establishes a resource for future comparisons.
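For readers unfamiliar with the two metrics named above, the following is a minimal NumPy sketch of a common formulation of Chamfer Distance and F-Score between a predicted and a reference point cloud. The function names, the unsquared Euclidean distance, and the threshold `tau` are assumptions for illustration; the excerpt does not state which variant or threshold the authors use.

```python
import numpy as np


def nearest_distances(a, b):
    """For every point in a (N, 3), Euclidean distance to its nearest point in b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]          # (N, M, 3) pairwise differences
    dist = np.sqrt((diff ** 2).sum(axis=-1))      # (N, M) pairwise distances
    return dist.min(axis=1)


def chamfer_distance(pred, gt):
    """Symmetric Chamfer Distance: mean nearest-neighbour distance in both directions.

    Assumption: unsquared distances, summed over the two directions.
    """
    return nearest_distances(pred, gt).mean() + nearest_distances(gt, pred).mean()


def f_score(pred, gt, tau=0.01):
    """Harmonic mean of precision and recall at distance threshold tau (hypothetical default)."""
    precision = (nearest_distances(pred, gt) < tau).mean()
    recall = (nearest_distances(gt, pred) < tau).mean()
    if precision + recall == 0:
        return 0.0
    return float(2 * precision * recall / (precision + recall))
```

Lower Chamfer Distance and higher F-Score both indicate a reconstruction closer to the reference. The brute-force pairwise distance matrix is fine for a few thousand points; larger clouds would normally use a spatial index instead.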
Abstract
Three-dimensional urban reconstruction of buildings from single-view images has attracted significant attention over the past two decades. However, recent methods primarily focus on rooftops from aerial images, often overlooking essential geometrical details. Additionally, there is a notable lack of datasets containing complete 3D point clouds for entire buildings, and reliable camera pose information for aerial images is difficult to obtain. This paper addresses these challenges by presenting a novel methodology, AIM2PC, which utilizes our generated dataset of complete 3D point clouds with determined camera poses. Our approach takes features from a single aerial image as input and concatenates them with additional conditions, namely binary masks and Sobel edge maps, to enable more edge-aware reconstruction. Using a point cloud diffusion model based on Centered Denoising Diffusion Probabilistic Models (CDPM), we project these concatenated features onto the partially denoised point cloud at each diffusion step using our camera poses. The proposed method reconstructs the complete 3D building point cloud, including wall information, and demonstrates superior performance compared to existing baseline techniques. To allow further comparisons with our methodology, the dataset has been made available at https://github.com/Soulaimene/AIM2PCDataset
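The conditioning step described in the abstract can be illustrated with a short NumPy sketch: build a per-pixel condition tensor from image features, a binary mask, and a Sobel edge map, then sample it at the pixel where each 3D point projects. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, a simple pinhole camera with intrinsics `K` and pose `(R, t)` is assumed, and nearest-pixel sampling stands in for whatever interpolation the actual model uses.

```python
import numpy as np


def sobel_edges(gray):
    """Sobel gradient magnitude of a 2D grayscale image, using edge padding."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T
    h, w = gray.shape
    padded = np.pad(gray.astype(np.float64), 1, mode="edge")
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.hypot(gx, gy)


def build_condition(features, mask, gray):
    """Concatenate features (H, W, C), binary mask (H, W), and Sobel edges (H, W) into (H, W, C + 2)."""
    edges = sobel_edges(gray)
    return np.concatenate(
        [features, mask[..., None].astype(np.float64), edges[..., None]], axis=-1
    )


def project_condition(points, cond, K, R, t):
    """Give each 3D point the condition vector of the pixel it projects to.

    points: (N, 3) world coordinates; K: (3, 3) intrinsics; R: (3, 3), t: (3,) world-to-camera pose.
    Points behind the camera or outside the image receive a zero vector.
    """
    cam = points @ R.T + t                  # world frame -> camera frame
    uvw = cam @ K.T                         # camera frame -> homogeneous pixel coordinates
    z = uvw[:, 2]
    valid = z > 1e-8
    uv = uvw[:, :2] / np.where(valid, z, 1.0)[:, None]
    col = np.rint(uv[:, 0]).astype(int)
    row = np.rint(uv[:, 1]).astype(int)
    h, w = cond.shape[:2]
    valid &= (col >= 0) & (col < w) & (row >= 0) & (row < h)
    out = np.zeros((len(points), cond.shape[-1]))
    out[valid] = cond[row[valid], col[valid]]
    return out
```

In a diffusion loop, `project_condition` would be called on the partially denoised point cloud at every step, so each point's condition vector changes as the point moves toward its final position.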
